Content Controls are left out of the document (.docx)

everlof commented 8 years ago

Reproduce:

Start Word (Word 2013 in my case), select "Student report", and fill in the fields.
Save as .docx (Word document)
Convert it: pandoc -o test.html test.docx

The resulting HTML is empty because the document currently contains only Content Control fields.

I realize these kinds of fields are probably out of scope for pandoc's extended markup and shouldn't persist to HTML, but to a normal user it looks strange when text is missing from the output.

Perhaps look up the Content Control value and insert it as text, so there is no risk of producing an empty document as in my case.
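
To illustrate the idea of looking up the Content Control value, here is a minimal sketch in Python, not anything pandoc does itself. It relies on the fact that a .docx file is a zip archive whose main part is word/document.xml, and that content controls are stored there as w:sdt elements with their visible content under w:sdtContent. The function names are hypothetical, and nested content controls would be reported more than once.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def content_control_texts(document_xml):
    """Return the visible text of each content control (w:sdt) found.

    Hypothetical helper: text inside nested controls also appears in the
    entry for the enclosing control.
    """
    root = ET.fromstring(document_xml)
    texts = []
    for sdt in root.iter(f"{{{W_NS}}}sdt"):
        content = sdt.find(f"{{{W_NS}}}sdtContent")
        if content is None:
            continue
        # Concatenate every text run (w:t) below the control's content.
        runs = content.iter(f"{{{W_NS}}}t")
        texts.append("".join(t.text or "" for t in runs))
    return texts


def content_control_texts_from_docx(path):
    """Read word/document.xml from a .docx file and list its control texts."""
    with zipfile.ZipFile(path) as docx:
        return content_control_texts(docx.read("word/document.xml"))
```

Calling content_control_texts_from_docx("test.docx") on the file from the steps above should list the values typed into the fields, which can at least confirm that the text is present in the file.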

larsw commented 6 years ago

Any status on this? We have some Word documents generated by another tool that heavily uses Content Controls to insert diagrams and text. It would be really useful to "flatten down" content controls and be able to extract image/text from them.
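
As a rough sketch of what "flattening" could mean, the Python below replaces every w:sdt element in a parsed word/document.xml tree with the children of its w:sdtContent, so paragraphs, runs, and drawings inside a control end up where the control used to be. This is an assumption about a possible preprocessing workaround, not pandoc's behaviour; the function name is hypothetical, and writing the modified XML back into the .docx archive is not shown.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
SDT_TAG = f"{{{W_NS}}}sdt"
SDT_CONTENT_TAG = f"{{{W_NS}}}sdtContent"


def flatten_content_controls(root):
    """Replace each w:sdt with the children of its w:sdtContent, in place.

    Hypothetical helper. The parent map is rebuilt after every replacement
    so that nested content controls are unwrapped as well.
    """
    while True:
        parent_of = {child: parent for parent in root.iter() for child in parent}
        # Pick the first remaining control that has a parent in the tree.
        sdt = next((el for el in root.iter(SDT_TAG) if el in parent_of), None)
        if sdt is None:
            return root
        parent = parent_of[sdt]
        index = list(parent).index(sdt)
        content = sdt.find(SDT_CONTENT_TAG)
        replacement = list(content) if content is not None else []
        parent.remove(sdt)
        # Re-insert the control's content at the position the control occupied.
        for offset, child in enumerate(replacement):
            parent.insert(index + offset, child)
```

Rebuilding the parent map on every pass is quadratic in the number of controls, which seems acceptable for a sketch; a real tool would probably use a parser that tracks parents.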

sztaylorakgov commented 6 years ago

Similar scenario converting Word to markdown. We have boilerplate Word templates where certain sections are supposed to be added or elaborated within content controls. This is an important capability in Word because content controls allow formatting, commenting, etc. The content in these sections is ignored in pandoc.

iredwards commented 6 years ago

jgm / pandoc

Content Controls are left out of the document (.docx) #2587