tomduck / pandoc-eqnos

A pandoc filter for numbering equations and equation references.
GNU General Public License v3.0

Problems with Word 2013 #16

Closed PeterDavidson closed 7 years ago

PeterDavidson commented 7 years ago

Thanks for the great filter - it's been very useful for PDF output. But I'm having problems when I try to output to Word.

If I just use the simple example

$$ y = mx + b $$ {#eq:description}
See equation @eq:description

And build it with

pandoc --smart --filter pandoc-eqnos -o test.docx test.md

I get an error from MS Word 2013

We're sorry. We can't open test.docx because we found a problem with its contents.

Am I doing something wrong, or does it not work with this version of Word?

timtroendle commented 7 years ago

Similar issue for me using:

The error message is:

The Open XML file test.docx cannot be opened because there are problems with the contents or the file name might contain invalid characters (for example, \/).

(Frankly, Word could check for itself that the latter is not true, but never mind...)

awbirdsall commented 7 years ago

+1, I'm seeing a similar issue, using the same simple example as Peter and the command

pandoc --filter pandoc-eqnos -o sample.docx sample.md

The Word message on trying to open is:

We're sorry. We can't open sample.docx because we found a problem with its contents

Details:
Unspecified error
Location: Part: /word/document.xml, Line: 2, Column: 653
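
For anyone who wants to see what sits at the reported location: a docx file is a zip archive, so the offending part can be read directly. The following is only a minimal sketch. The helper name `xml_context` and its `width` parameter are made up for illustration, and it assumes that Word counts lines and columns the same way Python's `splitlines()` and string indexing do, which may not hold for every file.

```python
import zipfile


def xml_context(docx_path, line, column, width=80, part="word/document.xml"):
    """Return the text around a 1-based line/column position in a docx part.

    Hypothetical helper: assumes Word's line/column numbering matches
    Python's splitlines() and character indexing.
    """
    # A .docx file is a zip archive; read the requested part as text.
    with zipfile.ZipFile(docx_path) as archive:
        text = archive.read(part).decode("utf-8")
    target = text.splitlines()[line - 1]
    # Slice `width` characters on either side of the reported column.
    start = max(0, column - 1 - width)
    end = column - 1 + width
    return target[start:end]
```

Calling `xml_context("sample.docx", 2, 653)` on the file from the command above would print the markup surrounding the position Word complains about.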

I'm using:

awbirdsall commented 7 years ago

I dug into the XML of the docx output, and something funny seems to be going on with the generated w:bookmarkStart element. The w:name does not have the eq: prefix stripped out; e.g., for the example earlier in the thread the generated element is

<w:bookmarkStart w:id="0" w:name="eq:description" />.

In contrast, pandoc-fignos does strip out fig: from the name, e.g.,

<w:bookmarkStart w:id="0" w:name="description" />.

I took a quick look at the pandoc-eqnos source, but I'm not very familiar with Python pandoc filters (or pandoc attributes for that matter), and it's not immediately obvious to me why pandoc-fignos strips out fig: in this context but pandoc-eqnos does not strip out eq:.

EDIT: Apologies, this is incorrect. Both the pandoc-eqnos and pandoc-fignos docx output have an eq: or fig: prefix, respectively, in the w:name attribute within w:bookmarkStart, so this is not the issue.
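
A check like the one described here can be scripted rather than done by eye. This is a rough sketch using only the Python standard library. The function name `bookmark_names` is hypothetical, and it assumes the bookmarks live in `word/document.xml` under the usual WordprocessingML namespace.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def bookmark_names(docx_path):
    """List the w:name attribute of every w:bookmarkStart in a docx file."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    # ElementTree spells namespaced tags and attributes as "{namespace}local".
    return [
        element.get(f"{{{W_NS}}}name")
        for element in root.iter(f"{{{W_NS}}}bookmarkStart")
    ]
```

Running it on the pandoc-eqnos and pandoc-fignos outputs side by side makes it easy to compare the names each filter writes.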

awbirdsall commented 7 years ago

I did some more sleuthing and I think I have a fix!

Looking at a docx output file's XML, I saw that the <w:bookmarkStart /> element is nested inside two sets of paragraph (<w:p></w:p>) tags. Word doesn't seem to like this; as a point of comparison, the equivalent <w:bookmarkStart /> elements generated by pandoc-fignos never seem to be doubly nested.

It seemed like the easiest fix was to remove the <w:p> and </w:p> added by pandoc-eqnos within process_equations (as part of bookmarkstart and bookmarkend). On my Windows machine, this seems to fix the problem for a minimal example, the test document in the test suite, and my own personal project.
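
To check whether a given docx output still has the double nesting described above, something along these lines could be used. It is only a sketch, not part of pandoc-eqnos: the function name `count_nested_paragraphs` is made up, and it only counts a `<w:p>` that is a direct child of another `<w:p>`, which is the pattern seen in this thread.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_nested_paragraphs(docx_path):
    """Count w:p elements whose direct parent is also a w:p element."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    p_tag = f"{{{W_NS}}}p"
    # For every paragraph, count the direct children that are paragraphs too.
    return sum(
        1
        for parent in root.iter(p_tag)
        for child in parent
        if child.tag == p_tag
    )
```

A result of 0 for a document built with the patched filter, and a nonzero result for one built with the unpatched filter, would be consistent with the diagnosis above.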

I'm submitting a pull request with the change.

tomduck commented 7 years ago

Thank you all for your feedback. I have not had time for this project for a while, but am on it now. Stay tuned.

tomduck commented 7 years ago

I just pushed pandoc-eqnos 0.17 to PyPI. Thank you so much, @awbirdsall, for your sleuthing and this fix.

I'm assuming that this issue can be closed. Please re-open if problems persist.