Closed gnott closed 4 years ago
Tested with pandoc version 2.7.3 in the Docker image, versus 2.7, and this bug does not appear in 2.7.3.
There is, however, a newline character issue in the tests when using more recent versions of pandoc. I think it will also be better to set --wrap=none on the docker call when parsing the .docx file, and then adjust the test fixtures accordingly to get the cleanest output.
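For reference, a minimal sketch of what a pandoc call with --wrap=none could look like when built from Python. The function name `build_pandoc_command`, the `/data` mount point, and the image name passed in are all hypothetical and not taken from this library. The only pandoc options used are `--from`, `--to` and `--wrap`, which pandoc documents.

```python
import os


def build_pandoc_command(docx_path, docker_image=None):
    """Build the argument list for converting a .docx file to JATS.

    Hypothetical helper: the name and the Docker layout are assumptions,
    not this library's actual code.
    """
    # --wrap=none stops pandoc from hard-wrapping output lines, so the
    # generated JATS has no line breaks introduced by the writer.
    pandoc_args = ["pandoc", "--from=docx", "--to=jats", "--wrap=none", docx_path]
    if docker_image is None:
        return pandoc_args
    # Assumed layout: mount the current directory at /data and run pandoc there.
    return [
        "docker", "run", "--rm",
        "-v", "{}:/data".format(os.getcwd()),
        "-w", "/data",
        "--entrypoint", "pandoc",
        docker_image,
    ] + pandoc_args[1:]
```

The returned list can be passed to `subprocess.run(command, capture_output=True, text=True)`. Test fixtures would then need to be regenerated, since the wrapped and unwrapped outputs differ only in whitespace.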
After changes were applied to support newer pandoc versions, the bug has re-appeared. I think it has something to do with the XML cleaning procedures in this library, and a particular use case can be extracted from this example .docx file.
Fixed by merging PR https://github.com/elifesciences/decision-letter-parser/pull/70.
As a test for video content parsing, I tried generating XML output from the sample file Chi 44816.docx. There's an error in build.py due to XML tagging. It looks like the pandoc JATS output for the author response is adding extra <bold> tags around paragraphs. I cannot find a quick way to fix it, even after editing the .docx file content to see why it is producing the odd output. This may need to be checked again later. As far as I can tell, it is an issue with pandoc itself and the JATS output it produces.
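To make the pattern easier to spot, here is a rough sketch that lists paragraphs whose entire content sits inside a single <bold> element. It is only an illustration of the symptom described above, not the fix used in this library. The function name `paragraphs_wrapped_in_bold` is hypothetical, and it assumes the JATS fragment is well-formed XML without namespaces.

```python
import xml.etree.ElementTree as ET


def paragraphs_wrapped_in_bold(xml_string):
    """Return the text of each <p> whose only content is one <bold> child.

    Hypothetical diagnostic helper; assumes un-namespaced, well-formed XML.
    """
    root = ET.fromstring(xml_string)
    wrapped = []
    for p in root.iter("p"):
        children = list(p)
        if len(children) != 1 or children[0].tag != "bold":
            continue
        # Any real text before or after the <bold> means the paragraph
        # is only partly bold, which is normal formatting.
        has_text_before = bool((p.text or "").strip())
        has_text_after = bool((children[0].tail or "").strip())
        if not has_text_before and not has_text_after:
            wrapped.append("".join(children[0].itertext()))
    return wrapped
```

Running this over the author response section of the pandoc output would show whether every paragraph is affected or only some, which may help narrow down what in the .docx triggers it.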