Open keestux opened 10 months ago
The example program
#!/usr/bin/env python3
from odfdo import Document, Paragraph
mydoc = Document('BugOdfdo.odt')
content = mydoc.get_part('content')
mydoc.save(target='BugOdfdo2.odt', pretty=True)
Here is the example input ODT BugOdfdo.odt
Maybe it is a bug in LibreOffice. When I look at content.xml, there is nothing different except for the white space (pretty print).
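One way to check the "nothing different except white space" observation without eyeballing the XML is sketched below. It uses only the Python standard library, not odfdo. The helper names (read_content_xml, strip_whitespace_only_text, same_ignoring_whitespace) and the tiny sample XML are made up for illustration. The comparison is rough: it assumes both files declare attributes in the same order.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_content_xml(odt_path):
    """Return the raw bytes of content.xml from an ODT (which is a zip archive)."""
    with zipfile.ZipFile(odt_path) as archive:
        return archive.read("content.xml")


def strip_whitespace_only_text(element):
    """Drop text and tail nodes that contain only whitespace, in place."""
    for node in element.iter():
        if node.text is not None and not node.text.strip():
            node.text = None
        if node.tail is not None and not node.tail.strip():
            node.tail = None


def same_ignoring_whitespace(xml_a, xml_b):
    """True if two XML documents differ only in whitespace-only text nodes."""
    root_a = ET.fromstring(xml_a)
    root_b = ET.fromstring(xml_b)
    strip_whitespace_only_text(root_a)
    strip_whitespace_only_text(root_b)
    return ET.tostring(root_a) == ET.tostring(root_b)


# Hypothetical miniature of a paragraph split into spans, compact vs. pretty-printed.
compact = b'<p xmlns="urn:example"><span>a</span><span>b</span></p>'
pretty = b'<p xmlns="urn:example">\n  <span>a</span>\n  <span>b</span>\n</p>'
print(same_ignoring_whitespace(compact, pretty))
```

For the files in this report, the call would be same_ignoring_whitespace(read_content_xml('BugOdfdo.odt'), read_content_xml('BugOdfdo2.odt')). A True result only confirms that the difference is limited to whitespace. It does not show that the two documents render the same, because whitespace between spans inside a paragraph may still be displayed as a space.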
Hi, thanks for this interesting bug. Actually, I'm not sure I remember what the correct interpretation of the standard should be. But a few first thoughts:
Let's assume a simple sequence: read an ODT, then write it to some other file.
The call to get_part shouldn't be doing much, but it is essential for the bug to show up. Probably reading anything from mydoc will trigger it.
The input ODT contains simple text paragraphs. However, some of the text has been edited: a letter deleted from a word, or a letter added to a word. The result is a Paragraph with multiple spans. Something like
The input document shows:
The output document shows: