elapouya / python-docx-template

Use a docx as a jinja2 template
GNU Lesser General Public License v2.1

docxtpl generates broken docx if some core_properties was changed #471

Open av-gantimurov opened 1 year ago

av-gantimurov commented 1 year ago

Describe the bug

docxtpl generates a broken docx if any core_properties are changed before saving. Opening the file in Microsoft Office Word 2010-2019 produces an error. This is not a bug in python-docx: using python-docx on its own works and generates valid documents.

It works fine with docxtpl version 0.11.5, but in 0.12.0 and newer the generated document is broken.

To Reproduce

Install python-docx and create an empty test document:

python -m pip install python-docx
from docx import Document
doc = Document()
doc.core_properties.keywords = "keywords; works; fine"
doc.save("empty.docx")

empty.docx opens in Microsoft Office Word without any failures or errors.

success

Install docxtpl 0.11.5, which works properly:

python -m pip install docxtpl==0.11.5
from docxtpl import DocxTemplate
tpl = DocxTemplate("empty.docx")
tpl.render({})
tpl.docx.core_properties.keywords = "keywords; still; works"
tpl.save("success.docx")

error

The bug occurs in docxtpl from 0.12 through 0.16.4 and in the current development version.

python -m pip install docxtpl
from docxtpl import DocxTemplate
tpl = DocxTemplate("empty.docx")
tpl.render({})
tpl.docx.core_properties.keywords = "keywords; still; works"
tpl.save("error.docx")

The problem appears when the file is opened in Microsoft Office Word 2010-2019: opening error.docx shows an error message. When I look at the extended properties tab in the file properties, the keywords I set are not shown. In LibreOffice, error.docx opens without failures. ExifTool shows the keywords set correctly.

I compared empty.docx, success.docx and error.docx. They differ only in the docProps/core.xml file.

Here is the difference, with the XML beautified:
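A comparison like this can be reproduced with the Python standard library alone, since a .docx file is a zip package. The following is a minimal sketch, not part of the original report; the helper names `read_core_xml` and `diff_core_xml` are made up for illustration.

```python
import difflib
import zipfile
from xml.dom import minidom

# Path of the core properties part inside the .docx zip package.
CORE_PART = "docProps/core.xml"


def read_core_xml(docx_path):
    """Return docProps/core.xml of a .docx file as a list of pretty-printed lines."""
    with zipfile.ZipFile(docx_path) as package:
        raw = package.read(CORE_PART)
    pretty = minidom.parseString(raw).toprettyxml(indent="    ")
    # Drop whitespace-only lines that pretty-printing may introduce.
    return [line for line in pretty.splitlines() if line.strip()]


def diff_core_xml(path_a, path_b):
    """Return a unified diff of the core properties of two .docx files."""
    lines_a = read_core_xml(path_a)
    lines_b = read_core_xml(path_b)
    diff = difflib.unified_diff(
        lines_a, lines_b, str(path_a), str(path_b), lineterm=""
    )
    return "\n".join(diff)
```

For example, `print(diff_core_xml("success.docx", "error.docx"))` would print the lines that differ between the two files from the steps above. An empty string means the core properties are identical.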

  1. empty.docx
<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dcmitype="http://purl.org/dc/dcmitype/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <dc:title/>
    <dc:subject/>
    <dc:creator>python-docx</dc:creator>
    <cp:keywords>success; fine</cp:keywords>
    <dc:description>generated by python-docx</dc:description>
    <cp:lastModifiedBy/>
    <cp:revision>1</cp:revision>
    <dcterms:created xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:created>
    <dcterms:modified xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:modified>
    <cp:category/>
</cp:coreProperties>
  2. success.docx
<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dcmitype="http://purl.org/dc/dcmitype/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <dc:title/>
    <dc:subject/>
    <dc:creator>python-docx</dc:creator>
    <cp:keywords>error</cp:keywords>
    <dc:description>generated by python-docx</dc:description>
    <cp:lastModifiedBy/>
    <cp:revision>1</cp:revision>
    <dcterms:created xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:created>
    <dcterms:modified xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:modified>
    <cp:category/>
</cp:coreProperties>
  3. error.docx
<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dcmitype="http://purl.org/dc/dcmitype/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <dc:title/>
    <dc:subject/>
    <dc:creator>python-docx</dc:creator>
    <cp:keywords>success; fine</cp:keywords>
    <dc:description>generated by python-docx</dc:description>
    <cp:lastModifiedBy/>
    <cp:revision>1</cp:revision>
    <dcterms:created xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:created>
    <dcterms:modified xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:modified>
    <cp:category/>
    <cp:keywords
        xmlns:cp="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties">error
    </cp:keywords>
</cp:coreProperties>

As you can see, in error.docx <cp:keywords> is duplicated, and the duplicate has an additional xmlns:cp="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties" attribute.
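The duplication can also be detected programmatically. The sketch below is an illustration, not something from docxtpl or python-docx, and `find_duplicate_properties` is a hypothetical helper name. It assumes each core property should appear at most once under the root element, and it reports every child element whose local name occurs more than once, together with the namespace URI of each occurrence.

```python
import xml.etree.ElementTree as ET

# Namespace that core properties such as cp:keywords are expected to use.
CORE_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"


def split_tag(tag):
    """Split an ElementTree tag of the form '{uri}local' into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def find_duplicate_properties(core_xml):
    """Map each repeated child local name to the namespace URIs it appears under."""
    root = ET.fromstring(core_xml)
    seen = {}
    for child in root:
        uri, local = split_tag(child.tag)
        seen.setdefault(local, []).append(uri)
    # Keep only the properties that occur more than once.
    return {name: uris for name, uris in seen.items() if len(uris) > 1}
```

Run against the core.xml of error.docx shown above, this should report `keywords` twice: once under the core-properties namespace and once under the custom-properties namespace. For empty.docx and success.docx the result should be an empty dict.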

Expected behavior

The script above should create a docx document without consistency errors.

Screenshots

[Image: error in Microsoft Office 2010]

Additional context

Python 3.10

abhishekjain12 commented 2 months ago

Same issue. Is there any fix?

Two custom properties make the difference:

If I remove them, it works fine.

Both properties have xmlns:cp="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"

https://github.com/python-openxml/python-docx/issues/1037