dmitryskachkov commented 2 years ago

Describe your problem

Hello! I created a new docx template from JSON data. I need to use HTML in template variables, so I created a subdoc. But the resulting file displays differently in different editors (LibreOffice and Microsoft Word): some text is missing. If you open the file in Microsoft Word you can see the text, but in LibreOffice it is lost.

When I try to convert the document to PDF (soffice CLI), the text is missing. 17-3.docx

How can I keep the text imported from HTML?

Parse the body if it uses HTML and pre-save it to a temp file

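import os
import time

from docx import Document
from htmldocx import HtmlToDocx  # assumed source of HtmlToDocx

# doc is the docxtpl template object created earlier (not shown here).
# Convert the HTML body into a standalone .docx, then load it as a subdoc.
desc_document = Document()
new_parser = HtmlToDocx()
new_parser.table_style = 'TableGrid'
new_parser.paragraph_style = 'Body Text'
new_parser.add_html_to_document(data['document']['body'], desc_document)
desc_result_path = "/home/support/domains/tmp/part_" + document_id + ".docx"
desc_document.save(desc_result_path)
sub_doc = doc.new_subdoc(desc_result_path)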

context = {
    'document_date': time.strftime('%d.%m.%Y', time.gmtime(int(data['document']['created']))),
    'document_text': sub_doc,
    'document_creator': data['creator'],
    'creator_sign': data['creator_sign'],
    'document_sender': data['creator'],
    'document_sender_role': '',
    'document_creator_role': data['document_creator_role'],
}
# Check the same key that is read below ('signs_list').
if 'signs_list' in data:
    context['document_sender'] = data['signs_list'][-1]['user_full_name']
    context['document_signs'] = data['signs_list']

doc.render(context)

# Create exactly the directory the file is saved into.
preview_dir = base_path + hostname + "/preview/" + hostname
os.makedirs(preview_dir, exist_ok=True)
doc.save(preview_dir + "/" + document_id + ".docx")


elapouya commented 2 years ago

I do not understand what you want to do. Please provide a much simpler example that reproduces the problem.

dmitryskachkov commented 1 year ago

The file produced by the code opens differently in different editors. You can check this by opening the attached file in Microsoft Word and in LibreOffice: you will see that data has disappeared in LibreOffice.

The 'document_text' variable should contain HTML like "<p>text</p><h1>fooo</h1>".

Simple example

Json = {'document_text':'<p>text</p><h1>fooo</h1>', 'document_title':'Hello world'}

Execute the code.

Resulting document:

Viewed in Microsoft Word

Hello World text fooo

Viewed in LibreOffice

Hello World

Example file: 17-3.docx

dmitryskachkov commented 1 year ago

As a test I removed the HtmlToDocx part of the code and left only the subdoc, so the problem is in this part:

sub_doc = doc.new_subdoc("/path/to/document.docx")
context = {
    'document_text': sub_doc,
}
doc.render(context)
doc.save("/path/tonewfile/document.docx")
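One way to narrow this down is to look at what is actually stored in the generated file. The sketch below uses only the standard library: it reads word/document.xml from a .docx and lists the text of its w:t elements. `document_texts` is a hypothetical helper name, and the in-memory sample only stands in for a real file such as the attached one. If the "missing" text shows up in the list, it is in the main document part and is merely rendered differently; text stored in other parts of the package would not be listed.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements such as w:t.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def document_texts(docx_file):
    """Return the text of every w:t element in word/document.xml.

    docx_file may be a path or a binary file-like object.
    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed.
    """
    with zipfile.ZipFile(docx_file) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    return [t.text or "" for t in root.iter(f"{{{W_NS}}}t")]


# Stand-in for a real .docx: a zip holding a minimal document.xml.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr(
        "word/document.xml",
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f"<w:t>Hello world</w:t></w:r></w:p></w:body></w:document>",
    )
sample.seek(0)

sample_texts = document_texts(sample)
```

For a real file, `document_texts("/path/tonewfile/document.docx")` would be called with the path passed to `doc.save()`.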

elapouya commented 1 year ago

HTML rendering is not supported by docxtpl. You can only display unrendered HTML text by passing autoescape=True to render(). If you do not escape the HTML, it will destroy document.xml inside the generated docx. The corrupted docx may then be interpreted differently by different editors.
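To illustrate the point about escaping, here is a minimal standard-library sketch. It does not use docxtpl; `make_run` is a hypothetical helper that pastes a value into a WordprocessingML run by plain string substitution, which is assumed here to approximate what happens to an unescaped template value. Unescaped HTML either leaves foreign elements inside w:t or, with an unclosed tag, makes the XML unparseable; the escaped value stays plain text.

```python
from html import escape
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def make_run(text):
    """Build a run by naive string substitution (illustrative only)."""
    return f'<w:r xmlns:w="{W_NS}"><w:t>{text}</w:t></w:r>'


def describe(run_xml):
    """Return (text, child_tags) of the w:t element inside the run."""
    t = ET.fromstring(run_xml).find(f"{{{W_NS}}}t")
    return t.text, [child.tag for child in t]


html_value = "<p>text</p><h1>fooo</h1>"

# Unescaped: <p> and <h1> become child elements of w:t, which has no text.
raw = describe(make_run(html_value))

# Escaped: w:t contains the markup as literal text and has no children.
escaped = describe(make_run(escape(html_value)))

# An unclosed tag such as <br> makes the whole fragment malformed.
try:
    ET.fromstring(make_run("line one<br>line two"))
    br_is_well_formed = True
except ET.ParseError:
    br_is_well_formed = False
```

In docxtpl itself, the escaped case corresponds to calling `doc.render(context, autoescape=True)`, which shows the HTML source as text rather than rendering it.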

ohplz commented 1 year ago

HTML rendering is not supported by docxtpl. You can only display unrendered HTML text by passing autoescape=True to render(). If you do not escape the HTML, it will destroy document.xml inside the generated docx. The corrupted docx may then be interpreted differently by different editors.

Can this be supported, or is there a workaround? After a whole afternoon I couldn't find how to convert HTML to RichText; it seems to be a tricky thing.

elapouya commented 1 year ago

Sorry, but there is no obvious way to render HTML via docxtpl.
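For very simple markup, a partial workaround might be to flatten the HTML into styled text segments first. The sketch below is not a general HTML renderer and is not part of docxtpl: `SegmentCollector` and `html_to_segments` are hypothetical names, only bold and italic tags are tracked, and block tags such as p and h1 are reduced to a newline.

```python
from html.parser import HTMLParser


class SegmentCollector(HTMLParser):
    """Collect (text, bold, italic) segments from a small HTML subset."""

    BOLD_TAGS = {"b", "strong"}
    ITALIC_TAGS = {"i", "em"}
    BLOCK_TAGS = {"p", "h1", "h2", "h3", "div"}

    def __init__(self):
        super().__init__()
        self.segments = []
        self._bold = 0    # nesting depth of bold tags
        self._italic = 0  # nesting depth of italic tags

    def handle_starttag(self, tag, attrs):
        if tag in self.BOLD_TAGS:
            self._bold += 1
        elif tag in self.ITALIC_TAGS:
            self._italic += 1

    def handle_endtag(self, tag):
        if tag in self.BOLD_TAGS:
            self._bold = max(0, self._bold - 1)
        elif tag in self.ITALIC_TAGS:
            self._italic = max(0, self._italic - 1)
        elif tag in self.BLOCK_TAGS:
            # End of a block element: emit a line break segment.
            self.segments.append(("\n", False, False))

    def handle_data(self, data):
        if data:
            self.segments.append((data, self._bold > 0, self._italic > 0))


def html_to_segments(html):
    """Return a list of (text, bold, italic) tuples for the given HTML."""
    parser = SegmentCollector()
    parser.feed(html)
    parser.close()
    return parser.segments


segments = html_to_segments("<p>text <b>bold</b></p><h1>fooo</h1>")
```

Each segment could then be appended to a docxtpl RichText object with `RichText.add(text, bold=..., italic=...)`. That step is untested here, including how the newline segments end up being rendered.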

elapouya / python-docx-template

Differences in editors (Libre and Microsoft) #458
