Open Pikamander2 opened 2 years ago
I did a quick test to see how the parser would handle different tags, but the results weren't great.
This code:
from docx import Document
from htmldocx import HtmlToDocx

document = Document()
new_parser = HtmlToDocx()

html = '<h1>Test file</h1><p>Test paragraph 1</p><p>Test paragraph 2</p><div>Test div 1</div><div><span>Test div 2</span></div>'

new_parser.add_html_to_document(html, document)
document.save('test1.docx')
Results in this document:
The p tags were converted properly, but the divs are being treated as inline text rather than as paragraphs.
I'm guessing that most other block-level elements like <section> and <main> have the same issue.
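As a possible stopgap, the HTML could be preprocessed so that these block-level tags become <p> tags before the string reaches add_html_to_document. The sketch below is only an illustration: block_tags_to_paragraphs and BLOCK_TAGS are hypothetical names, not part of htmldocx, and the regex approach drops tag attributes and does not handle nested block elements (a div inside a div would produce nested <p> tags).

```python
import re

# Hypothetical helper, not part of htmldocx: rewrite a few block-level
# tags as <p> so the parser sees them as paragraphs.
# Assumption: the tags are not nested and their attributes can be dropped.
BLOCK_TAGS = ("div", "section", "main")

# Matches opening and closing forms of the tags above, e.g. <div class="x"> or </div>.
_TAG_RE = re.compile(r"<(/?)(?:%s)\b[^>]*>" % "|".join(BLOCK_TAGS), re.IGNORECASE)


def block_tags_to_paragraphs(html):
    """Return html with each listed block-level tag replaced by <p> or </p>."""
    return _TAG_RE.sub(lambda m: "<%sp>" % m.group(1), html)


html = '<h1>Test file</h1><p>Test paragraph 1</p><div>Test div 1</div><div><span>Test div 2</span></div>'
print(block_tags_to_paragraphs(html))
```

The returned string can then be passed to new_parser.add_html_to_document(...) in place of the original html. This only works around the behavior described above; it does not change how the parser itself treats block-level elements.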