Open Amritpal2001 opened 2 years ago
Same here
same here, any solution?
I could be wrong, but from what I have found, the error occurs when I try to convert HTML containing a table with an empty/blank cell to Docx.
I can confirm this error occurs if you put a <br> tag right at the start of a <td>. If you put anything before the <br> then it seems to work fine. For example:
<td><br>Hello world</td> throws an error
<td>Hello world<br></td> does not
I would guess that the run needs to be initialised somewhere. If some content precedes the <br> then the run has already been created by the time the <br> is parsed, but when the <br> is the first child of the <td> then the run attribute is missing, which causes the error.
The error also occurs when adding a <br> to the start of a document.
import docx
import htmldocx
document = docx.Document()
html_parser = htmldocx.HtmlToDocx()
html_parser.add_html_to_document('<br>', document)  # raises AttributeError
Basically, if the first thing that the parser sees is a <br> then it throws an error. In the table cell example, a child parser gets created to parse the contents of the cell, so it's exactly the same issue.
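Until this is fixed in the library, one possible workaround is to clean the HTML before handing it to the parser. The sketch below is only that, a sketch: strip_leading_br is a hypothetical helper of mine, not part of htmldocx, and it assumes the only problem cases are a <br> at the very start of the input or directly after an opening <td>/<th>. It uses a regex, so it will not catch every variation (for example a <br> nested inside another tag at the start of a cell), and it drops the leading line break rather than preserving it.

```python
import re

# One or more <br>, <br/>, or <br /> tags, with optional whitespace before each.
_BR_RUN = r"(?:\s*<br\s*/?>)+"

# A <br> run at the very start of the HTML string.
_LEADING_BR_IN_DOC = re.compile(r"^" + _BR_RUN, re.IGNORECASE)

# A <br> run immediately after an opening <td ...> or <th ...> tag.
_LEADING_BR_IN_CELL = re.compile(r"(<t[dh]\b[^>]*>)" + _BR_RUN, re.IGNORECASE)


def strip_leading_br(html):
    """Hypothetical helper: remove <br> tags that would be the first
    thing the parser (or a per-cell child parser) sees."""
    html = _LEADING_BR_IN_DOC.sub("", html)
    # Keep the opening cell tag (group 1) and drop the <br> run after it.
    return _LEADING_BR_IN_CELL.sub(r"\1", html)
```

The cleaned string can then be passed to add_html_to_document in place of the original HTML. I have only reasoned about the cases described in this thread, so treat it as a stopgap rather than a fix.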
+1 on this issue
steps to replicate:
from docx import Document
from htmldocx import HtmlToDocx
document = Document()
new_parser = HtmlToDocx()
html = '<table><tr><td><br>testing</td></tr></table>'
new_parser.add_html_to_document(html, document)
+1
Hey, I am sometimes getting this issue while converting from HTML to docx.
Thanks!