Open dashingdove opened 1 year ago
You may want to have a look at https://github.com/toxicphreAK/python-docx-ng. I tried to implement this in https://github.com/toxicphreAK/python-docx-ng/pull/1, https://github.com/toxicphreAK/python-docx-ng/pull/8, and https://github.com/python-openxml/python-docx/pull/1196. Hopefully it helps.
Yeah, tables in Word are complicated because they are so flexible. So you really need at least all prior rows as context to compute whether a particular cell is merged and so forth.
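To make the "prior rows as context" point concrete, here is a toy sketch, not python-docx code. It assumes each cell is described by a hypothetical `(text, span, vmerge)` tuple, loosely modelled on the `w:gridSpan` and `w:vMerge` markers in the table XML. A vertically merged cell carries no content of its own, so it can only be resolved after the rows above it have been laid out.

```python
def resolve_grid(rows):
    """Expand rows of (text, span, vmerge) tuples into a rectangular grid.

    span   -- number of grid columns the cell covers (horizontal merge)
    vmerge -- True if the cell continues the cell directly above it
    """
    grid = []
    for row in rows:
        resolved = []
        for text, span, vmerge in row:
            col = len(resolved)
            # A vertical continuation reuses the cell above it, which is
            # only known once all earlier rows have been resolved.
            owner = grid[-1][col] if vmerge else text
            resolved.extend([owner] * span)
        grid.append(resolved)
    return grid


rows = [
    [("A", 2, False), ("B", 1, False)],                 # "A" spans two columns
    [("C", 1, False), ("D", 1, False), ("", 1, True)],  # last cell continues "B"
]
print(resolve_grid(rows))  # [['A', 'A', 'B'], ['C', 'D', 'B']]
```

The sketch does not guard against a vertical continuation in the first row; a real implementation would have to.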
Add to this the characteristic of python-docx that all document state is (necessarily) stored in the XML, and the fact that we can't tell whether you've mutated the table between two Table.cell calls, and you get this situation.
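A rough way to see what that costs is the toy model below. `ToyTable` and its `cells_built` counter are made up for illustration and are not part of python-docx; the only assumption carried over is that nothing is cached between calls, so every lookup rebuilds the full cell list.

```python
class ToyTable:
    """Stateless table: the flat cell list is rebuilt on every access."""

    def __init__(self, n_rows, n_cols):
        self.n_rows, self.n_cols = n_rows, n_cols
        self.cells_built = 0  # counts work done, for illustration only

    def all_cells(self):
        # Stand-in for re-reading the XML; nothing is cached between calls.
        self.cells_built += self.n_rows * self.n_cols
        return [(r, c) for r in range(self.n_rows) for c in range(self.n_cols)]

    def cell(self, r, c):
        return self.all_cells()[r * self.n_cols + c]


def visit_per_call(table):
    """Look up every cell through cell(), rebuilding the list each time."""
    for r in range(table.n_rows):
        for c in range(table.n_cols):
            table.cell(r, c)
    return table.cells_built


def visit_once(table):
    """Build the list a single time and index into it."""
    cells = table.all_cells()
    for r in range(table.n_rows):
        for c in range(table.n_cols):
            cells[r * table.n_cols + c]
    return table.cells_built


print(visit_per_call(ToyTable(20, 10)))  # 40000 (200 lookups x 200 cells)
print(visit_once(ToyTable(20, 10)))      # 200
```

The work grows with the square of the cell count in the first case and linearly in the second, which matches the slowdown reported for large tables.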
I think what's called for here is two alternatives (three if you count the current functionality), depending on whether you're reading or writing:
.save() or maybe .sync() method.

I'm making this a feature request. Not sure when or if we'll get to it, but it would be a solid enhancement.
Table.cell performs very badly for large tables as it appears to rebuild the cell array every time you call it.
I found this to be an issue when using the htmldocx package. When adding a table, the cell function is called once for each cell which degrades performance significantly.
I have been able to circumvent this issue by getting _cells once and then referring to that array inside of the loop instead. However, this is a private property and it might be good to have an "official" way to do this without hacking around.
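For anyone hitting the same problem, the workaround looks roughly like the sketch below. It relies on the private `_cells` attribute, which may change between python-docx versions, and it assumes the table's structure is not modified while the loop runs. `fill_table` and `data` are hypothetical names used only for this example.

```python
def fill_table(table, data):
    """Write a 2-D list of values into an existing table of the same shape."""
    cells = table._cells         # private attribute: flat, row-major cell list, fetched once
    n_cols = len(table.columns)  # grid width, used to compute the flat index
    for r, row in enumerate(data):
        for c, value in enumerate(row):
            # Same position that table.cell(r, c) would return, without the rebuild.
            cells[r * n_cols + c].text = str(value)
```

With python-docx installed, this would be called on a table created with something like `document.add_table(rows=len(data), cols=len(data[0]))`. If rows or columns are added after `_cells` is read, the cached list is stale and must be fetched again.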