ArtifexSoftware / pdf2docx

Open source Python library for converting PDF to DOCX.
https://pdf2docx.readthedocs.io
GNU Affero General Public License v3.0
2.46k stars, 356 forks

Table is broken when the table is displayed on 2 pages #290

Closed pulse-mind closed 1 week ago

pulse-mind commented 3 months ago

Hello,

I am generating a PDF with wkhtmltopdf. In the PDF, one table is too big for a single page, so it is displayed across two pages. When I convert this PDF to DOCX, the table is broken: a new table is added inside the row, and it contains part of that row.

[Image: Feuille-de-presence__-__Mode_de_compatibilité]

I can share the PDF and the DOCX if necessary. Please let me know how to share...

Thanks a lot,
Fred

pulse-mind commented 3 months ago

One more question, if I may ask it here: the spacing between lines and between paragraphs is bigger in the DOCX than in the PDF. How can I control that?

greendreamer commented 1 week ago

Closing this because there has been no response for an extended period. Feel free to open a new issue, but please include a reproducing example.