ArtifexSoftware / pdf2docx

Open source Python library for converting PDF to DOCX.
https://pdf2docx.readthedocs.io
GNU Affero General Public License v3.0
2.56k stars 374 forks

tex2pdf2word #87

Open ghost opened 3 years ago

ghost commented 3 years ago

pdf2docx cannot convert \begin{cases} and \frac{}{} correctly.

dothinking commented 3 years ago

Do you mean math equations/formulas/expressions? If so, you're correct: these are not typical text, and pdf2docx doesn't parse equations for now. It would be a good idea to include this feature, though.

ghost commented 3 years ago

Yes, I generate the PDF from TeX. When I convert this PDF to Word, the math equations are not translated correctly. If analyzing math equations is difficult, we could just locate all the equations and convert them to pictures. Hope this makes sense!
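
To make the "locate all equations" idea concrete, here is a rough sketch, not anything pdf2docx does today. It assumes the text spans are available as dicts with "font" and "bbox" keys (the shape PyMuPDF's page.get_text("dict") uses for spans), and that the PDF was typeset with the Computer Modern math fonts (CMMI, CMSY, CMEX), as default LaTeX output is. The function names and the gap threshold are made up for illustration.

```python
# Hypothetical sketch: none of these helpers exist in pdf2docx.
# Assumption: math glyphs are set in Computer Modern math fonts.
MATH_FONT_PREFIXES = ("CMMI", "CMSY", "CMEX")


def is_math_span(span):
    """Guess whether a span is math from its font name."""
    # Subset-embedded fonts are usually named like "ABCDEF+CMMI10".
    font = span["font"].split("+")[-1].upper()
    return font.startswith(MATH_FONT_PREFIXES)


def near(a, b, gap):
    """True if boxes a and b overlap once each side is padded by `gap`."""
    return (a[0] - gap <= b[2] and b[0] - gap <= a[2]
            and a[1] - gap <= b[3] and b[1] - gap <= a[3])


def union(a, b):
    """Smallest box (x0, y0, x1, y1) covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def equation_boxes(spans, gap=5.0):
    """Merge nearby math-font spans into candidate equation boxes.

    Single pass only: a box merges into the first nearby box found,
    so two existing boxes are never merged with each other.
    """
    boxes = []
    for span in spans:
        if not is_math_span(span):
            continue
        box = tuple(span["bbox"])
        for i, other in enumerate(boxes):
            if near(box, other, gap):
                boxes[i] = union(box, other)
                break
        else:
            boxes.append(box)
    return boxes
```

Each resulting box could then be rendered to a picture, for example with PyMuPDF's page.get_pixmap(clip=...). The font test is only a heuristic: digits and upright text inside an equation are usually set in the regular text font, so the boxes may come out too small.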

dothinking commented 3 years ago

Agreed, that's also what I have in mind. Hopefully it can be enhanced in two stages: