ArtifexSoftware / pdf2docx

Open source Python library for converting PDF to DOCX.
https://pdf2docx.readthedocs.io
GNU Affero General Public License v3.0

Are there plans to support converting formulas? #263

Closed UchihaArk closed 2 weeks ago

UchihaArk commented 7 months ago

[image]

parksmallfish commented 7 months ago

+1

nunamia commented 6 months ago

nougat

JorjMcKie commented 2 weeks ago

No, we will never support this. There is no way to extract formulas from a PDF page, and that would be the first prerequisite for converting them. A formula can be stored as anything: an image, vector graphics, text, or any combination of these.

If a formula is stored entirely as an image or as a vector graphic, it will automatically appear as an image in the resulting DOCX. So this case is already covered.

Formulas stored in any other way cannot be detected as formulas; they mostly appear as text intermingled with some vector graphics.

So we can forget about this idea.
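
The case distinction in the comment above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not pdf2docx or PyMuPDF code: the names `Kind`, `PageObject`, and `formula_outcome` are hypothetical, and the sketch assumes that someone has already marked which page objects belong to the formula, which is exactly the step the comment says cannot be automated.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The three kinds of page content named in the comment (hypothetical model)."""
    IMAGE = "image"
    VECTOR = "vector"
    TEXT = "text"


@dataclass(frozen=True)
class PageObject:
    """One object on a PDF page; `label` is only a human-readable note."""
    kind: Kind
    label: str = ""


def formula_outcome(objects):
    """Describe what happens to a formula built from `objects`.

    Follows the comment's two cases: a formula stored entirely as an image,
    or entirely as vector graphics, ends up as an image in the DOCX; every
    other composition cannot be recognized as a formula.
    """
    if not objects:
        raise ValueError("a formula needs at least one page object")
    kinds = {obj.kind for obj in objects}
    if kinds == {Kind.IMAGE} or kinds == {Kind.VECTOR}:
        return "kept as image"
    return "not detectable as a formula"


# A fraction x/y typeset as two text glyphs plus a drawn fraction bar.
fraction = [
    PageObject(Kind.TEXT, "x"),
    PageObject(Kind.VECTOR, "fraction bar"),
    PageObject(Kind.TEXT, "y"),
]
print(formula_outcome(fraction))
print(formula_outcome([PageObject(Kind.IMAGE, "scanned equation")]))
```

The fraction example shows why the second case is the hard one: on the page it is just two pieces of text and a line, with nothing that marks them as belonging together.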