vinaykanigicherla / pdftolatex

Python tool for generation of LaTeX code from PDF files.

Some feedback #2

Open okaragoz opened 3 years ago

okaragoz commented 3 years ago

Thank you @vinaykanigicherla,

The dependency requirements should be documented in more detail than the current package list. For example, Tesseract is required for OCR, and the language packs it supports must be installed separately.

In addition, when the location of an image file in localstore is written to the tex file, the full path from --filepath is used, for example /mnt/t/test/playground/localstore/x1assets/logo.JPG. If localstore sits next to the .tex file and the paths are exported relative to it with a ./ prefix, fewer compile errors may be encountered. It is not a critical problem, but it should be considered.
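
A minimal sketch of the relative-path suggestion, not the tool's actual code. It assumes POSIX-style paths like the one quoted above; the function name `relative_asset_path` and the file name `x1.tex` are hypothetical and only serve the illustration.

```python
import posixpath


def relative_asset_path(asset_path, tex_path):
    """Rewrite an absolute asset path relative to the folder holding the .tex file.

    Hypothetical helper: both arguments are assumed to be absolute
    POSIX-style paths.
    """
    tex_dir = posixpath.dirname(tex_path)
    rel = posixpath.relpath(asset_path, start=tex_dir)
    # Add the ./ prefix unless the path has to climb out of the .tex folder.
    return rel if rel.startswith("../") else "./" + rel


print(relative_asset_path(
    "/mnt/t/test/playground/localstore/x1assets/logo.JPG",
    "/mnt/t/test/playground/x1.tex",  # hypothetical .tex location
))
```

With a path like this in `\includegraphics`, the output folder can be moved or shared without the compile breaking on a machine-specific prefix.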

I could not find Unicode support in the exported tex code. In particular, commonly used symbols such as ° and ± were not supported.
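
One possible workaround, sketched under assumptions: replace such symbols with text-mode LaTeX commands before the text is written out. The mapping table and the function name `escape_unicode` are hypothetical; `\textdegree` and `\textpm` are the standard textcomp commands, so older LaTeX installations may need `\usepackage{textcomp}` in the preamble.

```python
# Hypothetical mapping; extend it with whatever symbols the OCR step produces.
UNICODE_TO_LATEX = {
    "\u00b0": r"\textdegree{}",  # degree sign
    "\u00b1": r"\textpm{}",      # plus-minus sign
}


def escape_unicode(text, table=UNICODE_TO_LATEX):
    """Replace each mapped character with its LaTeX command, leaving the rest as-is."""
    return "".join(table.get(ch, ch) for ch in text)


print(escape_unicode("25 °C ± 0.5"))
```

An alternative that avoids any replacement is to compile the exported file with XeLaTeX or LuaLaTeX, which read UTF-8 input directly, provided the chosen font contains the glyphs.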

2Shot0ccurence commented 5 months ago

The software could be improved in a major way, in terms of recognition. Right now it just extracts everything it can see as plain text.