internetarchive / archive-pdf-tools

Fast PDF generation and compression. Deals with millions of pages daily.
https://archive-pdf-tools.readthedocs.io/en/latest/
GNU Affero General Public License v3.0

Need some inspiration? #43

Open rmast opened 2 years ago

rmast commented 2 years ago

https://github.com/whitelok/image-text-localization-recognition
https://github.com/qurator-spk/eynollah

MerlijnWajer commented 2 years ago

I have definitely seen eynollah before (it's not fast enough as it is to integrate with tesseract) -- the others not so much. I've been preoccupied with a few other projects at work, but in a few weeks I hope to get back to this project, implement support for compressing existing PDFs (for use with OCRmyPDF), and add support for the ocr_photo elements.

rmast commented 2 years ago

https://tel.archives-ouvertes.fr/tel-01221308/document

rmast commented 2 years ago

https://www.math.uni-sb.de/service/preprints/preprint269.pdf

rmast commented 2 years ago

https://arxiv.org/pdf/1712.08232.pdf

rmast commented 2 years ago

https://www.researchgate.net/publication/334130136_Compressing_Flow_Fields_with_Edge-aware_Homogeneous_Diffusion_Inpainting

rmast commented 2 years ago

https://github.com/RenYurui/StructureFlow
https://github.com/topics/image-inpainting?l=matlab&o=asc&s=stars
https://github.com/topics/inpainting?o=asc&s=updated

rmast commented 2 years ago

Wow! https://github.com/Djdefrag/QualityScaler