huridocs pdf-text-extraction issues

huridocs / pdf-text-extraction

This project extracts text from PDF files using the output of the pdf-document-layout-analysis service. It relies on that service's segmentation and classification capabilities to automate the extraction.

Apache License 2.0

13 stars, 0 forks

issues

Can this extract text from image-only PDFs?