axa-group / Parsr

Transforms PDF, Documents and Images into Enriched Structured Data
Apache License 2.0
5.78k stars 309 forks

Add a small utility to fix broken pdf font #671

Closed by hexapode 1 year ago

hexapode commented 1 year ago

When Parsr is run on a PDF file with broken fonts, it does not generate valid output.

Added a small utility that can be run before Parsr to fix the fonts of a PDF, enabling Parsr to work on PDFs with broken fonts.
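
The description does not say how the utility repairs fonts. As a rough illustration only, one common way to normalise fonts before parsing is to rewrite the file through Ghostscript's `pdfwrite` device so that fonts are re-embedded. The sketch below assumes Ghostscript is installed as `gs`; the function names `build_gs_command` and `fix_pdf_fonts` are hypothetical and are not taken from this PR.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_gs_command(src: str, dst: str, gs_binary: str = "gs") -> List[str]:
    """Build a Ghostscript command that rewrites src into dst (hypothetical helper)."""
    return [
        gs_binary,
        "-o", dst,               # output file; -o also implies -dBATCH -dNOPAUSE
        "-sDEVICE=pdfwrite",     # re-emit the document as a new PDF
        "-dEmbedAllFonts=true",  # ask pdfwrite to embed the fonts it can resolve
        "-dSubsetFonts=true",    # embed only the glyphs that are actually used
        src,
    ]


def fix_pdf_fonts(src: str, dst: str) -> Path:
    """Rewrite src with Ghostscript and return the path of the new file."""
    if shutil.which("gs") is None:
        raise RuntimeError("Ghostscript ('gs') was not found on PATH")
    # check=True raises CalledProcessError if Ghostscript exits with an error
    subprocess.run(build_gs_command(src, dst), check=True)
    return Path(dst)
```

Under these assumptions, `fix_pdf_fonts("input.pdf", "input.fixed.pdf")` would be called first and the rewritten file passed to Parsr instead of the original. Whether this recovers usable text depends on how the fonts are broken.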