axa-group / Parsr

Transforms PDF, Documents and Images into Enriched Structured Data
Apache License 2.0
5.76k stars, 306 forks

Add page number to simple json output #598

kleag closed 2 years ago

kleag commented 2 years ago

This PR adds the page number to the elements of the simple JSON output (headings, paragraphs, tables, etc.). An application handling the text can then show the original PDF page for each element, possibly with annotations.
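As a rough illustration of how a consuming application might use the page number, here is a minimal Python sketch. The element layout and the field names (`type`, `content`, `page`) are assumptions made for the example and are not taken from this PR or from Parsr's documented schema; `group_by_page` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical sample data. The field names ("type", "content", "page")
# are assumptions for illustration, not Parsr's confirmed simple JSON schema.
SAMPLE_ELEMENTS = [
    {"type": "heading", "content": "Introduction", "page": 1},
    {"type": "paragraph", "content": "First paragraph.", "page": 1},
    {"type": "table", "content": [["a", "b"]], "page": 2},
    {"type": "paragraph", "content": "No page recorded."},
]


def group_by_page(elements, page_key="page"):
    """Map each page number to the list of elements found on that page.

    Elements that carry no page number are collected under the key None,
    so output produced without page information still loads.
    """
    pages = defaultdict(list)
    for element in elements:
        pages[element.get(page_key)].append(element)
    return dict(pages)


pages = group_by_page(SAMPLE_ELEMENTS)

# Print a short per-page summary, skipping elements without a page number.
for number in sorted(key for key in pages if key is not None):
    kinds = ", ".join(element["type"] for element in pages[number])
    print(f"page {number}: {kinds}")
```

With a mapping like this, a viewer could open the matching PDF page when the user selects a heading or paragraph. The real key name and nesting should be checked against actual Parsr output before relying on it.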