nlmatics / nlm-ingestor

This repo provides the server-side code that the llmsherpa API connects to. It includes parsers for various file formats.
https://www.nlmatics.com
Apache License 2.0
1.11k stars, 160 forks

Can you provide guidance on when page_idx wouldn't be available? #35

Open chrismaresca opened 8 months ago

chrismaresca commented 8 months ago

Will page_idx ever be unavailable, or is it always 0-indexed and enumerated? For example, if the uploaded PDF starts at page 46, will page_idx start at 46, or always at 0? Thanks.