facebookresearch / nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License

Low amount of recognised pages #235

Closed. ivholmlu closed this issue 3 months ago

ivholmlu commented 3 months ago

ERROR:root:Extracting figures from file pdf_files/1.pdf failed.
INFO:root:1: 1/14 pages recognized. Percentage: 7.14%
100%|██████████| 1/1 [00:00<00:00, 1.15it/s]
INFO:root:In total: 1/14 pages recognized. Percentage: 7.14%

I'm trying to set up a dataset for fine-tuning, but only one of the 14 pages is being recognized and I'm not sure why. Does anyone know what might be the reason?
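To see which files are affected when running this over many PDFs, the summary lines from the log can be parsed. This is only a minimal sketch: it assumes the `INFO:root:<name>: <recognized>/<total> pages recognized.` format shown in the output above, and `parse_recognition` is a hypothetical helper, not part of nougat.

```python
import re

# Matches summary lines such as
# "INFO:root:1: 1/14 pages recognized. Percentage: 7.14%"
# (format assumed from the log output in this issue, not from nougat's docs).
LINE_RE = re.compile(
    r"INFO:root:(?P<name>.+?): (?P<ok>\d+)/(?P<total>\d+) pages recognized"
)


def parse_recognition(log_text):
    """Return {name: (recognized, total, percentage)} for each summary line.

    The "In total" line is returned under the key "In total".
    """
    results = {}
    for match in LINE_RE.finditer(log_text):
        ok = int(match.group("ok"))
        total = int(match.group("total"))
        # Guard against an empty document to avoid division by zero.
        pct = round(100.0 * ok / total, 2) if total else 0.0
        results[match.group("name")] = (ok, total, pct)
    return results


if __name__ == "__main__":
    sample = (
        "INFO:root:1: 1/14 pages recognized. Percentage: 7.14%\n"
        "INFO:root:In total: 1/14 pages recognized. Percentage: 7.14%\n"
    )
    for name, (ok, total, pct) in parse_recognition(sample).items():
        print(f"{name}: {ok}/{total} pages ({pct}%)")
```

Sorting the result by percentage would show whether the low recognition rate is specific to this PDF or common to all of them.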