facebookresearch / nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License
8.81k stars, 560 forks

Iterating over the pages in the generated mmd file in python [INFO] #186

Closed (ari9dam closed this 9 months ago)

ari9dam commented 9 months ago

Very useful model, and it performs quite well. What is the recommended way to iterate over the pages in the output file? I want to access the content of each page separately. The .mmd format is new to me, so any pointers would help. Thank you.

ari9dam commented 9 months ago

Based on https://github.com/facebookresearch/nougat/blob/47c77d70727558b4a2025005491ecb26ee97f523/predict.py#L197, it seems this is not possible without a code change.

ari9dam commented 9 months ago

Modified predict.py. Closing this issue.
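
The thread does not show the actual change, so the following is only a minimal sketch of one way such a modification could work. It assumes the per-page prediction strings are available as a Python list before they are written to the output file. `PAGE_MARKER`, `join_pages`, and `iter_pages` are hypothetical names chosen for illustration; they are not part of Nougat or of the .mmd format.

```python
# Hypothetical page delimiter (an assumption, not something Nougat emits).
# It is written as an HTML comment so Markdown renderers will not display it.
PAGE_MARKER = "<!-- page-break -->"


def join_pages(predictions):
    """Join per-page strings with an explicit marker between pages."""
    separator = "\n\n" + PAGE_MARKER + "\n\n"
    return separator.join(page.strip() for page in predictions)


def iter_pages(mmd_text):
    """Yield (page_number, content) pairs from text produced by join_pages."""
    for number, content in enumerate(mmd_text.split(PAGE_MARKER), start=1):
        yield number, content.strip()


if __name__ == "__main__":
    # Illustrative per-page strings; the empty string stands for a blank page.
    pages = ["# Title\n\nIntro", "", "Second page text\n"]
    text = join_pages(pages)
    for number, content in iter_pages(text):
        print(number, repr(content))
```

Keeping a marker even for empty pages means the page numbers yielded by `iter_pages` stay aligned with the pages of the source PDF. An alternative with the same effect would be to write each page to its own file.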