facebookresearch / nougat

Implementation of Nougat: Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License

pypdfium2 in rasterize() causing memory leak? #178

Open sidharthrajaram opened 11 months ago

sidharthrajaram commented 11 months ago

pypdfium2 seems to be causing issues, specifically a memory leak: https://github.com/facebookresearch/nougat/issues/162#issuecomment-1786217285

Given that, why was pdf2image replaced with pypdfium2 in rasterize()?

lukas-blecher commented 11 months ago

The issue was the poppler requirement, which is hard to install. I'll merge the related PR (#173) as soon as I can.

Here is a related thread: https://github.com/facebookresearch/nougat/issues/96#issuecomment-1726722085
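
For anyone who wants to check the suspected leak locally, here is a minimal sketch of one way to watch process memory across repeated calls. It uses only the Python standard library (`resource`, Unix only) and does not call pypdfium2 or nougat itself. The helper names `peak_rss_mib` and `sample_peak_rss` are hypothetical, and `leaky_stand_in` is just a placeholder for whatever rasterize() call you want to measure.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def sample_peak_rss(fn, iterations: int = 5) -> list[float]:
    """Call fn() repeatedly; record peak RSS before the first call and after each call."""
    samples = [peak_rss_mib()]
    for _ in range(iterations):
        fn()
        samples.append(peak_rss_mib())
    return samples


if __name__ == "__main__":
    retained = []

    def leaky_stand_in():
        # Hypothetical stand-in for a rasterize() call: it deliberately
        # keeps about 8 MiB alive per call so the growth is visible.
        retained.append(b"x" * (8 * 1024 * 1024))

    print(sample_peak_rss(leaky_stand_in))
```

Peak RSS never decreases, so the signal to look for is whether the samples keep climbing across identical calls or level off after the first few. Some early growth is normal because allocators cache memory, so a steady rise over many iterations is better evidence of a leak than a single jump.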