Convert PDF pages separately

janacht commented 1 year ago

For large PDF files, convert runs out of memory and fails with error messages such as:

convert-im6.q16: cache resources exhausted `talk.pdf' @ error/cache.c/OpenPixelCache/4083.
convert-im6.q16: cache resources exhausted `slide.png' @ error/cache.c/OpenPixelCache/4083.
convert-im6.q16: memory allocation failed `slide.png' @ error/png.c/WriteOnePNGImage/9108.
convert-im6.q16: No IDATs written into file `slide-0.png' @ error/png.c/MagickPNGErrorHandler/1641.

This pull request therefore performs the conversion page by page, which avoids these issues.
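For readers who want to see the idea in isolation, here is a minimal sketch of page-by-page conversion, written in Python rather than as the project's actual script. It relies on ImageMagick's `file.pdf[N]` syntax for selecting a single zero-based page. The function names, the `density` default of 300, and the injectable `run` parameter are assumptions made for illustration; only the `slide-N.png` naming is taken from the error messages above.

```python
import subprocess


def page_command(pdf_path, page_index, out_prefix="slide", density=300):
    """Build a convert command that rasterizes exactly one page.

    "talk.pdf[2]" asks ImageMagick for the third page only (zero-based),
    so a single page is rendered per invocation.
    The density value is an illustrative assumption.
    """
    return [
        "convert",
        "-density", str(density),
        f"{pdf_path}[{page_index}]",
        f"{out_prefix}-{page_index}.png",
    ]


def convert_pages(pdf_path, page_count, out_prefix="slide", run=subprocess.run):
    """Convert each page with its own convert process.

    With check=True, subprocess.run raises CalledProcessError on the
    first page that fails, instead of continuing silently.
    """
    outputs = []
    for index in range(page_count):
        command = page_command(pdf_path, index, out_prefix)
        run(command, check=True)
        outputs.append(command[-1])
    return outputs
```

Calling `convert_pages("talk.pdf", 3)` would start three separate `convert` processes, one per page, rather than one process that loads the whole document.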

ashafaei commented 1 year ago

Thanks for the contribution @janacht -- that seems to be very useful.

Is it possible to add a check to ensure pdfinfo exists prior to using it, and if not, fall back to the old implementation?
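The requested check could look roughly like the following sketch, shown in Python for illustration and not taken from the pull request. `shutil.which` returns `None` when a program is not on the PATH, and `pdfinfo` prints a `Pages:` line from which the page count can be read. The function names and the strategy labels are hypothetical.

```python
import shutil


def choose_strategy(which=shutil.which):
    """Pick per-page conversion only if pdfinfo is available.

    The labels "per-page" and "single-pass" are illustrative names for
    the new loop and the old implementation respectively.
    """
    if which("pdfinfo") is not None:
        return "per-page"
    return "single-pass"


def parse_page_count(pdfinfo_output):
    """Extract the page count from pdfinfo's "Pages:" line.

    Returns None if no such line is present, so the caller can fall back.
    """
    for line in pdfinfo_output.splitlines():
        if line.startswith("Pages:"):
            return int(line.split(":", 1)[1])
    return None
```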

janacht commented 1 year ago

Good point; reducing the number of required external packages is a good idea. I've updated the pull request to use identify from ImageMagick instead of pdfinfo. I've also updated the return code handling for identify and for the looped convert.
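As a rough illustration of counting pages with identify and checking its return code, here is a Python sketch. It is an assumption about how such a check might be written, not the code merged in this pull request. The `%n` format escape reports the number of images in the sequence and is printed once per page, so only the first value is used. The function name and the injectable `run` parameter are hypothetical.

```python
import subprocess


def count_pages(pdf_path, run=subprocess.run):
    """Return the page count reported by identify, or None on failure.

    A None result signals the caller to fall back to converting the
    whole file in a single pass.
    """
    result = run(
        ["identify", "-format", "%n\n", pdf_path],
        capture_output=True,
        text=True,
    )
    # A non-zero exit status means identify could not read the file.
    if result.returncode != 0:
        return None
    values = result.stdout.split()
    # identify repeats the count once per page; the first value suffices.
    if not values or not values[0].isdigit():
        return None
    return int(values[0])
```

Note that identify still has to read every page to produce the count, so this step can be slow on large files even though it avoids the extra dependency on pdfinfo.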

ashafaei commented 1 year ago

Thanks for the updated PR. Merged. Cheers.

ashafaei / pdf2pptx

Convert PDF pages separately #19