Closed rushabh-wadkar closed 2 years ago
Well, since version 3, you can plug in any custom converter by subclassing BitmapConvBase and passing it to PdfPage.render_to() or PdfDocument.render_to().
However, I am under the impression that PpmImagePlugin is merely an opener/decoder and not a standalone public class.
That said, you could save the image to a buffer (or temporary file) in a compressed format and then re-open it with PIL.
I take it this is still related to https://github.com/pypdfium2-team/pypdfium2/issues/141?
According to my observations, a PIL.Image.Image takes around 500 MiB for a PDF, while the same image converted to PIL.PPMImage takes around 300 MiB. So I'm just trying to optimise further!
I think, for what you want to do, you don't even need a custom converter. You could just save the image with PIL and reopen it, as I said:
import io
import PIL.Image
# assuming raw_image is the uncompressed image obtained via `render_topil()`
buf = io.BytesIO()
raw_image.save(buf, format="ppm")
raw_image.close()
buf.seek(0)
compressed_image = PIL.Image.open(buf)
# ...
buf.close() # once finished with compressed_image
However, I doubt that this provides any advantage at all, because the raw image still has to be held in memory for some time in any case.
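To get a feeling for how large that raw image is, here is a rough back-of-the-envelope sketch. It assumes an A4 page (595 x 842 pt), 8 bits per channel, RGB output, and that pixel dimensions are rounded up after scaling; none of these are confirmed in this thread, and raw_bitmap_bytes is just a hypothetical helper, not part of pypdfium2 or PIL.

```python
import math

def raw_bitmap_bytes(width_pt, height_pt, scale, channels=3):
    """Estimate the size in bytes of an uncompressed bitmap (8 bits per channel).

    Assumption: pixel dimensions are the scaled point dimensions, rounded up.
    """
    width_px = math.ceil(width_pt * scale)
    height_px = math.ceil(height_pt * scale)
    return width_px * height_px * channels

# Assumption: A4 page size in points, rendered as RGB at scale 200/72
a4_rgb = raw_bitmap_bytes(595, 842, 200 / 72)
print(f"{a4_rgb / 2**20:.1f} MiB per page")
```

Multiplying the per-page figure by the number of pages you keep alive at once gives a lower bound for the memory the raw images need. Note that binary PPM stores the pixels uncompressed as well, so a PPM buffer should be about the same size as the raw bitmap.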
pil_image = page.render_topil(scale=200/72)
The above render_topil call returns a PIL.Image.Image, which consumes a lot of memory. Is it possible to render to PIL.PPMImage or to custom PIL plugins? https://pillow.readthedocs.io/en/stable/_modules/PIL/PpmImagePlugin.html