ralfstuckert / pdfbox-layout

MIT License

Not optimal ImageElement #47

Open andrewvsk opened 6 years ago

andrewvsk commented 6 years ago

A few points about images

  1. ImageElement is based on BufferedImage, which stores the decoded (uncompressed) image bitmap. It can occupy more than 50 MB per image, regardless of whether the source is JPEG or PNG.
  2. PDImageXObject is created with LosslessFactory. The resulting compression is not optimal, at least for JPEG images.
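
To put a rough number on the first point, here is a back-of-the-envelope sketch. The image size (4500 x 3000) and the 4 bytes per pixel are assumptions chosen for illustration; the real footprint depends on the BufferedImage type that the decoder picks.

```java
final class DecodedImageSize {

    /** Bytes needed for the raw pixel data of a width x height bitmap. */
    static long bytes(int width, int height, int bytesPerPixel) {
        // Widen to long before multiplying so large images do not overflow int.
        return (long) width * height * bytesPerPixel;
    }

    public static void main(String[] args) {
        // Hypothetical 13.5-megapixel photo, assuming 4 bytes per pixel
        // (for example an int-based ARGB bitmap).
        long size = bytes(4500, 3000, 4);
        System.out.println(size + " bytes"); // prints "54000000 bytes"
    }
}
```

The compressed JPEG for such a photo is usually a small fraction of that, which is why holding the encoded bytes is cheaper than holding the decoded bitmap.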

Proposal:

  1. Update ImageElement to take the image source as an input stream or byte array (still compressed).
  2. To detect the width and height, just use ImageIO.getImageReaders(is), then reader.getWidth(0) and reader.getHeight(0).
  3. Then, on draw, call PDImageXObject.createFromByteArray(), a new method in PDFBox 2.0.8.
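
Step 2 could look roughly like the sketch below. It uses only standard javax.imageio calls; the class name ImageDimensions and the method read are hypothetical helpers, not part of pdfbox-layout. Note that ImageIO.getImageReaders expects an ImageInputStream, so the byte array is wrapped first.

```java
import java.awt.Dimension;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper, named for illustration only.
final class ImageDimensions {

    /** Reads width and height from encoded image bytes without decoding the pixels. */
    static Dimension read(byte[] encoded) throws IOException {
        try (ImageInputStream in =
                 ImageIO.createImageInputStream(new ByteArrayInputStream(encoded))) {
            if (in == null) {
                throw new IOException("Could not open an ImageInputStream");
            }
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                throw new IOException("No ImageIO reader for this image format");
            }
            ImageReader reader = readers.next();
            try {
                // seekForwardOnly = true, ignoreMetadata = true: only the header is needed.
                reader.setInput(in, true, true);
                return new Dimension(reader.getWidth(0), reader.getHeight(0));
            } finally {
                reader.dispose();
            }
        }
    }
}
```

With the dimensions known up front, the same byte array could later be handed to PDImageXObject.createFromByteArray(document, bytes, name) at draw time, so no BufferedImage has to be kept in memory between layout and rendering.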

Note: createFromByteArray() is also not optimal, because it reads the whole stream (once) instead of parsing only the header to detect the color space. Still, since the stream / byte array is compressed, it is better than BufferedImage.