To handle this properly it is needed to calculate size of decoded image data from parameters like width, height, bit per component, color space and decode using filters (note "EI " byte sequence in middle of image data, there can be any byte sequence). Unfortunately there is no required "Length" key which could be used to skip stream data like in normal pdf streams.
Also this affects other functionality of lopdf which depends on content decoding like text extraction. For example there can be false positive "Tj" inside image. Or in some circumstances could lopdf return error maybe when byte sequence in image data is not valid UTF-8 string and so on.
Although #356 handles the given file correctly, I'd like to keep this issue open until the most common filter types in inline images are handled as well.
Example pdf file: bi.pdf
Content stream contains:
There is chapter 4.8.6 about inline images in pdf reference.
To handle this properly it is needed to calculate size of decoded image data from parameters like width, height, bit per component, color space and decode using filters (note "EI " byte sequence in middle of image data, there can be any byte sequence). Unfortunately there is no required "Length" key which could be used to skip stream data like in normal pdf streams.
Also this affects other functionality of lopdf which depends on content decoding like text extraction. For example there can be false positive "Tj" inside image. Or in some circumstances could lopdf return error maybe when byte sequence in image data is not valid UTF-8 string and so on.