image-rs / image

Encoding and decoding images in Rust
Apache License 2.0
4.77k stars 590 forks source link

Unnecessary reallocation for JPEG decoding #2184

Open djc opened 3 months ago

djc commented 3 months ago

When we have an in-memory JPEG image (in a Vec<u8> or something else that implements AsRef<[u8]>) and we call image::load_from_memory(), the image crate uses Reader::make_decoder() to create a JpegDecoder which will then immediately create a Vec<u8>, read from the BufRead it uses internally to represent the image data and pass a reference to the Vec::as_slice() to the zune_jpeg::JpegDecoder. This of course allocates a full copy of the image in memory, incrementally (since read_to_end() does not seem to pass a size hint).

Expected

For in-memory images, the JpegDecoder should not copy all of the image data to a new allocation. To the extent that it does have to allocate, it should take reasonable steps up front to determine the required capacity.

Actual behaviour

JpegDecoder will incrementally allocate for a full copy of the in-memory image data.

nashaofu commented 3 months ago

Save as JPEG does not meet the expectations. The following is a duplication code.

use image;

fn main() {
    let im = image::open("input.png").unwrap();
    im.save("output.jpeg").unwrap();
}
fintelia commented 3 months ago

@nashaofu This is an issue thread for a specific concern about memory use in the JPEG decoder, while the issue you're reporting looks like it might be caused by #2173. Please feel free to create a new issue if the fix suggested in that thread doesn't work

fintelia commented 3 months ago

@djc This is currently necessary due to an API mismatch between image and zune-jpeg (the crate we use for JPEG decoding). In the image crate, our fancier decoders take a BufRead + Seek implementation, but zune-jpeg requires a ZReaderTrait impl that can essentially only be a &[u8], Vec<u8> or similar (specifically notice that the required methods take shared references so the type cannot have interior mutability).

Tagging @etemesi254 because I believe they are currently working on changing how zune-image handles stream reading

etemesi254 commented 3 months ago

Hi, I changed it to allow it to take anything that implements a different trait, we do have stream decoding now, but because of Rust's lack of specialization, I can't blanket it to allow a it to decode BufRead+Seek.

It's currently specialized on some common types such as Buffered files, std io cursor and in memory buffers, wrapped with (a ZCursor which is a std::io::Cursor reimpl that works in no-std environments).

This is a breaking change tho, but it makes it easy to add a separate impl that allows Bufread + Seek maybe something like

struct ZBufReadSeeker<T:BufRead + Seek>{
  source: T
}

impl<T:BufRead + Seek> ZByteReaderTrait for  ZBufReadSeeker<T>{

}
djc commented 3 months ago

@etemesi254 great to hear this is being addressed -- I think your solution makes sense. So I guess this is just a matter of waiting for these improvements to be released?

etemesi254 commented 3 months ago

The dev branch contains the changes already, still doing some benchmarks and fuzz testing to see the impact of the changes, then a 0.5 rc release is planned, (but want to see if i can get some perf changes before then hence why I've been delaying)

etemesi254 commented 3 months ago

Hi, hope I didn't keep you waiting for long, made a 0.5.0-rc0 release that should contain a fix for this.

The library can now read from anything that implements Bufread + Seek as long as the std feature is enabled. This is usually enabled by default so the only thing that changes is bumping up the dependency and fixing some breaking changes

fintelia commented 3 months ago

Created #2198 to try out the new release