Closed: Shnatsel closed this issue 8 months ago
Woah! `gif` only has 16-bit sizes so this is definitely excessive.
Or just enough to be possible with 2^32 * 4 bytes (width × height × bpp). Either way, it's a good reminder to have better limits.
For more on memory limits see #938
In fact, the code from #876 does implement a memory limit: `width * height < 2_000_000_000`, which at 4 bytes per pixel enforces a limit of about 7.45 GiB. I don't think this particular case is especially worrying, but we should have a more general solution.
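A quick sanity check on that arithmetic:

```rust
fn main() {
    // The check allows up to 2_000_000_000 pixels; at 4 bytes per
    // pixel (RGBA8) that permits allocations of up to:
    let max_bytes: u64 = 2_000_000_000u64 * 4;
    let gib = max_bytes as f64 / (1024.0 * 1024.0 * 1024.0);
    assert!((gib - 7.45).abs() < 0.01);
    println!("{gib:.2} GiB"); // prints "7.45 GiB"
}
```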
I alluded to this in that other thread, but perhaps our new Reader API should have:
```rust
/// Set the limit on total memory use. Attempting to read a file which would take more
/// space than this will fail with ImageError::InsufficientMemory. Defaults to 128 MB;
/// set to `usize::MAX` to disable.
Reader::with_memory_limit(bytes: usize);

/// Set the limit on image size. Defaults to 16,384 x 16,384.
Reader::with_size_limit(width: u32, height: u32);
```
Good catch, I had for some reason thought the limit was `2_000_000` pixels and there was a verification bug in the existing code. I'm glad it's just a little less bad.
It's worth noting that the `2_000_000_000` check is not actually present in the library. The code in #876 that performs it is just a fuzzing harness and is not part of the crate itself.
I've updated the reproducing code to the latest version of `image`, and I can still reproduce the abnormally high memory consumption. These files blow past the explicitly set size and allocation limits to use up over 7 GB of memory in `image` v0.24.7.
Code used:
```rust
use image::AnimationDecoder;
use image::io::Limits;
use std::env;
use std::fs::File;
use std::io::Read;

fn main() {
    let filename = env::args().nth(1).expect("No file given");
    println!("Decoding {:?}", filename);
    let mut file = File::open(filename).unwrap();
    let mut contents: Vec<u8> = Vec::new();
    // Read the whole file into the buffer
    let _num_read = file.read_to_end(&mut contents).unwrap();
    gif_decode(&contents);
}

fn gif_decode(data: &[u8]) {
    let mut limits = Limits::default();
    limits.max_image_height = Some(16384); // 16384 * 16384 * 4 bytes per pixel = 1 GiB
    limits.max_image_width = Some(16384);
    limits.max_alloc = Some(1024 * 1024 * 1024); // 1 GiB
    if let Ok(decoder) = image::codecs::gif::GifDecoder::with_limits(data, limits) {
        let mut frames = decoder.into_frames();
        while let Some(Ok(_)) = frames.next() {
            // we don't use the frames for anything
        }
    }
}
```
Notable differences from the previous code: instead of calling `.collect_frames()`, it reads frames one-by-one and discards the contents immediately.

If I did the math right, both size and memory limiting are ineffective for these files - they should both constrain memory usage to 1 GiB, but I am seeing up to 7 GiB of memory usage in practice.
If you call `GifDecoder::set_limits`, that should at least return an error if the dimensions exceed what you set. Really, we should deprecate `GifDecoder::with_limits` and for now have it internally call `new` followed by `set_limits`.
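That shim could look roughly like this, sketched with a stand-in `Decoder` type and `String` errors (the real `GifDecoder` signatures differ; this only illustrates the suggested deprecation):

```rust
struct Limits;

struct Decoder;

impl Decoder {
    fn new() -> Result<Self, String> {
        Ok(Decoder)
    }

    /// Validates the limits (e.g. dimension checks) and stores them.
    fn set_limits(&mut self, _limits: Limits) -> Result<(), String> {
        Ok(())
    }

    /// Kept only for compatibility: internally just new() + set_limits().
    #[deprecated(note = "use new() followed by set_limits()")]
    fn with_limits(limits: Limits) -> Result<Self, String> {
        let mut decoder = Self::new()?;
        decoder.set_limits(limits)?;
        Ok(decoder)
    }
}

fn main() {
    #[allow(deprecated)]
    let decoder = Decoder::with_limits(Limits);
    assert!(decoder.is_ok());
}
```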
Using `.new()` + a separate `.set_limits()` call does enforce the width/height limits! These files still blow right past the allocation limit, but I'm not sure if that's even supported for `gif`.
It might be useful to at least do a crude check against the memory limit - `width * height * bytes_per_pixel <= limits.max_alloc` - and return an error if the image clearly exceeds that.
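A minimal sketch of such a pre-check, as a hypothetical free function (the name, parameters, and error type are assumptions, not the actual `image` internals):

```rust
/// Hypothetical pre-decode check: reject images whose decoded size
/// obviously exceeds the allocation limit. `bytes_per_pixel` would be
/// derived from the color type (e.g. 4 for RGBA8).
fn check_alloc_limit(
    width: u64,
    height: u64,
    bytes_per_pixel: u64,
    max_alloc: Option<u64>,
) -> Result<(), String> {
    if let Some(limit) = max_alloc {
        // Checked arithmetic so crafted dimensions can't overflow the check.
        let needed = width
            .checked_mul(height)
            .and_then(|pixels| pixels.checked_mul(bytes_per_pixel))
            .ok_or_else(|| "dimensions overflow".to_string())?;
        if needed > limit {
            return Err(format!("needs {needed} bytes, limit is {limit}"));
        }
    }
    Ok(())
}

fn main() {
    // A 65535 x 65535 RGBA frame is ~16 GiB, far above a 1 GiB limit.
    assert!(check_alloc_limit(65_535, 65_535, 4, Some(1 << 30)).is_err());
    // A 16384 x 16384 RGBA frame is exactly 1 GiB, so it passes.
    assert!(check_alloc_limit(16_384, 16_384, 4, Some(1 << 30)).is_ok());
}
```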
Oh, there is native support for a memory limit in the `gif` crate: https://docs.rs/gif/latest/gif/struct.MemoryLimit.html. It's just not wired up to `image` for some reason.
Hmm, I think the `gif` crate limits are not expressive enough. They only allow setting a limit of up to `u32` bytes: https://docs.rs/gif/latest/src/gif/reader/mod.rs.html#32-63

But a GIF can decode into up to `width * height * bytes_per_pixel` bytes, which can be as much as `u16::MAX * u16::MAX * 4` - about 16 GiB - while `u32::MAX` is only 4 GiB.
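Spelling out the arithmetic in `u64`, to sidestep the very overflow in question:

```rust
fn main() {
    // Largest possible GIF frame: u16::MAX x u16::MAX pixels at 4 bytes each.
    let max_frame_bytes = u16::MAX as u64 * u16::MAX as u64 * 4;
    assert_eq!(max_frame_bytes, 17_179_344_900); // just under 16 GiB

    // The largest value a u32-based limit can express: ~4 GiB.
    let u32_limit = u32::MAX as u64;
    assert_eq!(u32_limit, 4_294_967_295);

    // So a u32 limit cannot even represent the worst-case frame size.
    assert!(max_frame_bytes > u32_limit);
}
```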
I've opened a pull request for the `gif` crate to fix its memory limit implementation: https://github.com/image-rs/image-gif/pull/156. Once that's merged and shipped, I can pass the memory limit set in `image` through to the underlying `gif` decoder and finally fix this.
I just checked the `gif` crate. It only checks the memory limit in `read_next_frame`, which is never called from the `image` crate. The limit also only applies to the memory allocation used for the output buffer, so any scratch buffers used during the decoding process aren't counted.
However, the general convention in the `image` crate is that decoders should not count output buffers towards the allocation limit, because the API of `ImageDecoder::read_image` is specifically designed so that the allocation happens in the caller's code and not within the decoder. The higher-level `io::Reader` API then decrements the allocation limit by the output buffer size before decoding starts.
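A rough sketch of that convention, with stand-in types (`Limits` here is a simplification, not the real `image::io::Limits`): the caller reserves the output buffer's size against the budget before handing control to the decoder, so the decoder only accounts for its own scratch allocations.

```rust
struct Limits {
    max_alloc: Option<u64>,
}

impl Limits {
    /// Deduct `bytes` from the remaining allocation budget,
    /// failing if the budget would be exceeded.
    fn reserve(&mut self, bytes: u64) -> Result<(), String> {
        if let Some(remaining) = self.max_alloc {
            if bytes > remaining {
                return Err("InsufficientMemory".to_string());
            }
            self.max_alloc = Some(remaining - bytes);
        }
        Ok(())
    }
}

fn main() {
    let mut limits = Limits { max_alloc: Some(1 << 30) }; // 1 GiB budget
    // Output buffer for a 16384 x 16384 RGBA image: exactly 1 GiB.
    assert!(limits.reserve(16_384 * 16_384 * 4).is_ok());
    // Any further scratch allocation must now fail.
    assert!(limits.reserve(1).is_err());
}
```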
It is great that `image` implements a generic approach that counts the output buffer size towards the limit! But it does not appear to be working in this case. If I take the above code and comment out the limits on the dimensions, it creates a very large output buffer and consumes 7 GiB again. I suspect that check is not performed in `GifDecoder.into_frames().next()`. Any tips on how to plumb it in there in a robust manner?
The `into_frames` API is kind of a mess, so I'm not sure if there's a clean way to do it. But you probably have to pass in either the limits, or the error that the limits are exceeded, as a field within the object that gets returned. Then, when `next` is called, you'd return that error instead of decoding a frame.
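One way that could look, with stand-in types (`Frames`, the frame type, and the `String` error are simplifications, not the real `image` API):

```rust
struct Frames {
    /// Error recorded at construction time if the limits were
    /// already known to be exceeded.
    pending_error: Option<String>,
    /// Stand-in for the real decoding state.
    remaining: u32,
}

impl Iterator for Frames {
    type Item = Result<&'static str, String>;

    fn next(&mut self) -> Option<Self::Item> {
        // Surface the stored limit error instead of decoding anything.
        if let Some(err) = self.pending_error.take() {
            self.remaining = 0; // stop iteration after reporting the error
            return Some(Err(err));
        }
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        Some(Ok("frame"))
    }
}

fn main() {
    let mut frames = Frames {
        pending_error: Some("DimensionError: limits exceeded".to_string()),
        remaining: 3,
    };
    // The first call surfaces the stored error; no frame is decoded.
    assert!(matches!(frames.next(), Some(Err(_))));
    // Iteration then ends.
    assert!(frames.next().is_none());
}
```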
Tested on the latest git commit as of this writing: ac93e755b51def00c93e7fadd33a5e5593afa1e2
Actual behavior
Decoding a crafted file a mere 469 bytes in size takes over 7 gigabytes of memory. And not just virtual memory - it actually gets used and backed by physical memory.

This sample does not pose a problem as is, but the same technique could plausibly be used to create very small files that OOM the system.
Expected
ImageMagick rejects these files immediately. For the exact reason for each file, see https://ncry.pt/p/6bPn#OPffqnjbN7RwGKJMwA0BQl3wuNlynKoUxw2dXZx9UKY
Reproduction steps
Same reproduction code as in #876
I'm attaching all samples classified as hangs by AFL so that you have the most complete info. The file `id:000002,src:000000+000052,op:splice,rep:128` is the most egregious in terms of time and memory consumption.

Attachment: image_gif_timeouts.zip