image-rs / image

Encoding and decoding images in Rust
Apache License 2.0
4.88k stars 606 forks source link

Large CPU and memory consumption on decoding a crafted GIF file #1052

Closed Shnatsel closed 8 months ago

Shnatsel commented 4 years ago

Tested on latest git commit as of this writing, ac93e755b51def00c93e7fadd33a5e5593afa1e2

Actual behavior

Decoding a crafted file mere 469 bytes in size takes over 7 gigabytes of memory. And not just virtual memory - it does get actually used and backed by physical memory.

This sample does not pose a problem as is, but it's possible that by applying the same technique it's possible to create very small files that OOM the system.

Expected

imagemagick rejects these files immediately. For the exact reason for each file see https://ncry.pt/p/6bPn#OPffqnjbN7RwGKJMwA0BQl3wuNlynKoUxw2dXZx9UKY

Reproduction steps

Same reproduction code as in #876

I'm attaching all samples classified as hangs by AFL so that you have the most complete info. The file id:000002,src:000000+000052,op:splice,rep:128 is the most egregious in terms of time and memory consumption.

image_gif_timeouts.zip

HeroicKatora commented 4 years ago

Woah! gif only has 16-bit sizes so this is definitely excessive.

HeroicKatora commented 4 years ago

Or just enough to be possible with 2^32 * 4 bytes (width×height×bpp). Either way, it's a good reminder to have better limits.

Shnatsel commented 4 years ago

For more on memory limits see #938

fintelia commented 4 years ago

In fact, the code from #876 does implement a memory limit. width * height < 2_000_000_000 at 4 bytes per pixel enforces a limit of about 7.45 GiB. I don't think this particular case is especially worrying, but we should have a more general solution.

I alluded to this in that other thread, but perhaps our new Reader API should have:

/// Set the limit on total memory use. Attempting to read a file which would take more 
/// space than this will fail with ImageError::InsufficientMemory. Defaults to 128MB, set to 
/// `usize::MAX` to disable.
Reader::with_memory_limit(bytes: usize);
/// Set the limit on image size. Defaults to 16,384 x 16,384.
Reader::with_size_limit(width: u32, height: u32);
HeroicKatora commented 4 years ago

Good catch, I had for some reason thought the limit was 2_000_000 pixels and there was a verification bug in the existing code. I'm glad it's just a little less bad.

Shnatsel commented 4 years ago

It's worth noting that the 2_000_000_000 bytes check is not actually there in the library. The code in #876 that does the check is just a fuzzing harness and is not present in the crate itself.

Shnatsel commented 8 months ago

I've updated the reproducing code to the latest version of image, and I can still reproduce the abnormally high memory consumption.

These files blow past the explicitly set size and allocation limits to use up over 7GB memory in image v0.24.7.

Code used:

extern crate image;

use image::AnimationDecoder;
use image::io::Limits;

use std::fs::File;
use std::io::Read;
use std::env;

fn main() {
    let filename = env::args().skip(1).next().expect("No file given");
    println!("Decoding {:?}", filename);
    let mut file = File::open(filename).unwrap();

    let mut contents: Vec<u8> = Vec::new();
    // Returns amount of bytes read and append the result to the buffer
    let _num_read = file.read_to_end(&mut contents).unwrap();
    gif_decode(&contents);
}

fn gif_decode(data: &[u8]) {
    let mut limits = Limits::default();
    limits.max_image_height = Some(16384); // 16384 * 16384 * 4 bytes per pixel = 1 GiB
    limits.max_image_width = Some(16384);
    limits.max_alloc = Some(1024 * 1024 * 1024); // 1 GiB
    if let Ok(decoder) = image::codecs::gif::GifDecoder::with_limits(data, limits) {

        let mut frames = decoder.into_frames();
        while let Some(Ok(_)) = frames.next() {
            // we don't use the frames for anything
        }
    }
}

Notable differences from the previous code:

  1. No longer uses .collect_frames(), reads frames one-by-one and discards the contents immediately
  2. Uses the built-in limiting facilities for both dimensions and memory usage

If I did the math right, both size and memory limiting are ineffective for these files - they both constrain memory usage to 1GiB, but I am seeing up to 7GiB memory usage in practice.

fintelia commented 8 months ago

If you call GifDecoder::set_limits that should at least return an error if the dimensions exceed what you set.

Really, we should deprecate GifDecoder::with_limits and for now have it internally call new followed by set_limits.

Shnatsel commented 8 months ago

Using .new() + a separate .set_limits() call does enforce the width/height limits!

These files still blow right past the allocation limit, but I'm not sure if that's even supported for gif.

Shnatsel commented 8 months ago

It might be useful to at least do a crude check against the memory limit: width * height * bit_depth <= limits.max_alloc and return an error if the image clearly exceeds that.

Shnatsel commented 8 months ago

Oh, there is native support for a memory limit in the gif crate: https://docs.rs/gif/latest/gif/struct.MemoryLimit.html

It's just not wired up to image for some reason.

Shnatsel commented 8 months ago

Hmm, I think the gif crate limits are not expressive enough. They only allow setting the limit of up to u32 bytes: https://docs.rs/gif/latest/src/gif/reader/mod.rs.html#32-63

But you can have a GIF image that decodes into up to width * height * bits_per_pixel which can be at most u16::MAX * u16::MAX * 4 which is 16GiB, while u32::MAX is 4GiB.

Shnatsel commented 8 months ago

I've opened a pull request for the gif crate to fix its memory limit implementation: https://github.com/image-rs/image-gif/pull/156

Once that's merged and shipped, I can pass through the memory limit set in image to the underlying gif decoder and finally fix this.

fintelia commented 8 months ago

I just checked thegif crate. It only checks the memory limit in read_next_frame which is never called from the image crate. It also only applies to the memory allocation used for the output buffer so any scratch buffers used during the decoding process aren't counted.

However, the general convention in the image crate is that decoders should not count output buffers towards the allocation limit because the API of ImageDecoder::read_image is specifically designed so that the allocation happens in the caller's code and not within the decoder. The higher-level io::Reader API then decrements the allocation limit by the output buffer size before decoding starts

Shnatsel commented 8 months ago

It is great that image implements a generic approach that counts the output buffer size towards the limit! But it does not appear to be working in this case.

If I take the above code and uncomment the limits on the dimensions, it creates a very large output buffer and consumes 7GiB again. I suspect that check is not performed in GifDecoder.into_frames().next(). Any tips on how to plumb it in there somewhere in a robust manner?

fintelia commented 8 months ago

The into_frames API is kind of a mess, so I'm not sure if there's clean way to do it. But you probably have to pass in either the limits or the error that the limits are exceeded as a field within the object that gets returned. And then when next is called, you'd return that error instead of decoding a frame.