image-rs / image

Encoding and decoding images in Rust

Library wide memory limits #938

Closed birktj closed 8 months ago

birktj commented 5 years ago

By now most decoder libraries have some sort of Limits struct or sanity checks that stop them from trying to decode images that would make them run out of memory. However, this library doesn't have anything that could be used to configure these limits on the individual decoders. We should also use them to set a limit on the size of ImageBuffer and the like: with https://github.com/image-rs/image-png/pull/115, for example, the png decoder will happily decode an image that is 100 000 x 100 000 pixels as long as it doesn't use too much memory. In these cases we should have a second check that we are not allocating an ImageBuffer of that size.
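
As a purely illustrative sketch of the kind of "second check" described above (the function and its parameters are hypothetical, not part of any existing API), a dimension check before allocating the output buffer might look like:

// Hypothetical sketch: verify that the reported dimensions fit in a byte
// budget before allocating the output buffer.
fn check_output_size(
    width: u64,
    height: u64,
    bytes_per_pixel: u64,
    max_image_bytes: u64,
) -> Result<(), &'static str> {
    let needed = width
        .checked_mul(height)
        .and_then(|pixels| pixels.checked_mul(bytes_per_pixel))
        .ok_or("image dimensions overflow u64")?;
    if needed > max_image_bytes {
        return Err("decoded image would exceed the configured memory limit");
    }
    Ok(())
}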

Shnatsel commented 5 years ago

TIFF now also implements a memory limit based on the maximum buffer size, just like the png crate: https://github.com/image-rs/image-tiff/blob/f6a91065d937fac56e2b70dae90da20e40fa46c1/src/decoder/mod.rs#L125-L136

HeroicKatora commented 5 years ago

And y4m also adopted one based on buffer size here (even though that's less relevant for images right now).

fintelia commented 5 years ago

There is also the ImageDecoder::total_bytes method, which is implemented for all decoders. That doesn't count intermediate allocations done while decoding an image, but those should generally be quite small. Perhaps that method could be used to have the new Reader struct take an optional memory limit on the size of images to load?
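
For illustration only (the optional limit on the Reader is hypothetical at this point), such a check could be as simple as comparing the value reported by ImageDecoder::total_bytes against the configured maximum:

// Sketch only: `total_bytes` would come from `ImageDecoder::total_bytes()`,
// `limit` from a hypothetical option on the io::Reader.
fn check_total_bytes(total_bytes: u64, limit: Option<u64>) -> Result<(), &'static str> {
    match limit {
        Some(max_bytes) if total_bytes > max_bytes => {
            Err("image exceeds the configured memory limit")
        }
        _ => Ok(()),
    }
}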

HeroicKatora commented 5 years ago

With the new io::Reader interface we could also support this in a semi-opaque way by adding a method to set a limit. This could have three possible internal states:

impl<R> Reader<R> {
    /// Remove all limits.
    pub fn no_limits(&mut self) { }
    /// Apply a heuristic set of default limits.
    pub fn smart_limits(&mut self) { }
    /// Apply an explicit, caller-provided set of limits.
    pub fn set_limits(&mut self, _: Limits) { }
}

fintelia commented 5 years ago

I posted something very similar in the other thread. If we want to have limits on decoder::dimensions or decoder::total_bytes then we wouldn't need any special buy-in from implementations. I don't think any of them do large allocations at creation time, only when actually directed to decode the image, so we can query metadata without risking OOMs.

Ideally the limits should be in terms of total memory used, but the amount of memory used for the output seems like a reasonable compromise for now: for reasonably sized images the allocations required are at most 2x (or 3x?) the total output size.

birktj commented 4 years ago

I'm going to take a second try at this. I think I have found a reasonably extensible approach that should work nicely.

I propose to add the following struct:

impl Limits {
    /// Decode without any limits on resource usage.
    pub fn no_limits() -> Limits { }

    /// Set a relaxed limit on the number of bytes that may be allocated during decoding. This is a
    /// "best effort" limit which may be exceeded in some cases.
    pub fn set_max_bytes_relaxed(&mut self, max_bytes: u64) { }

    /// Set a strict limit on the number of bytes that may be allocated during decoding. If the
    /// decoder cannot guarantee to uphold this limit then decoding will fail.
    pub fn set_max_bytes_strict(&mut self, max_bytes: u64) { }

    // Easy to add other limits in the future.
}

And extend the Reader struct with the following methods. I think the default for Reader should be some reasonable default set of limits (currently relaxed).

impl<R> Reader<R> {
    /// Disable all limits.
    pub fn no_limits(&mut self) { }

    /// Set resource limits for decoding.
    pub fn set_limits(&mut self, limits: Limits) { }
}

And extend the ImageDecoder trait with the following method, which has a default implementation:

trait ImageDecoder {
    // ...existing trait methods...

    /// Set limits for this decoder.
    fn set_limits(&mut self, limits: Limits) -> ImageResult<()> {
        // The default implementation cannot guarantee a strict limit.
        if limits.max_bytes.is_strict() {
            return Err(ImageError::Limits(error::LimitError::from_kind(
                error::LimitErrorKind::CannotGuarantee)))
        }

        // For a relaxed limit, at least reject images whose decoded size
        // already exceeds the budget.
        if let Some(max_bytes) = limits.max_bytes.get() {
            if max_bytes < self.total_bytes() {
                return Err(ImageError::InsufficientMemory)
            }
        }

        Ok(())
    }
}

Shnatsel commented 4 years ago

I'm not convinced that decoding failing at runtime is a good idea for strict limits. I'd rather have the strict limits method not be implemented on those decoders.

Speaking of relaxed and strict limits, I think it makes sense to separately limit the maximum size of the final image (which is easy to enforce for all decoders) and the RAM allocated in addition to storing the final image (which some decoders might not be able to support).
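
Purely as an illustration of that split (field names here are hypothetical, not part of any agreed API), the limits could look something like:

// Hypothetical sketch of splitting the two limits discussed above.
pub struct Limits {
    /// Maximum number of bytes the decoded image itself may occupy.
    /// Enforceable for every decoder via `ImageDecoder::total_bytes`.
    pub max_image_bytes: Option<u64>,
    /// Maximum number of bytes of auxiliary allocations made while
    /// decoding, on top of the output buffer. Not every decoder may be
    /// able to honour this.
    pub max_intermediate_bytes: Option<u64>,
}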

birktj commented 4 years ago

For it to work together with io::Reader it would have to be a runtime check, right?

The idea is that it should be easy to add different kinds of fine-grained limits in the future. However, your idea of separating the total image size from other potential allocations sounds like a nice starting point. The maximum image size doesn't even need the strict/relaxed distinction.

Shnatsel commented 4 years ago

Yes, I suppose we would need a runtime check.

Total size of the resulting image is relatively easy to enforce, so I'd start with adding that to everything and then see if we need any limits that are more specialized. If we do, we can add them on a per-decoder basis.

Also, I wonder how this would interact with multi-frame images such as animated GIF or APNG? I'm tempted to set a limit per frame, since it's easy to hand-roll a total size limit if you have a per-frame limit, but not vice versa.
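
To illustrate the "hand-roll a total limit on top of a per-frame limit" point (the frame source and byte accounting below are placeholders, not a real API):

// Sketch: each frame is already capped by the per-frame limit inside the
// decoder; the caller additionally keeps a running total budget.
fn collect_frames(
    mut next_frame: impl FnMut() -> Option<Vec<u8>>,
    total_budget: u64,
) -> Result<Vec<Vec<u8>>, &'static str> {
    let mut remaining = total_budget;
    let mut frames = Vec::new();
    while let Some(frame) = next_frame() {
        remaining = remaining
            .checked_sub(frame.len() as u64)
            .ok_or("animation exceeds the total memory budget")?;
        frames.push(frame);
    }
    Ok(frames)
}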

fintelia commented 4 years ago

@birktj how does your design deal with adding additional strict limits later? Wouldn't decoders have to know about the change or else they'd silently ignore the new limits?

birktj commented 4 years ago

Looking over the tiff decoder I see that we also still have the problem of setting limits before metadata is read. There the entire IFD is read into memory in order to get the metadata. This is problematic because it is one of the things we would probably like more fine-grained limits for. (Of course, this may simply be a problem with the implementation in the tiff crate that we can fix later.)

This was also discussed in https://github.com/image-rs/image/pull/1088#discussion_r352300471

birktj commented 4 years ago

@fintelia I realize this would be a slight problem for external decoders. For internal decoders I thought the best approach would be to destructure the Limits struct to make sure all checks were accounted for; however, this is obviously not possible for external decoders.

Maybe some sort of ack_limit and remaining_limits could work, though this feels a bit messy?
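
For illustration, the exhaustive-destructuring idea looks roughly like this (field names hypothetical): adding a new field to Limits then stops internal decoders from compiling until they handle it, which is exactly what an external decoder cannot rely on.

// Sketch of the destructuring trick for internal decoders.
struct Limits {
    max_image_bytes: Option<u64>,
    max_intermediate_bytes: Option<u64>,
}

fn apply_limits(limits: &Limits) {
    // Exhaustive pattern: a newly added field makes this a compile error
    // until the decoder explicitly accounts for it.
    let Limits {
        max_image_bytes,
        max_intermediate_bytes,
    } = limits;
    let _ = (max_image_bytes, max_intermediate_bytes);
}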

HeroicKatora commented 4 years ago

Looking over the tiff decoder I see that we also still have the problem of setting limits before metadata is read.

Instead of requiring limits before reading, could we avoid buffering the whole IFD into memory? Maybe we could skip all tags that are not relevant to the methods required by ImageDecoder and then re-read the image by seeking back to the start.

birktj commented 4 years ago

Yeah, I believe we can probably just consider this a quirk in the tiff crate that could be fixed over there at some point.

birktj commented 4 years ago

Actually, on closer inspection it doesn't look so bad: it seems the hashmap only contains pointers to the real data, and reading of potentially large data types like strings is delayed until they are needed. This should mean that setting limits after the decoder is created should be fine.

Shnatsel commented 4 years ago

Unless I can create usize::MAX hashtable entries or some such. Not sure about TIFF, but Vorbis totally lets me do that.

HeroicKatora commented 4 years ago

The number of baseline tags is finite and IIRC they can only appear once. Maybe we should even use tinyvec plus a bitset or so to store them instead of a HashMap in any case.
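
A rough sketch of that fixed-storage idea (the tag count, index mapping, and Entry type are placeholders, not the tiff crate's real types):

// Sketch: a presence bitmask plus a fixed-size array instead of a HashMap,
// usable because the set of baseline tags is small and known up front.
const BASELINE_TAG_COUNT: usize = 32;

#[derive(Clone, Copy)]
struct Entry {
    offset: u64, // where the tag's values live in the file
    count: u32,  // number of values
}

struct BaselineTags {
    present: u32, // bit i set => tag with index i was seen
    entries: [Entry; BASELINE_TAG_COUNT],
}

impl BaselineTags {
    fn new() -> Self {
        BaselineTags {
            present: 0,
            entries: [Entry { offset: 0, count: 0 }; BASELINE_TAG_COUNT],
        }
    }

    fn insert(&mut self, index: usize, entry: Entry) {
        self.present |= 1 << index;
        self.entries[index] = entry;
    }

    fn get(&self, index: usize) -> Option<Entry> {
        if self.present & (1 << index) != 0 {
            Some(self.entries[index])
        } else {
            None
        }
    }
}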

fintelia commented 4 years ago

Also, TIFF tags are only 16 bits, so even if the image contained every possible tag (most of which aren't even defined...) the hashtable would only take a few megabytes.
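
As a rough back-of-the-envelope check (the per-entry size here is an assumption, not from the thread): 2^16 = 65 536 possible tag values, and at something like 50-100 bytes per hash map entry that works out to roughly 3-7 MB.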