image-rs dramatically increases filesize of animated gifs that are loaded in and re-encoded

sam0x17 commented 2 years ago

This happens when encoding animated gifs. Some animated gifs become significantly larger (more than 50%) when they get loaded into image-rs, and then re-encoded and saved without making any other changes.

Expected

When loading an animated gif into image-rs and re-saving it, you should get roughly the same filesize or smaller.

Actual behaviour

Oftentimes I end up with a much larger filesize (sometimes even 2x or 5x). My guess is that no optimization of color palettes is happening, or something along those lines, which results in poor compression. Just a guess.

Reproduction steps

This image starts at 641261 bytes and comes out as 2613166 bytes (about 4x the original size, i.e. roughly a 300% increase).

large_animated

code:

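use std::fs::File;

// Import paths assumed for image 0.23.x; they may differ in other versions.
use image::codecs::gif::{GifDecoder, GifEncoder};
use image::imageops::{self, FilterType};
use image::{AnimationDecoder, Frame};

fn main() {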
    let f = File::open("large_animated.gif").unwrap();
    let decoder = GifDecoder::new(f).unwrap();
    let dest_w: u32 = 352; // hard coding width and height to that of the
    let dest_h: u32 = 240; // input image for demonstration purposes
    if let Ok(frames) = decoder.into_frames().collect_frames() {
        let mut resized_frames = Vec::new();
        for frame in &frames {
            let resized =
                imageops::resize(&frame.buffer().clone(), dest_w, dest_h, FilterType::CatmullRom);
            resized_frames.push(Frame::new(resized));
        }
        let gif_out = File::create("out.gif").unwrap();
        let mut encoder = GifEncoder::new(gif_out);
        encoder.set_repeat(image::gif::Repeat::Infinite).unwrap();
        encoder.encode_frames(resized_frames.into_iter()).unwrap();
    }
}

Impact

This is causing problems because we are using image-rs to process animated gifs in our platform's user-facing file uploader (for a service used by hundreds of thousands of users). Because we send images over SMS, there is a strict filesize limit and users are providing gifs that meet the requirements but then because image-rs increases the size significantly, these often fail to send.

sam0x17 commented 2 years ago

Note that we are resizing to the same size here just as a toy example. In reality we are usually downsizing a bit. Resizing to the same size should be a no-op.

HeroicKatora commented 2 years ago

I'm unsure how to turn this report into something actionable. Everything here is working as documented; decoding into a pixel matrix does not preserve format-specific metadata.

Are you searching for guidance on how to contribute engineering work effectively? Given that you described an affected business case, you might even have a budget for such work.
Do you have a concrete idea for improving this roundtrip? If so, feel free to provide a sketch of an implementation.
Are you unsure whether you missed a known workaround or a better code structure to avoid the problem?

sam0x17 commented 2 years ago

Totally understand, let me clarify. The problem isn't that metadata isn't being preserved; it's that the compression image-rs produces for animated gifs is dramatically worse than the compression of the input images, across the board. I don't think this is a metadata thing. image-rs is clearly doing something wrong or sub-optimal in how it encodes animated gifs, and that results in a dramatically larger filesize. I thought that would be something you guys would like to fix?

If not, I'm happy to investigate this myself. My theory is that it's something to do with color palettes, but I would appreciate any theories or guidance.

HeroicKatora commented 2 years ago

The palette isn't the only metadata. If the input uses DisposalMethod::Keep frames, then we expand each of the frames to a full image, but re-encoding doesn't try to find smaller delta-encoded frames. Use gif optimizers to reduce file size in this case, or use gif directly for encoding with detection of appropriate regions. I'm not aware of a Rust library for detecting those regions. If you happen to modify code to recalculate them (see the point on blending below), please ping and we can figure out how to add it to the encoding process in image.
The palette is also metadata, and at encoding time it is re-calculated from scratch. Slightly confusing: when encoding the exact image (which has < 256 colors), I had the impression that we chose those colors as the palette, so the only difference might be the order of the colors. That does have an influence on size, but it should be small due to LZW coding. Maybe we should improve test coverage to know more specifically.
However, when you resize with interpolation, then surely a lot of newly blended colors are created. Then the imagequant kicks in to create a completely new palette. The delta region is also not being preserved.
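To illustrate the "detection of appropriate regions" mentioned above, here is a minimal sketch that finds the smallest rectangle containing every pixel that differs between two full frames. It is not code from image or gif; `Rect` and `changed_region` are hypothetical names, and the frames are assumed to be tightly packed RGBA8 buffers of equal size. An encoder could then write only that cropped rectangle as the next frame, with a Keep disposal, instead of the full canvas.

```rust
/// Hypothetical helper type: a rectangle in canvas coordinates.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct Rect {
    pub left: u32,
    pub top: u32,
    pub width: u32,
    pub height: u32,
}

/// Returns the smallest rectangle containing every pixel that differs
/// between `prev` and `next`, or `None` if the two frames are identical.
/// Both buffers are assumed to be RGBA8, `width * height * 4` bytes each.
pub fn changed_region(prev: &[u8], next: &[u8], width: u32, height: u32) -> Option<Rect> {
    let expected = width as usize * height as usize * 4;
    assert_eq!(prev.len(), expected, "prev has the wrong size");
    assert_eq!(next.len(), expected, "next has the wrong size");

    let (mut min_x, mut min_y) = (u32::MAX, u32::MAX);
    let (mut max_x, mut max_y) = (0u32, 0u32);
    let mut any_change = false;

    for y in 0..height {
        for x in 0..width {
            let i = (y as usize * width as usize + x as usize) * 4;
            // Compare all four channels of the pixel at (x, y).
            if prev[i..i + 4] != next[i..i + 4] {
                any_change = true;
                min_x = min_x.min(x);
                max_x = max_x.max(x);
                min_y = min_y.min(y);
                max_y = max_y.max(y);
            }
        }
    }

    if any_change {
        Some(Rect {
            left: min_x,
            top: min_y,
            width: max_x - min_x + 1,
            height: max_y - min_y + 1,
        })
    } else {
        None
    }
}
```

An exact comparison like this only pays off when consecutive frames really share identical pixels. After interpolated resizing and re-quantization many pixels change slightly, so a practical version would probably need a tolerance threshold.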

HeroicKatora commented 2 years ago

Hm, might the size largely be caused by encoding the palette for each frame instead of utilizing a single global palette as in the original image?

Frame Sequence Before

```
Frame: delay: 8 canvas: 352x240+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 351x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x225+0+14 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x230+0+9 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 351x239+1+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x235+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 350x239+2+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 320x235+32+1 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 304x214+0+7 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 351x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 339x226+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 338x235+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 352x236+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 350x239+2+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 10 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 351x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x236+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 326x239+15+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 343x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 349x232+3+7 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 333x239+8+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 347x239+5+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 338x239+14+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 329x239+23+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x224+0+15 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 350x239+2+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 347x239+5+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 347x236+5+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 334x239+18+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 342x239+3+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x237+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 351x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 330x239+8+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 352x239+0+0 dispose: Keep needs_input: false palette: false
Frame: delay: 8 canvas: 348x239+4+0 dispose: Keep needs_input: false palette: false
```

After

``` Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 
dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false 
palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 352x240+0+0 dispose: Background needs_input: false palette: true Frame: delay: 0 canvas: 
352x240+0+0 dispose: Background needs_input: false palette: true ```

That could be addressed much more directly.

sam0x17 commented 2 years ago

I'll poke around the code and see what I can figure out. For my purposes it sounds like the best approach might be to do k-means clustering on all the colors in all the frames and then come up with a single palette that fits the overall animation best and use that on all frames for minimal file size. For a general purpose approach I'd be curious what is typically done.

sam0x17 commented 2 years ago

Oh looks like gifski might do what I want!! https://github.com/ImageOptim/gifski

HeroicKatora commented 2 years ago

Hm, some additional investigation makes me reconsider whether it is the local palette. I've reworked gif's encoder to create frames based on a persistent quantization map (PR incoming); however, that barely decreased the size, to 2527K from the current 2552K. All individual LZW-encoded chunks add up to the file size as well, so it's not some confusion in the decoder with duplicate writes. The last potential issue is the chosen code size: it could have been that the original file somehow managed to fit its pixels into 128 colors and coded 7-bit symbols, but this turned out not to be the case. Both decoding and re-encoding use an 8-bit initial code size. Largely untested is whether LZW encoding differs significantly. I'll have to see how to run such tests in the first place, but the added info stream (i.e. maximum coded lengths, symbol use, reset intervals) could be a valuable addition to weezl anyway.
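A back-of-the-envelope bound is consistent with that measurement. The sketch below assumes 68 frames (as counted from the frame listing earlier in the thread) and the largest possible local color table of 256 RGB entries per frame; `local_palette_overhead` is a hypothetical helper, not part of gif or image.

```rust
/// Upper bound, in bytes, on the space taken by local color tables:
/// each palette entry is 3 bytes (R, G, B).
pub fn local_palette_overhead(frames: u64, colors_per_palette: u64) -> u64 {
    frames * colors_per_palette * 3
}

/// File sizes reported in the original post, in bytes.
pub const INPUT_BYTES: u64 = 641_261;
pub const OUTPUT_BYTES: u64 = 2_613_166;

/// How much larger the re-encoded file is than the input.
pub fn growth_bytes() -> u64 {
    OUTPUT_BYTES - INPUT_BYTES
}
```

With these assumptions the palettes account for at most 52,224 bytes, which is under 3% of the 1,971,905 bytes of growth. Switching to a global palette therefore cannot recover more than that directly, which fits the small saving reported above.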

Shnatsel commented 3 days ago

I suspect what's happening here is that our decoder composites frames to make each one a full image, but then the encoder doesn't minify them back to store only the areas that changed between frames. So we end up storing information that's unchanged between frames over and over.

kornelski commented 2 days ago

It is perfectly normal for GIF that scaling an image down increases the file size.

This is because when you resize with an anti-aliasing filter, you're creating thousands of new colors. This requires a new palette. Different novel blends of colors across the frames typically cause every frame to have a new unique palette, even when all frames shared the same palette before resizing.
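As a small illustration of how blending creates new colors, the sketch below halves a row of pixels by averaging each horizontal pair, which is about the simplest smoothing resampler there is. The function names are hypothetical and the code works on plain RGB triples rather than on the image crate's types.

```rust
use std::collections::HashSet;

type Rgb = [u8; 3];

/// Downscale one row to half its width by averaging each pair of pixels.
/// A trailing unpaired pixel is dropped.
pub fn halve_row_average(row: &[Rgb]) -> Vec<Rgb> {
    row.chunks_exact(2)
        .map(|pair| {
            let mut out = [0u8; 3];
            for c in 0..3 {
                // Widen to u16 so the sum cannot overflow.
                out[c] = ((pair[0][c] as u16 + pair[1][c] as u16) / 2) as u8;
            }
            out
        })
        .collect()
}

/// Collect the set of distinct colors in a pixel slice.
pub fn unique_colors(pixels: &[Rgb]) -> HashSet<Rgb> {
    pixels.iter().copied().collect()
}
```

A row that uses only black and white comes out with a mid-gray wherever a black pixel was averaged with a white one, so a two-color input already needs a three-color palette. Real frames with up to 256 colors and a wider filter kernel produce far more distinct blends.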

Remapping to a new palette will create a new dithering pattern, and noise from dithering generally doesn't compress well when using lossless LZW. It is extra noisy (and thus extra expensive to compress) when you dither a previously dithered image (and all GIFs from video clips are dithered).

If you're not scaling down using an exact box filter and an integer ratio, then blurring between adjacent scaled-down pixels will likely turn previously simple repetitive patterns into new unique non-repeating patterns due to the moire effect, and that kills GIF's compression scheme, which is based solely on repetition. This is especially bad when compressing heavily dithered GIFs from single-pass ffmpeg. ffmpeg's fixed palette destroys a lot of information in GIFs, making them compress a bit better, but resizing that messy dithering creates moire patterns that, from LZW's perspective, look like new incompressible unique information. It still looks awful, but ironically costs even more in data than a high-quality original source would.

Also when you decode a GIF and create complete frames for resizing, you lose the information about partial updates (GIF can encode only a rectangular area that differed between frames, and can skip pixels that didn't change). There's no point preserving that as metadata, because resizing changes the bounds by adding anti-aliasing, new dithering, etc.

This library compresses GIF data reasonably well, but it just literally takes the data you feed it. It's really hard to avoid creating expensive-to-compress pixels when resizing a GIF, and it needs a lot of care to undo that damage. You need to be really smart about creating efficient palettes. You need special dithering to avoid adding new noise on top of previous dithering. You need to reduce differences between frames to pick minimal rectangles, but be careful not to create smudges behind moving objects or cut anti-aliased edges. I've implemented a ton of these things in gif.ski.

GIF is a super shitty format. It's so bad that it's normal that images with smaller dimensions and more blur take more space and compress worse.

sam0x17 commented 1 day ago

would nearest-neighbor scaling be better when downsizing in that case?

kornelski commented 20 hours ago

Yes, it would help.
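The reason is that nearest-neighbor sampling only copies existing pixels, so it cannot introduce colors that were not already in the frame. In the reproduction code above this would presumably mean passing FilterType::Nearest instead of FilterType::CatmullRom. The sketch below shows the idea on a raw RGBA8 buffer; `resize_nearest` is a hypothetical standalone function, not the image crate's implementation.

```rust
/// Nearest-neighbor resize of a tightly packed RGBA8 buffer.
/// Every output pixel is an exact copy of one input pixel.
pub fn resize_nearest(src: &[u8], src_w: u32, src_h: u32, dst_w: u32, dst_h: u32) -> Vec<u8> {
    assert!(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);
    assert_eq!(src.len(), src_w as usize * src_h as usize * 4);

    let mut dst = Vec::with_capacity(dst_w as usize * dst_h as usize * 4);
    for y in 0..dst_h {
        // Map the center of the destination row back to a source row.
        let sy = ((y as u64 * 2 + 1) * src_h as u64 / (dst_h as u64 * 2)) as usize;
        for x in 0..dst_w {
            // Same mapping for the column.
            let sx = ((x as u64 * 2 + 1) * src_w as u64 / (dst_w as u64 * 2)) as usize;
            let i = (sy * src_w as usize + sx) * 4;
            dst.extend_from_slice(&src[i..i + 4]);
        }
    }
    dst
}
```

The trade-off is visual quality: edges become jagged and existing dither patterns can alias, so this mainly helps when file size matters more than smoothness.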
