image-rs / jpeg-decoder

JPEG decoder written in Rust -- currently in maintenance mode
Apache License 2.0

Add the ability to scale down images during decode (IDCT scaling) #117

Closed · kevinmehall closed this 4 years ago

kevinmehall commented 4 years ago

When loading a large image to generate a small thumbnail, jpeg-decoder currently requires allocating a large buffer and decoding the full image. It's possible to use a smaller IDCT to directly decode a JPEG image at a fraction of the full size, saving decode time and memory. This adds IDCT implementations to handle 1/8, 1/4, and 1/2 size decoding.
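
As a rough illustration of the savings (the dimensions below are made up, not taken from the PR), the output-buffer arithmetic for a large photo decoded at full size versus at 1/8 scale looks like this:

```rust
// Illustrative buffer-size arithmetic only; the 6000x4000 source is a made-up
// example. Shows why decoding at 1/8 scale saves a large allocation.
fn main() {
    let full_bytes = 6000usize * 4000 * 3; // full-size RGB output: ~72 MB
    let eighth_bytes = (6000usize / 8) * (4000 / 8) * 3; // 750x500 RGB: ~1.1 MB
    println!(
        "full: {} bytes, 1/8 scale: {} bytes ({}x smaller)",
        full_bytes,
        eighth_bytes,
        full_bytes / eighth_bytes
    );
}
```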

The interface is a new Decoder constructor that accepts a requested image size. The implementation rounds this up to the nearest scale for which an IDCT implementation exists. This seems nicer than libjpeg's scaling API, which represents the size as a fraction and requires the user to know which scale factors are available.
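
A minimal sketch of how the proposed constructor might be used; the exact signature isn't shown in this thread, so the Decoder::scaled call and its argument order below are assumptions:

```rust
use std::fs::File;
use std::io::BufReader;

use jpeg_decoder::Decoder;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::open("large_photo.jpg")?;
    // Hypothetical constructor from this PR's first revision: the requested
    // size is rounded up to the nearest supported IDCT scale (1/8, 1/4, 1/2,
    // or full size) and the image is decoded directly at that scale.
    let mut decoder = Decoder::scaled(BufReader::new(file), 160, 160);
    let pixels = decoder.decode()?;
    let info = decoder.info().expect("decode() fills in the image metadata");
    println!("thumbnail is {}x{} ({} bytes)", info.width, info.height, pixels.len());
    Ok(())
}
```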

lovasoa commented 4 years ago

👍 on this, this is a great feature to have! Can someone review this? @fintelia maybe?

kevinmehall commented 4 years ago

Failing tests are due to the use crate:: syntax, which isn't available in Rust 1.28, but all PRs currently seem to fail on 1.28 because dependencies have updated to the 2018 edition. I've opened https://github.com/image-rs/jpeg-decoder/pull/118 to raise the minimum Rust version to 1.34.

fintelia commented 4 years ago

I like the idea of exposing this functionality.

Would it be possible to control the scale factor via a separate modifier function on Decoder rather than a new constructor? That way the user would be able to see the image's metadata before deciding whether to try rescaling, and it would also compose with other similar functionality without an exponential blowup in the number of constructors required.

I'm also not completely sold on taking minimum output dimensions as an argument. If the only possible downscale factors are 2, 4, and 8, why not have the user pass one of those directly? Another point: I'd expect users often want either to decode to exactly the target size, or to decode to something substantially larger so they can produce a nicer downsampled result.

kevinmehall commented 4 years ago

Would it be possible to control the scale factor via a separate modifier function on Decoder rather than making a new constructor?

Sure, I'll try that. It will take some refactoring of parser.rs, because it currently computes values during header parsing that would need to be recomputed after the scale factor changes.

If the only possible downscale factors are 2, 4 and 8, why can't we just have the user pass one of those directly?

It's possible to add other scale factors N/8 by adding an NxN IDCT; libjpeg supports all values of N between 1 and 16. It accepts a fraction N/M, but rounds to the next supported scale factor. Do you think that interface is preferable? Since the goal is usually a particular thumbnail size rather than a fraction of the original size, I imagine most users of that interface would implement the same math I added here to compute the scale factor, except they'd either have to hard-code the supported scale factors, or we'd have to add another API to list them. If a library user would like to perform more of the scaling with an external resampling algorithm, they can always double the requested size or similar.
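
The scale-factor math mentioned above boils down to choosing the smallest numerator N (over the fixed denominator 8) whose scaled dimensions still cover the requested size. A standalone sketch of that selection, assuming the numerators this PR supports (1, 2, 4, 8); this is not the PR's actual code:

```rust
/// Pick the smallest supported scale factor N/8 whose scaled dimensions
/// still cover the requested size, falling back to full size (8/8) when the
/// request is larger than the original image. Standalone sketch of the
/// selection math, not the PR's implementation.
fn choose_numerator(width: u16, height: u16, req_w: u16, req_h: u16) -> u8 {
    for &n in &[1u8, 2, 4, 8] {
        // Ceiling division approximates the output dimensions at scale n/8.
        let scaled_w = (u32::from(width) * u32::from(n) + 7) / 8;
        let scaled_h = (u32::from(height) * u32::from(n) + 7) / 8;
        if scaled_w >= u32::from(req_w) && scaled_h >= u32::from(req_h) {
            return n;
        }
    }
    8 // request exceeds the original size: decode at full size
}

fn main() {
    // A 6000x4000 photo with a 160x160 thumbnail target rounds up to 1/8 scale.
    assert_eq!(choose_numerator(6000, 4000, 160, 160), 1);
    // Asking for more than half the original size forces a full-size decode.
    assert_eq!(choose_numerator(6000, 4000, 3500, 2200), 8);
}
```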

lovasoa commented 4 years ago

I also think the interface that takes the desired dimensions will be both easier to use and less likely to require a breaking change in the future.

fintelia commented 4 years ago

It's possible to add other scale factors N/8 by adding a NxN IDCT. libjpeg supports all values of N between 1 and 16.

This is the piece I was missing; your design is cleaner if a bunch more scale factors can be added.

One more question I have is about the "rounding mode". The current strategy is to always round up to the larger image size when picking between two scale factors. Other options would be rounding to the closest size, or rounding towards the original size (so decreasing the size rounds up, but increasing it rounds down). The current choice seems reasonable to me, but I want to double-check there isn't a reason we might later want to prefer a different strategy.

kevinmehall commented 4 years ago

I chose to round up for the use case you mentioned previously: you want the IDCT to scale to larger than the desired size, then follow it with a resample to the final size.

If we were to later add an N>8 IDCT for upscaling, I think the desired behavior would be to round towards the original image size: round up when downscaling, and round down when upscaling. Since this PR does not add upscaling support, the difference would only be a matter of documentation. (You could see "or the full size of the image if the requested size is larger" as the degenerate case of this: since there is no N>8 IDCT, a larger requested size always rounds down to 8/8 scaling.)
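
A small sketch of the "round towards the original size" rule described above; the upscaling arm is included only to illustrate how a hypothetical N > 8 factor would round down, since only N = 1, 2, 4, 8 exist in this PR:

```rust
/// "Round toward the original size": when downscaling, pick the smallest
/// factor that still covers the request; when upscaling, pick the largest
/// factor that does not exceed it. Sketch only; factors above 8/8 do not
/// exist in this PR, so the upscaling arm is purely illustrative.
fn round_toward_original(original: u32, requested: u32, numerators: &[u32]) -> u32 {
    if requested <= original {
        // Downscaling: round up.
        *numerators
            .iter()
            .find(|&&n| (original * n + 7) / 8 >= requested)
            .unwrap_or(&8)
    } else {
        // Upscaling: round down.
        *numerators
            .iter()
            .rev()
            .find(|&&n| (original * n + 7) / 8 <= requested)
            .unwrap_or(&8)
    }
}

fn main() {
    let ns = [1, 2, 4, 8]; // the numerators this PR actually supports
    assert_eq!(round_toward_original(4000, 400, &ns), 1); // 4000 * 1/8 = 500 >= 400
    assert_eq!(round_toward_original(4000, 9000, &ns), 8); // no N > 8, so full size
}
```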

kevinmehall commented 4 years ago

Updated so the API is decoder.scale(requested_width, requested_height) instead of the Decoder::scaled(requested_width, requested_height) constructor, with clearer documentation of how the scaling factor is selected.
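
A sketch of the updated flow, assuming the usual Decoder::new / read_info sequence around the new scale call; the return value shown (the dimensions the image will actually be decoded at) is an assumption based on this discussion rather than a signature quoted in the thread:

```rust
use std::fs::File;
use std::io::BufReader;

use jpeg_decoder::Decoder;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::open("large_photo.jpg")?;
    let mut decoder = Decoder::new(BufReader::new(file));

    // Read the headers first so the original dimensions are available before
    // deciding how much to scale (the modifier-style flow discussed above).
    decoder.read_info()?;
    let info = decoder.info().expect("read_info() fills in the metadata");
    println!("original: {}x{}", info.width, info.height);

    // Ask for at least 160x160; the decoder rounds up to the nearest supported
    // IDCT scale. Returning the chosen output dimensions is assumed here.
    let (w, h) = decoder.scale(160, 160)?;
    let pixels = decoder.decode()?;
    println!("decoded {}x{} ({} bytes)", w, h, pixels.len());
    Ok(())
}
```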

kevinmehall commented 4 years ago

@fintelia Is there anything else you'd like changed before merging? We're using this in production and have run about 100k photos through it so far.