JuliaIO / JpegTurbo.jl

Julia interface to libjpeg-turbo
MIT License

Reading image sizes lazily #17

Closed lorenzoh closed 2 years ago

lorenzoh commented 2 years ago

Just read that scale_ratio can be passed when decoding, which sounds awesome! Is it also somehow possible (maybe even with an external package) to read the size (h, w) of an image without loading it first?
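For reference, a JPEG stores its dimensions in the start-of-frame (SOF) segment of the header, so in principle they can be read without decoding any pixel data. The following is a minimal sketch in Python, not part of JpegTurbo.jl; `jpeg_size` is a hypothetical helper name, and the sketch assumes a well-formed stream whose height is recorded in the SOF segment.

```python
import struct

# SOF0..SOF15 carry the frame size; 0xC4 (DHT), 0xC8 (JPG) and 0xCC (DAC)
# share the 0xC0-0xCF range but are not frame headers.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_size(data: bytes):
    """Return (height, width) from the first SOF segment, without decoding pixels."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:
            # Fill byte before a marker: skip it.
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD8:
            # Standalone markers (TEM, RSTn, SOI) have no length field.
            pos += 2
            continue
        if marker in (0xD9, 0xDA):
            # EOI or start of scan data: no frame header will follow.
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            # Segment layout: length (2), precision (1), height (2), width (2).
            if pos + 9 > len(data):
                raise ValueError("truncated SOF segment")
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return height, width
        # Skip any other segment (APPn, DQT, DHT, ...) by its declared length.
        pos += 2 + length
    raise ValueError("no SOF segment found")
```

Because APPn segments are skipped by their declared length, a thumbnail embedded in EXIF data is not mistaken for the main image. Only the header segments need to be read from disk, for example `jpeg_size(open(path, "rb").read(65536))`, retrying with more bytes if the SOF segment is not found.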

My use case is a high-throughput scenario, loading images for a deep learning pipeline. Often many of these images are saved in a much larger, but variable, resolution and are scaled and cropped to the same size (e.g. (256, 256)) before being batched and run through the model. It would be super useful to know the image size before decoding the image, so that I can select the smallest possible scale_ratio such that we still have (h > 256, w > 256). I still need to benchmark how this compares in load speed to just loading each image, downscaling it and saving it once. Even if it only offers a small load speed improvement, avoiding the extra resizing preprocessing step could also help reduce quality degradation, though.

johnnychen94 commented 2 years ago

Absolutely, we need an interface to look at the file properties (image size, colorspace, EXIF data, etc.). I just need more free time to investigate this, as I know little about the EXIF standard. I'm not sure yet whether this should be provided in this package or in another package, e.g., the not-yet-existing EXIFViewer.jl.

But for this package, we can definitely have something like a keyword preferred_size = (256, 256), mutually exclusive with scale_ratio, that makes jpeg_decode generate the smallest output whose size is still larger than (256, 256).
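The selection logic behind such a keyword could look roughly like the sketch below. It is written in Python for illustration and is not this package's implementation. It assumes that the decoder accepts only ratios of the form M/8 (as libjpeg-turbo's scaled decoding does), that scaled dimensions are rounded up, and that only downscaling (M from 1 to 8) is wanted. `smallest_scale_ratio` is a hypothetical name, and `>=` is used where the thread says "larger than".

```python
from fractions import Fraction


def smallest_scale_ratio(size, preferred_size, max_num=8):
    """Smallest ratio M/8 whose scaled output still covers preferred_size.

    size and preferred_size are (height, width) tuples. If even the largest
    allowed ratio is too small, that largest ratio is returned.
    """
    for m in range(1, max_num + 1):
        # Assumed rounding rule: ceil(dim * m / 8).
        scaled = tuple((dim * m + 7) // 8 for dim in size)
        if all(s >= p for s, p in zip(scaled, preferred_size)):
            return Fraction(m, 8)
    return Fraction(max_num, 8)


# A 1080x1920 image can be decoded at 1/4 scale (270x480) and still cover 256x256.
ratio = smallest_scale_ratio((1080, 1920), (256, 256))
```

Switching to a strict comparison (`s > p`) is a one-character change if the output must be strictly larger than the preferred size.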

johnnychen94 commented 2 years ago

> the not-yet-existing EXIFViewer.jl.

Maybe just another Julia wrapper for https://github.com/libexif/libexif

lorenzoh commented 2 years ago

Right, a preferred_size keyword would definitely do for my use case 👍