JuliaIO / JpegTurbo.jl

Julia interface to libjpeg-turbo
MIT License

benchmark results against other backends #15

Open johnnychen94 opened 2 years ago

johnnychen94 commented 2 years ago

JPEG backends comparison

Although JpegTurbo.jl also provides more advanced and efficient in-memory features, the benchmark only tests the filename-based version, because none of the other backends support in-memory encoding and decoding.

```julia
using JpegTurbo
using BenchmarkTools
using TestImages

img = testimage("cameraman");
filename = "tmp.jpg"

# The file-based and in-memory encoders produce identical bytes.
jpeg_encode(filename, img);
data = jpeg_encode(img);
@assert read(filename) == data;

@btime jpeg_encode($img); # 855.914 μs (7 allocations: 306.49 KiB)
@btime jpeg_encode(filename, $img); # 1.064 ms (20 allocations: 307.19 KiB)

# Decoding from a file and from the in-memory bytes gives the same image.
@assert jpeg_decode(filename) == jpeg_decode(data)
@btime jpeg_decode($data); # 795.819 μs (18 allocations: 514.66 KiB)
@btime jpeg_decode(filename); # 836.992 μs (45 allocations: 630.62 KiB)
```

Generally speaking, for the filename version, JpegTurbo.jl and OpenCV (Python) are the fastest backends, since both are backed by libjpeg-turbo.

v0.1.0

```
Julia versioninfo:
Julia Version 1.8.0-DEV.1434
Commit 4abf26eec8 (2022-01-30 20:04 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.0 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 8

JpegTurbo.jl versioninfo:
JpegTurbo.jl version: 0.1.0
libjpeg version: 62
libjpeg-turbo version: 2.1.0
bit mode: 8
SIMD: enabled

OpenCV version: 4.5.5
OpenCV libjpeg-turbo version: 2.1.2-62

Scikit-image version: 0.19.1
```

## moonsurface Gray{N0f8} (256, 256)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 0.44 | 0.33 | 22.66 | 39.0516 |
| ImageMagick.jl | 1.19 | 1.47 | 22.24 | 39.0533 |
| QuartzImageIO.jl | 1.19 | 0.69 | 25.01 | 39.5761 |
| OpenCV (Python) | 0.63 | 1.26 | 29.47 | 42.3855 |
| Scikit-image | 1.02 | 2.06 | 10.83 | 34.0346 |

## cameraman Gray{N0f8} (512, 512)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 1.20 | 0.83 | 50.19 | 47.1206 |
| ImageMagick.jl | 3.33 | 4.57 | 49.30 | 47.1297 |
| QuartzImageIO.jl | 2.32 | 1.23 | 50.82 | 47.0878 |
| OpenCV (Python) | 1.39 | 3.33 | 65.63 | 49.2061 |
| Scikit-image | 1.86 | 5.07 | 27.55 | 41.9095 |

## pirate Gray{N0f8} (512, 512)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 1.31 | 1.05 | 79.84 | 40.9099 |
| ImageMagick.jl | 3.81 | 5.03 | 78.51 | 40.9127 |
| QuartzImageIO.jl | 2.94 | 1.49 | 81.77 | 41.3253 |
| OpenCV (Python) | 1.56 | 3.80 | 104.68 | 43.5602 |
| Scikit-image | 2.16 | 6.04 | 42.09 | 35.6561 |

## house Gray{N0f8} (512, 512)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 1.09 | 0.69 | 35.70 | 50.0640 |
| ImageMagick.jl | 3.07 | 4.59 | 35.16 | 50.0741 |
| QuartzImageIO.jl | 1.98 | 1.10 | 36.61 | 49.6511 |
| OpenCV (Python) | 0.87 | 2.60 | 46.67 | 51.8188 |
| Scikit-image | 1.37 | 4.76 | 20.61 | 45.5563 |

## rand Gray{Float64} (512, 512)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 2.27 | 1.79 | 215.91 | 38.3134 |
| ImageMagick.jl | 4.31 | 6.15 | 189.28 | 38.3145 |
| QuartzImageIO.jl | 4.48 | 2.43 | 218.95 | 39.1134 |
| OpenCV (Python) | 2.73 | 4.83 | 257.61 | 42.3367 |
| Scikit-image | 3.35 | 9.21 | 142.45 | 28.5258 |

## rand Gray{Float64} (4096, 4096)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 334.87 | 220.86 | 13795.17 | 38.3061 |
| ImageMagick.jl | 403.20 | 603.74 | 12104.39 | 38.3053 |
| QuartzImageIO.jl | 329.46 | 273.10 | 13851.35 | 39.1145 |
| OpenCV (Python) | 158.54 | 235.52 | 16464.69 | 42.3227 |
| Scikit-image | 209.01 | 507.61 | 9091.49 | 28.5394 |

## fabio RGB{N0f8} (512, 512)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 1.59 | 3.96 | 55.91 | 42.7003 |
| ImageMagick.jl | 6.14 | 6.02 | 72.76 | 45.5593 |
| QuartzImageIO.jl | 4.50 | 6.73 | 55.38 | 42.0154 |
| OpenCV (Python) | 2.65 | 4.90 | 72.57 | 44.0539 |
| Scikit-image | 4.48 | 13.57 | 31.68 | 37.8229 |

## barbara RGB{N0f8} (576, 720)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 2.48 | 6.85 | 140.21 | 36.1151 |
| ImageMagick.jl | 10.56 | 10.19 | 179.70 | 38.1910 |
| QuartzImageIO.jl | 7.65 | 10.58 | 139.84 | 36.0860 |
| OpenCV (Python) | 5.11 | 8.69 | 185.88 | 37.2003 |
| Scikit-image | 6.87 | 20.66 | 74.79 | 32.7438 |

## mandril RGB{N0f8} (512, 512)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 2.04 | 4.68 | 149.28 | 27.7261 |
| ImageMagick.jl | 8.95 | 7.54 | 241.40 | 32.2466 |
| QuartzImageIO.jl | 6.15 | 7.41 | 150.35 | 27.7677 |
| OpenCV (Python) | 5.59 | 7.90 | 190.93 | 28.2401 |
| Scikit-image | 5.08 | 16.98 | 76.89 | 25.6526 |

## coffee RGB{N0f8} (400, 600)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 1.53 | 3.83 | 78.10 | 36.1796 |
| ImageMagick.jl | 6.23 | 5.68 | 100.48 | 38.2192 |
| QuartzImageIO.jl | 4.47 | 5.95 | 78.63 | 36.1604 |
| OpenCV (Python) | 2.99 | 5.75 | 102.26 | 37.4588 |
| Scikit-image | 5.01 | 13.31 | 40.71 | 32.2566 |

## lighthouse RGB{N0f8} (512, 768)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 2.57 | 6.26 | 125.39 | 38.6723 |
| ImageMagick.jl | 9.51 | 9.00 | 147.12 | 39.6910 |
| QuartzImageIO.jl | 7.43 | 10.17 | 125.60 | 38.8406 |
| OpenCV (Python) | 4.46 | 8.26 | 165.09 | 40.4860 |
| Scikit-image | 7.82 | 21.04 | 63.88 | 33.8235 |

## earth_apollo RGB{N0f8} (3002, 3000)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 69.68 | 162.96 | 1779.18 | 39.5714 |
| ImageMagick.jl | 218.13 | 221.63 | 2463.45 | 42.1515 |
| QuartzImageIO.jl | 169.27 | 340.41 | 1734.75 | 39.5452 |
| OpenCV (Python) | 89.01 | 142.87 | 2428.32 | 40.6026 |
| Scikit-image | 144.32 | 349.93 | 906.01 | 37.6173 |

## rand RGB{Float64} (512, 512)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 5.31 | 5.36 | 248.71 | 12.7247 |
| ImageMagick.jl | 10.62 | 9.05 | 446.51 | 31.8190 |
| QuartzImageIO.jl | 8.03 | 8.06 | 249.97 | 12.7733 |
| OpenCV (Python) | 4.58 | 7.48 | 300.47 | 12.7396 |
| Scikit-image | 6.77 | 20.91 | 154.07 | 12.6310 |

## rand RGB{Float64} (4096, 4096)

| Backend | encode time (ms) | decode time (ms) | encoded size (KB) | PSNR (dB) |
| ------- | --------------- | --------------- | ---------------- | -------- |
| JpegTurbo.jl | 863.61 | 465.04 | 15877.02 | 12.7284 |
| ImageMagick.jl | 888.20 | 872.52 | 28547.04 | 31.8186 |
| QuartzImageIO.jl | 613.77 | 711.31 | 15949.90 | 12.8192 |
| OpenCV (Python) | 279.60 | 376.26 | 19189.29 | 12.7814 |
| Scikit-image | 412.91 | 966.76 | 9828.86 | 12.6804 |

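
A note on the PSNR column: the comment does not show how it was computed, but PSNR is usually measured between the original image and the result of an encode/decode round trip. Below is a minimal sketch of that standard definition in plain Julia. The `psnr` helper and the assumption that intensities are scaled to [0, 1] are mine and are not taken from the benchmark script.

```julia
# Peak signal-to-noise ratio in dB between a reference image and a degraded copy.
# Both arrays hold plain numeric intensities in the range [0, peak].
function psnr(ref::AbstractArray{<:Real}, x::AbstractArray{<:Real}; peak::Real=1.0)
    size(ref) == size(x) || throw(DimensionMismatch("images must have the same size"))
    # Mean squared error over all samples.
    mse = sum(abs2, Float64.(ref) .- Float64.(x)) / length(ref)
    # Identical images give mse == 0 and therefore Inf dB.
    return 10 * log10(peak^2 / mse)
end

# Toy usage: a flat image with a uniform error of 0.1 has an MSE of 0.01.
ref = zeros(4, 4)
noisy = fill(0.1, 4, 4)
println(psnr(ref, noisy))
```

With real images, the colorant array would first need converting to plain numbers (for example with `channelview` from ImageCore) before calling a helper like this.
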
lorenzoh commented 2 years ago

Here are some throughput-oriented benchmarks comparing JpegTurbo.jl and ImageMagick.jl.

Run on Julia 1.6.2, since ImageMagick.jl performs much worse on 1.7.

To benchmark, I use the first 1024 images of the ImageNette dataset. The dataset comes in 3 sizes: (1) the original image size, and each image cropped so that the shortest side is at least (2) 160px or (3) 320px.

The data containers were created as follows:

```julia
# MLUtils is loaded up front because it is used right away below.
using FastAI, MLUtils
using ImageMagick, JpegTurbo

# Load the file paths of the first N images using FastAI.
N = 1024
imagefiles = MLUtils.datasubset(MLUtils.filterobs(isimagefile, FileDataset(datasetpath("imagenette2"))), 1:N)
imagefiles_160 = MLUtils.datasubset(MLUtils.filterobs(isimagefile, FileDataset(datasetpath("imagenette2-160"))), 1:N)

# Lazily decode each file with the respective backend.
data_magick = MLUtils.mapobs(ImageMagick.load, imagefiles)
data_turbo = MLUtils.mapobs(JpegTurbo.jpeg_decode, imagefiles)
```

I then timed iterating over each image using MLUtils.eachobs (single-threaded). I report the minimum time and memory consumption.

| Provider | Dataset | Time (sec) | Memory consumption (MiB) | Notes |
| -------- | ------- | ---------- | ------------------------ | ----- |
| ImageMagick | ImageNette (full) | 8.5 | 2100 | |
| JpegTurbo | ImageNette (full) | 6.6 | 1518 | |
| ImageMagick | ImageNette (160px) | 1.22 | 308 | |
| JpegTurbo | ImageNette (160px) | 0.87 | 217 | |
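
The measurement code is not shown in the comment. One way to get a "minimum time and memory" figure using only Base is sketched below. The `min_time_and_alloc` helper is hypothetical, and the workload is a stand-in; in the setup above it would be something like iterating over `eachobs(data_turbo)`.

```julia
# Run `f` several times and report the smallest wall time (seconds)
# and the smallest allocation count (MiB) observed.
function min_time_and_alloc(f; repeats::Integer=3)
    f()  # warm-up run so that compilation is not measured
    best_time, best_bytes = Inf, typemax(Int)
    for _ in 1:repeats
        stats = @timed f()  # NamedTuple with :time and :bytes (Julia 1.5+)
        best_time = min(best_time, stats.time)
        best_bytes = min(best_bytes, stats.bytes)
    end
    return best_time, best_bytes / 2^20
end

# Stand-in workload; replace with the actual loading loop.
seconds, mib = min_time_and_alloc(() -> sum(rand(1000)))
println("min time: ", seconds, " s, min allocations: ", mib, " MiB")
```

Note that `@timed` reports allocated bytes, not peak resident memory, so this is only an approximation of "memory consumption".
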

I then ran the same benchmarks using MLUtils.eachobsparallel (from here) (multi-threaded, with -t 12 on 12 physical cores).

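| Provider | Dataset | Time (sec) | Memory consumption (MiB) | Notes |
| -------- | ------- | ---------- | ------------------------ | ----- |
| ImageMagick | ImageNette (full) | 1.33 | 2100 | |
| JpegTurbo | ImageNette (full) | 0.97 | 1518 | |
| ImageMagick | ImageNette (160px) | 0.182 | 308 | |
| JpegTurbo | ImageNette (160px) | 0.12 | 217 | |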
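
For readers who have not used threaded loading before, the general idea can be sketched with Base threads alone. This is not how `eachobsparallel` is implemented; `load_all_threaded` is a hypothetical helper, and `load` would be something like `JpegTurbo.jpeg_decode`. Julia must be started with several threads (e.g. `-t 12`) for it to run in parallel.

```julia
using Base.Threads

# Apply `load` to every path, spreading the work across the available threads.
# Results are stored by index, so the output order matches the input order.
function load_all_threaded(load, paths)
    results = Vector{Any}(undef, length(paths))
    @threads for i in eachindex(paths)
        results[i] = load(paths[i])
    end
    return results
end

# Toy usage with a cheap stand-in for an image decoder.
println(load_all_threaded(length, ["a", "bb", "ccc"]))
println("threads available: ", nthreads())
```
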

We can see that JpegTurbo consistently beats ImageMagick, not just in runtime (about 22–34% less time in the runs above) but also in memory consumption (about 28–30% less).
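
To put numbers on that comparison, the relative savings can be recomputed from the two tables. This is just arithmetic on the reported figures; the `reduction` helper is my own naming.

```julia
# Percentage reduction when going from `baseline` to `candidate`.
reduction(baseline, candidate) = round(100 * (1 - candidate / baseline); digits=1)

# (ImageMagick, JpegTurbo) pairs copied from the tables above.
time_pairs = [
    "full, single-threaded"  => (8.5, 6.6),
    "160px, single-threaded" => (1.22, 0.87),
    "full, multi-threaded"   => (1.33, 0.97),
    "160px, multi-threaded"  => (0.182, 0.12),
]
memory_pairs = [
    "full"  => (2100, 1518),
    "160px" => (308, 217),
]

for (label, (magick, turbo)) in time_pairs
    println(rpad(label, 24), reduction(magick, turbo), "% less time")
end
for (label, (magick, turbo)) in memory_pairs
    println(rpad(label, 24), reduction(magick, turbo), "% less memory")
end
```
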


While JpegTurbo offers a speedup in itself, we can see that loading smaller images (i.e. the 160px versions) makes a bigger difference. Images are often stored at much larger sizes than needed, so storing a downsized version of the dataset improves loading speed. JpegTurbo, however, can also load an image at a reduced size directly, using the preferred_size or scale_ratio keyword arguments.

Here I benchmark loading in the full-size files with preferred_size = (160, 160) and compare it to loading in the downscaled files.

| Dataset | Time (sec) | Memory consumption (MiB) | Notes |
| ------- | ---------- | ------------------------ | ----- |
| ImageNette (full) | 6.6 | 1518 | no kwargs |
| ImageNette (full) | 2.2 | 393 | preferred_size = (160, 160) |
| ImageNette (160px) | 0.87 | 308 | no kwargs |
| ImageNette (full) | 5.0 | 1000 | preferred_size = (320, 320) |
| ImageNette (320px) | 3.35 | 850 | no kwargs |

As we can see, on-the-fly downsizing does not quite match the performance of presizing, but compares very favorably with loading the full images. This should be even more pronounced for datasets where the source images are larger.
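
For reference, a minimal sketch of how the `preferred_size` and `scale_ratio` keywords mentioned above can be passed to `jpeg_decode`. The random test image and temporary file are stand-ins for a real photo, and ImageCore is assumed to be installed for the `Gray{N0f8}` type.

```julia
using JpegTurbo, ImageCore

# Stand-in for a real photo: a random 512×512 grayscale image in a temporary file.
path = joinpath(mktempdir(), "example.jpg")
jpeg_encode(path, rand(Gray{N0f8}, 512, 512))

# Decode at full resolution.
full = jpeg_decode(path)

# Ask for the smallest decoder scale whose output is still at least 160×160.
small = jpeg_decode(path; preferred_size=(160, 160))

# Or request a scale factor directly.
half = jpeg_decode(path; scale_ratio=0.5)

@show size(full) size(small) size(half)
```

The downscaled results are generally not exactly the requested size, so a final resize step may still be needed if exact dimensions matter.
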

johnnychen94 commented 2 years ago

Loading the 1.2G pixel image https://www.flickr.com/photos/trevor_dobson_inefekt69/29314390837 on my MacBook Pro

[^1]: need to override the default limit before loading cv2: os.environ["OPENCV_IO_MAX_IMAGE_PIXELS"] = pow(2, 40).__str__()

johnnychen94 commented 11 months ago

v0.1.4 (#30) reduces the loading time of the above 1.2G pixel image from 19.2s to 9.2s on my Mac mini (M2 Pro).