lovell / sharp

High performance Node.js image processing, the fastest module to resize JPEG, PNG, WebP, AVIF and TIFF images. Uses the libvips library.
https://sharp.pixelplumbing.com
Apache License 2.0

Alpine much slower? #1224

Closed by asilvas 6 years ago

asilvas commented 6 years ago

Using identical hardware & configuration, I'm seeing a 3-4x perf drop when switching from centos to Alpine.

concurrency: 4
simd: true
cache: false
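For reference, the three settings above map onto sharp's global functions `sharp.concurrency()`, `sharp.simd()` and `sharp.cache()`. The following is only a sketch of how they might be applied: `applyGlobals` is a hypothetical helper name, and the guarded `require` is an assumption so that the snippet also runs where sharp is not installed.

```javascript
// Hypothetical helper (not part of sharp): apply the settings listed above
// to a sharp-like module and return what each call reports back.
function applyGlobals(sharpModule) {
  return {
    concurrency: sharpModule.concurrency(4), // libvips worker threads
    simd: sharpModule.simd(true),            // request SIMD vector paths
    cache: sharpModule.cache(false),         // disable the libvips operation cache
  };
}

// Assumption: sharp may or may not be installed where this runs.
let sharp = null;
try {
  sharp = require('sharp');
} catch (err) {
  sharp = null;
}

if (sharp) {
  console.log(applyGlobals(sharp));
}
```

Logging the returned values on both images is a quick way to check that the two environments really are configured the same way.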

vips 8.6.3-r0

Anyone else experience this slowness?

On a positive note, memory usage was drastically improved (by ~10x).

jcupitt commented 6 years ago

Sorry, I can't think of anything off hand that could cause that. We'd need to make a minimal test case that shows a difference.

Is it a general slowness, or is there some specific operation that's much slower?

lovell commented 6 years ago

"Using identical... configuration" "vips is being installed from the latest apk on Alpine"

Alpine packages are built with -Os by default (optimised for binary size) whereas Centos uses, I think, -O3 (optimised for performance). Compiler features such as auto-vectorisation are skipped with the former setting.

jcupitt commented 6 years ago

Good point Lovell, that could certainly make a difference.

You could also try enabling INFO output from libvips, it dumps some stuff about which path is being used internally.

john@kiwi:~/pics$ vips thumbnail wtc.jpg x.jpg 128 --vips-info
VIPS-INFO: 08:33:21.918: thumbnailing wtc.jpg
VIPS-INFO: 08:33:21.921: selected loader is VipsForeignLoadJpegFile
VIPS-INFO: 08:33:21.921: input size is 9372 x 9372
VIPS-INFO: 08:33:21.921: loading jpeg with factor 8 pre-shrink
VIPS-INFO: 08:33:21.922: converting to processing space srgb
VIPS-INFO: 08:33:21.922: shrinkv by 4
VIPS-INFO: 08:33:21.922: shrinkv sequential line cache
VIPS-INFO: 08:33:21.922: shrinkh by 4
VIPS-INFO: 08:33:21.922: residual reducev by 0.437233
VIPS-INFO: 08:33:21.922: reducev: 15 point mask
VIPS-INFO: 08:33:21.924: reducev: using vector path
VIPS-INFO: 08:33:21.924: reducev sequential line cache
VIPS-INFO: 08:33:21.924: residual reduceh by 0.437233
VIPS-INFO: 08:33:21.924: reduceh: 15 point mask

asilvas commented 6 years ago

Thanks guys, I'll get an isolated apples to apples and see where we stand.

asilvas commented 6 years ago

To avoid having to install a ton of packages to get vips-tools working on centos & alpine docker images, I created a simple gist to benchmark sharp directly: https://gist.github.com/asilvas/77fe6433eb335bfc3cbc578df7ae5027

It only requires sharp and async to be installed. Run it with: node sharp-bench.js
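The gist itself is not reproduced in this thread. As a rough illustration of the kind of timing wrapper such a benchmark needs, here is a minimal sketch; `timeStep` is a hypothetical name, not the gist's code, and the log line only imitates the "took Nms" format of the results below.

```javascript
// Hypothetical helper: time one async step and log it in the
// "<label> took <n>ms: <source>" format used by the results in this thread.
async function timeStep(label, source, step) {
  const start = process.hrtime.bigint();            // monotonic clock, nanoseconds
  const result = await step();
  const elapsedNs = process.hrtime.bigint() - start;
  const ms = Number(elapsedNs / 1000000n);          // whole milliseconds
  console.log(`${label} took ${ms}ms: ${source}`);
  return { label, source, ms, result };
}

// Example use, assuming sharp is installed and `buffer` holds an encoded image:
//   await timeStep('webp', 'input.jpg', () => sharp(buffer).webp().toBuffer());
```

Timing each step from an in-memory buffer, rather than from a URL, keeps network latency out of the measurement.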

Initial results (run locally on mac hardware, but consistent with what we saw in the prod rollout):

centos image:

resize took 79ms: https://isteam.wsimg.com/stock/588/:/fm=f:webp
resize took 30ms: https://isteam.wsimg.com/stock/588?download=true
resize took 129ms: https://isteam.wsimg.com/stock/588?useOriginal=true&download=true
jpeg took 90ms: https://isteam.wsimg.com/stock/588/:/fm=f:webp
jpeg took 71ms: https://isteam.wsimg.com/stock/588?download=true
jpeg took 127ms: https://isteam.wsimg.com/stock/588?useOriginal=true&download=true
png took 409ms: https://isteam.wsimg.com/stock/588/:/fm=f:webp
png took 380ms: https://isteam.wsimg.com/stock/588?download=true
png took 461ms: https://isteam.wsimg.com/stock/588?useOriginal=true&download=true
webp took 393ms: https://isteam.wsimg.com/stock/588/:/fm=f:webp
webp took 404ms: https://isteam.wsimg.com/stock/588?download=true
webp took 469ms: https://isteam.wsimg.com/stock/588?useOriginal=true&download=true

alpine image:

resize took 150ms: https://isteam.wsimg.com/stock/588/:/fm=f:webp
resize took 29ms: https://isteam.wsimg.com/stock/588?download=true
resize took 187ms: https://isteam.wsimg.com/stock/588?useOriginal=true&download=true
jpeg took 184ms: https://isteam.wsimg.com/stock/588/:/fm=f:webp
jpeg took 75ms: https://isteam.wsimg.com/stock/588?download=true
jpeg took 157ms: https://isteam.wsimg.com/stock/588?useOriginal=true&download=true
png took 486ms: https://isteam.wsimg.com/stock/588/:/fm=f:webp
png took 401ms: https://isteam.wsimg.com/stock/588?download=true
png took 491ms: https://isteam.wsimg.com/stock/588?useOriginal=true&download=true
webp took 927ms: https://isteam.wsimg.com/stock/588/:/fm=f:webp
webp took 802ms: https://isteam.wsimg.com/stock/588?download=true
webp took 892ms: https://isteam.wsimg.com/stock/588?useOriginal=true&download=true

This tells me two things so far:

  1. When decoding webp (the first image), alpine is much slower (~80ms slower) compared to centos
  2. When encoding webp (last 3 tests), it's taking roughly 2x the time to do nothing but decode from source and encode to webp.

I'm not seeing any other significant deltas on perf. Thoughts?
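To put the two result sets side by side, the timings can be summed per operation and divided. The numbers below are copied from the runs above; `slowdown` is just an illustrative helper name, and three samples per operation is too few to read as more than a rough indication.

```javascript
// Timings in ms, copied from the centos and alpine runs above (same order).
const centos = {
  resize: [79, 30, 129],
  jpeg: [90, 71, 127],
  png: [409, 380, 461],
  webp: [393, 404, 469],
};
const alpine = {
  resize: [150, 29, 187],
  jpeg: [184, 75, 157],
  png: [486, 401, 491],
  webp: [927, 802, 892],
};

const sum = (xs) => xs.reduce((a, b) => a + b, 0);

// Ratio of total time per operation (other / base), rounded to 2 decimals.
function slowdown(base, other) {
  const out = {};
  for (const op of Object.keys(base)) {
    out[op] = Number((sum(other[op]) / sum(base[op])).toFixed(2));
  }
  return out;
}

console.log(slowdown(centos, alpine));
```

On these figures the webp encode totals come out at roughly 2x, while png stays close to parity. The resize and jpeg totals are pulled up mainly by the first image, the one with the webp source.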

asilvas commented 6 years ago

Some additional data of interest... Sharp reports:

centos (sharp 0.20.2):

versions: 
   { cairo: '1.14.12',
     croco: '0.6.12',
     exif: '0.6.21',
     expat: '2.2.5',
     ffi: '3.2.1',
     fontconfig: '2.12.6',
     freetype: '2.9',
     gdkpixbuf: '2.36.11',
     gif: '5.1.4',
     glib: '2.55.1',
     gsf: '1.14.42',
     harfbuzz: '1.7.4',
     jpeg: '1.5.3',
     lcms: '2.9-',
     orc: '0.4.28',
     pango: '1.41.0',
     pixman: '0.34.0',
     png: '1.6.34',
     svg: '2.42.0',
     tiff: '4.0.9-cda4b06',
     vips: '8.6.1',
     webp: '0.6.1',
     xml: '2.9.7',
     zlib: '1.2.11' }

alpine (sharp 0.20.2):

versions: { vips: '8.6.3' }

Global sharp options are identical:

concurrency: 4
simd: false
cache: { memory: { current: 0, high: 0, max: 50 },
  files: { current: 0, max: 20 },
  items: { current: 0, max: 100 }
}

I'm doubtful the minor vips version is related, but what's interesting to me is how sharp is reporting versions.

jcupitt commented 6 years ago

I would try timing webp in isolation, for example:

$ time dwebp quagga.webp -ppm -o x.ppm
Decoded quagga.webp. Dimensions: 4120 x 2747 . Format: lossy. Now saving...
Saved file x.ppm

real    0m0.179s
user    0m0.146s
sys     0m0.029s

Use ppm output or you'll just be timing PNG save.

For resize, sharp will be spending the bulk of time in the image format decode / encode libraries, so compiler differences there would have a large effect, as Lovell said.

asilvas commented 6 years ago

centos ranges 90-120ms, while Alpine ranges 190-220ms just to decode a medium-sized image. Everything else I've seen with Alpine has performed comparably to, or within 20% of, centos, so this will probably be a deal breaker unless there is something that can be done to speed up webp. I take it we're at the mercy of Google on this one, unless we try to create our own builds?

jcupitt commented 6 years ago

webp is easy to build and has quite a lot of options, like some ASM stuff, so I think I would experiment with making my own alpine package. It should be possible to tune it up a bit.

webp is rather a young, immature library, so you can also expect dramatic speed changes between minor versions.

asilvas commented 6 years ago

Closing since this isn't a sharp issue it seems. Thanks