lovell / sharp

High performance Node.js image processing, the fastest module to resize JPEG, PNG, WebP, AVIF and TIFF images. Uses the libvips library.
https://sharp.pixelplumbing.com
Apache License 2.0

Converting to AVIF takes 100% of all cores, and 2.5GB+ RAM #2597

Closed (jcontonio closed this 3 years ago)

jcontonio commented 3 years ago

Are you using the latest version? Is the version currently in use as reported by npm ls sharp the same as the latest version as reported by npm view sharp dist-tags.latest?

Yes - 0.27.2

What are the steps to reproduce?

Converting a 4000px-wide JPEG to AVIF causes CPU usage to spike to 400% and takes 2.5GB of RAM, whereas a WebP conversion of the same image takes 20% CPU and 200MB of RAM.

What is the expected behaviour?

The conversion should use less RAM and CPU.

Are you able to provide a minimal, standalone code sample, without other dependencies, that demonstrates this problem?

const path = require('path')
const sharp = require('sharp')

// inputDirectory, outputDirectory, file and halfWidth are assumed to be defined by the caller.
// The awaits below need to run inside an async function (or an ES module).
const avifOptions = {
  quality: 50,
  lossless: false,
  speed: 8, // default is 5
  chromaSubsampling: '4:2:0',
}

// AVIF original
// -------------------------------------------------------------------------
await sharp(path.join(inputDirectory, file))
  .toFormat('avif')
  .avif(avifOptions)
  .toFile(path.join(outputDirectory, `test.avif`))

// AVIF half
// -------------------------------------------------------------------------
await sharp(path.join(inputDirectory, file))
  .resize({ width: halfWidth })
  .toFormat('avif')
  .avif(avifOptions)
  .toFile(path.join(outputDirectory, `test@0.5x.avif`)) // was .webp, a copy-paste slip

Are you able to provide a sample image that helps explain the problem?

[Image: DSC_1147-4000_s]

What is the output of running npx envinfo --binaries --system?

macOS:

  System:
    OS: macOS 11.2
    CPU: (8) x64 Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
    Memory: 1.70 GB / 40.00 GB
    Shell: 5.8 - /bin/zsh
  Binaries:
    Node: 14.15.4 - ~/.nvm/versions/node/v14.15.4/bin/node
    npm: 6.14.10 - ~/.nvm/versions/node/v14.15.4/bin/npm

Debian:

  System:
    OS: Linux 5.4 Debian GNU/Linux 9 (stretch) 9 (stretch)
    CPU: (8) x64 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
    Memory: 270.71 MB / 1.94 GB
    Container: Yes
    Shell: 4.4.12 - /bin/bash
  Binaries:
    Node: 12.18.2 - /usr/local/bin/node
    Yarn: 1.22.4 - /usr/local/bin/yarn
    npm: 6.14.5 - /usr/local/bin/npm

lovell commented 3 years ago

Yes, this is expected as AVIF compression is relatively slow and memory hungry compared with most other formats.

Using the following code sample and your test image:

await sharp("109366574-9f44d980-7848-11eb-82f5-fffa19163db1.jpg")
  .avif({ speed: 8 })
  .toFile("out.avif")

...when run locally on Linux (x64 CPU with AVX2) via callgrind using locally compiled and installed libaom, libheif and libvips with symbols intact, I see the top 10 "hottest" functions are:

74,895,067,224  PROGRAM TOTALS

9,771,380,136  /build/glibc-ZN95T4/glibc-2.31/string/../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:__memset_avx2_unaligned_erms [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
9,227,999,443  ???:av1_optimize_txb [/usr/local/lib/libaom.so.2.0.0]
4,863,071,423  ???:prune_intra_mode_with_hog [/usr/local/lib/libaom.so.2.0.0]
4,185,373,503  ???:search_tx_type [/usr/local/lib/libaom.so.2.0.0]
3,096,177,196  ???:av1_dr_prediction_z2_avx2 [/usr/local/lib/libaom.so.2.0.0]
1,993,966,420  ???:av1_predict_intra_block [/usr/local/lib/libaom.so.2.0.0]
1,940,325,014  ???:build_intra_predictors [/usr/local/lib/libaom.so.2.0.0]
1,257,895,707  ???:block_rd_txfm [/usr/local/lib/libaom.so.2.0.0]
1,026,892,929  ???:av1_cost_coeffs_txb [/usr/local/lib/libaom.so.2.0.0]
  866,556,342  ???:lowbd_inv_txfm2d_add_no_identity_avx2 [/usr/local/lib/libaom.so.2.0.0]

These are all from within libaom or invoked by it.

By setting speed to 8, you're telling libaom to favour performance over all else. It's possible you might find that slowing it down also reduces memory consumption for some tasks.
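One way to check whether a slower speed setting really changes memory use is to wrap each conversion in a small measurement helper. The sketch below uses only Node.js built-ins; `measure` and `busyLoop` are hypothetical names, not part of sharp, and the busy loop merely stands in for a real conversion so the snippet runs without sharp installed.

```javascript
const { performance } = require('perf_hooks')

// Hypothetical helper: runs an async task and reports wall-clock time,
// CPU time (summed over all threads in the process) and the peak RSS so far.
async function measure (label, task) {
  const cpuBefore = process.cpuUsage()
  const timeBefore = performance.now()
  const result = await task()
  const wallMs = performance.now() - timeBefore
  const cpu = process.cpuUsage(cpuBefore) // microseconds used since cpuBefore
  return {
    label,
    wallMs,
    cpuMs: (cpu.user + cpu.system) / 1000,
    // maxRSS is reported in kilobytes and is a process-lifetime peak, not per task
    maxRssMB: process.resourceUsage().maxRSS / 1024,
    result
  }
}

// Stand-in workload (an assumption for illustration only).
async function busyLoop () {
  let sum = 0
  for (let i = 0; i < 1e6; i++) sum += i
  return sum
}

measure('busy loop', busyLoop).then((m) => console.log(m))
```

To compare settings, pass a conversion such as `() => sharp(input).avif({ speed: 8 }).toFile('out.avif')` as the task. A `cpuMs` several times larger than `wallMs` corresponds to the multi-core usage reported above. Because `maxRSS` is a peak for the whole process, each setting is best measured in a fresh process.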

The prebuilt binaries provided by sharp use libaom as it is the reference de/encoder with good cross-platform support. There is a plan to replace it with the slightly faster dav1d (decode) and rav1e (encode) combination, but I'm unsure of their relative memory consumption.

jcontonio commented 3 years ago

It seems any speed setting drives my CPU up to 400% and RAM to 2+GB. I think we'll probably ditch AVIF support in our app until that space matures a little more. Thanks for the detailed reply!

lovell commented 3 years ago

No worries, please subscribe to #2604 for updates about the migration to dav1d+rav1e.

tom10271 commented 1 year ago

Sharing some results from testing different parameters when converting images to AVIF in a Lambda function and storing them in S3. We ran each configuration 10 times; secSpent is the average time in seconds. In case anyone doesn't know: the more memory a Lambda function has, the more CPU it is allocated.

The source image is a 300KB banner: mostly a gradient background with text on it. Effort 0 and 1 are both fast, 2 is already much slower, and from 3 upwards each extra effort level adds roughly one more second.

Going from effort 1 to 2 reduces the size by a further 25% or so, which is amazing; going from 1 to 9 reduces it by 45%, from 20KB down to 11KB. If effort 9 ran fast this would be great, as our company pays a huge price for CDN traffic.
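The percentages can be checked against the sizeKB values reported in this comment. A quick sketch of the arithmetic (the helper name `percentReduction` is an assumption for illustration, not part of sharp):

```javascript
// Hypothetical helper: percentage by which `after` is smaller than `before`.
function percentReduction (before, after) {
  return (1 - after / before) * 100
}

// sizeKB values reported for the banner image at effort 0, 1 and 2.
const bannerSizeKB = { 0: 32.4169921875, 1: 20.07421875, 2: 14.7392578125 }

console.log(percentReduction(bannerSizeKB[0], bannerSizeKB[1]).toFixed(1)) // effort 0 -> 1: 38.1
console.log(percentReduction(bannerSizeKB[1], bannerSizeKB[2]).toFixed(1)) // effort 1 -> 2: 26.6
console.log(percentReduction(20, 11).toFixed(1)) // rounded 20KB -> 11KB: 45.0
```

So the exact step from effort 1 to 2 is a little over the quoted 25%, and the 45% figure matches the rounded 20KB and 11KB sizes.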

Lambda RAM settings: 1024MB
{
  queryParameters: { noCache: 1, format: 'avif', effort: 0 },
  secSpent: 1.1503999999999999,
  sizeKB: 32.4169921875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 1 },
  secSpent: 1.2476,
  sizeKB: 20.07421875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 2 },
  secSpent: 5.9789,
  sizeKB: 14.7392578125
}

Lambda RAM settings: 2048MB
{
  queryParameters: { noCache: 1, format: 'avif', effort: 0 },
  secSpent: 1.0310000000000001,
  sizeKB: 32.4169921875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 1 },
  secSpent: 1.0988,
  sizeKB: 20.07421875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 2 },
  secSpent: 3.5351,
  sizeKB: 14.7392578125
}

Lambda RAM settings: 4096MB
{
  queryParameters: { noCache: 1, format: 'avif', effort: 0 },
  secSpent: 1.0706,
  sizeKB: 32.4169921875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 1 },
  secSpent: 0.9771000000000001,
  sizeKB: 20.07421875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 2 },
  secSpent: 3.2674,
  sizeKB: 14.7392578125
}

Lambda RAM settings: 10240MB
{
  queryParameters: { noCache: 1, format: 'avif', effort: 0 },
  secSpent: 1.0502,
  sizeKB: 32.4169921875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 1 },
  secSpent: 1.0059,
  sizeKB: 20.07421875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 2 },
  secSpent: 3.3095,
  sizeKB: 14.7392578125
}
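To see how much the extra Lambda memory actually buys, the effort 2 timings above can be turned into speed-ups relative to the 1024MB setting. This is only a sketch over the averages reported here for one image; `speedups` is a hypothetical helper name.

```javascript
// Average secSpent for effort 2 at each Lambda memory setting, as reported above.
const effort2Timings = [
  { memoryMB: 1024, secSpent: 5.9789 },
  { memoryMB: 2048, secSpent: 3.5351 },
  { memoryMB: 4096, secSpent: 3.2674 },
  { memoryMB: 10240, secSpent: 3.3095 }
]

// Hypothetical helper: speed-up of each row relative to the first row.
function speedups (rows) {
  const baseline = rows[0].secSpent
  return rows.map(({ memoryMB, secSpent }) => ({
    memoryMB,
    speedup: Number((baseline / secSpent).toFixed(2))
  }))
}

console.table(speedups(effort2Timings))
// speedup column: 1, 1.69, 1.83, 1.81
```

For this image, almost all of the gain comes from the step to 2048MB; 4096MB and 10240MB are within noise of each other, and the file sizes are identical at every setting.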

However, for a photographic image, raising the effort leaves the file size more or less the same.

{
  queryParameters: { noCache: 1, format: 'avif', effort: 0, w: 720 },
  secSpent: 1.461,
  sizeKB: 53.400390625
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 1, w: 720 },
  secSpent: 1.021,
  sizeKB: 52.974609375
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 2, w: 720 },
  secSpent: 1.199,
  sizeKB: 52.91015625
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 3, w: 720 },
  secSpent: 1.597,
  sizeKB: 54.6259765625
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 4, w: 720 },
  secSpent: 2.994,
  sizeKB: 54.435546875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 5, w: 720 },
  secSpent: 4.378,
  sizeKB: 54.2451171875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 6, w: 720 },
  secSpent: 6.452,
  sizeKB: 53.8046875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 7, w: 720 },
  secSpent: 10.169,
  sizeKB: 54.201171875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 8, w: 720 },
  secSpent: 12.263,
  sizeKB: 54.091796875
}
{
  queryParameters: { noCache: 1, format: 'avif', effort: 9, w: 720 },
  secSpent: 23.532,
  sizeKB: 53.9697265625
}
{
  queryParameters: { noCache: 1, format: 'webp', effort: 0, q: 60, w: 720 },
  secSpent: 1.033,
  sizeKB: 74.51953125
}
{
  queryParameters: { noCache: 1, format: 'webp', effort: 1, q: 60, w: 720 },
  secSpent: 1.084,
  sizeKB: 72.654296875
}
{
  queryParameters: { noCache: 1, format: 'webp', effort: 2, q: 60, w: 720 },
  secSpent: 1.02,
  sizeKB: 63.33984375
}
{
  queryParameters: { noCache: 1, format: 'webp', effort: 3, q: 60, w: 720 },
  secSpent: 0.847,
  sizeKB: 60.62109375
}
{
  queryParameters: { noCache: 1, format: 'webp', effort: 4, q: 60, w: 720 },
  secSpent: 1.12,
  sizeKB: 60.708984375
}
{
  queryParameters: { noCache: 1, format: 'webp', effort: 5, q: 60, w: 720 },
  secSpent: 0.752,
  sizeKB: 58.93359375
}
{
  queryParameters: { noCache: 1, format: 'webp', effort: 6, q: 60, w: 720 },
  secSpent: 1.175,
  sizeKB: 56.84375
}
{
  queryParameters: { noCache: 1, format: 'JPEG', q: 80, w: 720 },
  secSpent: 0.988,
  sizeKB: 108.4150390625
}
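
Given results like these, one possible way to choose an effort level is to take the fastest setting whose output is within some tolerance of the smallest file. The sketch below applies that rule to the AVIF numbers for the photographic image; `cheapestWithinTolerance` and the 1% tolerance are assumptions for illustration, not anything sharp provides.

```javascript
// AVIF results for the photographic image (w: 720), copied from the list above.
const avifPhotoResults = [
  { effort: 0, secSpent: 1.461, sizeKB: 53.400390625 },
  { effort: 1, secSpent: 1.021, sizeKB: 52.974609375 },
  { effort: 2, secSpent: 1.199, sizeKB: 52.91015625 },
  { effort: 3, secSpent: 1.597, sizeKB: 54.6259765625 },
  { effort: 4, secSpent: 2.994, sizeKB: 54.435546875 },
  { effort: 5, secSpent: 4.378, sizeKB: 54.2451171875 },
  { effort: 6, secSpent: 6.452, sizeKB: 53.8046875 },
  { effort: 7, secSpent: 10.169, sizeKB: 54.201171875 },
  { effort: 8, secSpent: 12.263, sizeKB: 54.091796875 },
  { effort: 9, secSpent: 23.532, sizeKB: 53.9697265625 }
]

// Hypothetical helper: among results whose size is within `tolerance`
// (a fraction, e.g. 0.01 for 1%) of the smallest file, return the fastest.
function cheapestWithinTolerance (results, tolerance) {
  const smallest = Math.min(...results.map((r) => r.sizeKB))
  const limit = smallest * (1 + tolerance)
  return results
    .filter((r) => r.sizeKB <= limit)
    .reduce((best, r) => (r.secSpent < best.secSpent ? r : best))
}

console.log(cheapestWithinTolerance(avifPhotoResults, 0.01).effort) // 1
```

With a 1% tolerance only effort 0, 1 and 2 qualify, and effort 1 is the fastest of them, which fits the observation that higher effort does not pay off for this photo.
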
alexmacarthur commented 7 months ago

It doesn't seem like the resource demands for the AVIF conversion process have lessened much, but if anyone has any ideas on how I can improve performance, I'd love to hear them.

VM details:

[Image]

Here's what converting to AVIF is doing:

[Image: Screen Shot 2024-01-31 at 9 50 11 PM]