lovell / sharp

High performance Node.js image processing, the fastest module to resize JPEG, PNG, WebP, AVIF and TIFF images. Uses the libvips library.
https://sharp.pixelplumbing.com
Apache License 2.0

Leak on Linux? #955

Closed: asilvas closed this issue 6 years ago

asilvas commented 7 years ago

I've been troubleshooting a leak in https://github.com/asilvas/node-image-steam (which processes millions of images every day). I originally thought it was in my project, but after a number of heap dump checks I determined it wasn't a leak in V8.

To break it down into its simplest parts, I recorded the traffic in serial form so it can be replayed by a pure sharp script.

https://gist.github.com/asilvas/474112440535051f2608223c8dc2fcdf

npm i sharp request

curl https://gist.githubusercontent.com/asilvas/474112440535051f2608223c8dc2fcdf/raw/be4e593c6820c0246acf2dc9604012653d71c353/sharp.js > sharp.js
curl https://gist.githubusercontent.com/asilvas/474112440535051f2608223c8dc2fcdf/raw/be4e593c6820c0246acf2dc9604012653d71c353/sharp.log > sharp.log

node sharp.js http://img1.wsimg.com/isteam sharp.log

It downloads these files on the fly, which avoids any FS caching that would bloat memory usage, and forwards the instructions (in sharp.log) directly to sharp, one at a time.

Memory usage gets to 500MB+ within a few minutes (at least on Docker+CentOS) and seems to eventually peak. On some systems I've seen over 2GB usage. Processing only a single image at a time should keep memory usage pretty flat. Have you seen this before? Any ideas? I wasn't aware of anything sharp/vips was doing that should trigger Linux's file caching.

Edit: While memory usage on Mac is still higher than I'd expect for a single image processed at a time (~160MB after a couple hundred images), it's nowhere near as high as on Linux, and it seems to peak quickly. So it appears to be a Linux-only issue. Docker is also involved, so I'm not ruling that out either.
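
For anyone following along, one way to watch numbers like these from inside the Node process is to sample process.memoryUsage() periodically; a minimal sketch (not part of the gist above):

```
// A minimal sketch: sample RSS alongside V8 heap usage while the replay
// runs, to see how much memory sits outside the heap.
const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(1) + 'MB';

setInterval(() => {
  const { rss, heapUsed, external } = process.memoryUsage();
  console.log(`rss=${toMB(rss)} heapUsed=${toMB(heapUsed)} external=${toMB(external)}`);
}, 5000).unref(); // unref so the timer doesn't keep the process alive
```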

lovell commented 7 years ago

Hello, how is memory usage being measured? If RSS, please remember this includes free memory that has not (yet) been returned to the OS, which explains why different OSs report different RSS for the same task.

If you've not seen them, there have been quite a few related questions previously: https://github.com/lovell/sharp/search?utf8=%E2%9C%93&q=rss+%22returned+to+the+OS%22&type=Issues

asilvas commented 7 years ago

Yes, RSS being the main indicator. But I also rely on buff/cache and available memory to better understand "free-able" memory, and this indicates that this memory is never released back. This memory doesn't seem to reside in the V8 memory space, as indicated by dozens of heap dump tests.

Thanks for the links to the other issues that seem connected. I'm not entirely sure the issue is fully understood, though. I'm not convinced it's an issue with sharp either, but I'm hoping we can work around the problem, as the impact is quite significant: we're using ~5x the memory we should be, which becomes a big deal when you're serving (many) millions of requests/day.

I'll continue investigating the related cases. So far the only workaround I've found is to avoid using toBuffer.
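
For context, toBuffer holds the full encoded result in a Node Buffer, while file (or stream) output avoids retaining it; a minimal sketch of the two output shapes, with placeholder file names:

```
const sharp = require('sharp');

async function compareOutputs() {
  // Buffer output: the encoded image stays in Node memory until the
  // Buffer can be garbage collected.
  const buf = await sharp('input.jpg').resize(320).toBuffer();
  console.log('buffer length:', buf.length);

  // File output: the encoded bytes are written straight to disk rather
  // than being retained as one large Buffer.
  await sharp('input.jpg').resize(320).toFile('output.jpg');
}

compareOutputs().catch(console.error);
```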

lovell commented 7 years ago

It's worth subscribing to https://github.com/nodejs/node/issues/1671 for V8 updates that will improve GC of Buffer objects.

If you're not already doing so, you might want to experiment with a different memory allocator such as jemalloc. You'll probably see less fragmentation, but that's still dealing with the effect rather than the cause.

asilvas commented 7 years ago

Will do, thanks.

Probably no surprise, but I was at least able to correlate usage with the concurrency setting.

In my isolated (sharp-only) test these were the findings:

0 concurrency: 276MB (should be detecting 4 in my local setup)
1 concurrency: 190MB
4 concurrency: 276MB
8 concurrency: 398MB (our prod env)
16 concurrency: 500MB
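
For reference, a comparison like this can be scripted against sharp's global concurrency setting; a minimal sketch, with test.jpg and the iteration count as placeholders:

```
const sharp = require('sharp');

// 0 restores the default thread pool size (one thread per detected CPU core).
sharp.concurrency(Number(process.argv[2]) || 0);

async function run() {
  for (let i = 0; i < 500; i++) {
    await sharp('test.jpg').resize(800, 600).jpeg().toBuffer();
  }
  const rssMB = Math.round(process.memoryUsage().rss / 1024 / 1024);
  console.log(`concurrency=${sharp.concurrency()} rss=${rssMB}MB`);
}

run().catch(console.error);
```
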
lovell commented 7 years ago

Does the prod environment use 8 real CPU cores or is this "vCPU" hyper-threading? If the latter, perhaps also experiment halving concurrency to improve throughput (and reduce the memory effects).

asilvas commented 7 years ago

I am artificially limiting cores to keep memory in check, but at the cost of up to 30% slower response times. Temporary test.

asilvas commented 7 years ago

Feel free to close this as a duplicate of the others. But from everything I've learned of the problem thus far, there doesn't seem to be any conclusive evidence that this is a Node and/or V8 issue. From the symptoms of the isolated tests I've run (as well as others), it does seem to be an issue with sharp or vips, as run-away memory increases of this nature aren't a common problem in the Node community. I was able to verify this is not a case of GC'd memory not being released back to the OS, confirming that this memory was in use and that any reduction in available memory resulted in memory allocation failures. But as I said, nothing conclusive either way.

I tried to investigate ways to resolve/work around the problem within sharp but was unsuccessful -- hopefully someone with more expertise with V8 native modules will have better luck.

lovell commented 7 years ago

"I was able to verify this is not a case of GC'd memory not being released back to the OS"

Could memory fragmentation explain this?

asilvas commented 7 years ago

I haven't proven/disproven that theory. But with the modest number of objects being processed to achieve such high memory usage, it'd seem to require some pretty severe fragmentation to justify this.

Is your thinking that it's fragmentation in V8 or in native space?

lovell commented 7 years ago

Are you using Node 8? If not, do you see the same RSS levels with it?

Were you able to try jemalloc? It provides useful debugging via malloc_stats_print.

Given you're using CentOS, have you tried disabling transparent huge pages?

asilvas commented 7 years ago

Not using 8 in prod, but yes was able to reproduce similar RSS levels (with 8.5.0) in the isolated test.

Might be a bit before I can look into the other options but will keep in mind, thanks.

lovell commented 7 years ago

If sharpen or blur operations are being used then the small leak fixed in https://github.com/jcupitt/libvips/issues/771 may be related here.

asilvas commented 7 years ago

The test sample for this topic doesn't use those two operations so probably unrelated. But we do use them on occasion, thanks!

trev-dev commented 6 years ago

I've found that running my sharp modules in a child_process spawn that exits once it's completed works really well for me. It keeps the memory load down.
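
For anyone curious, that pattern looks roughly like the sketch below (file names and arguments are hypothetical); because the child exits after each job, the OS reclaims all of its memory:

```
// parent.js -- hand each resize job to a short-lived child process.
const { fork } = require('child_process');

function resizeInChild(input, output, width) {
  return new Promise((resolve, reject) => {
    const child = fork('./resize-worker.js', [input, output, String(width)]);
    child.on('error', reject);
    child.on('exit', (code) =>
      code === 0 ? resolve() : reject(new Error(`worker exited with code ${code}`))
    );
  });
}

// resize-worker.js -- perform a single resize, then exit.
const sharp = require('sharp');
const [input, output, width] = process.argv.slice(2);

sharp(input)
  .resize(parseInt(width, 10))
  .toFile(output)
  .then(() => process.exit(0))
  .catch((err) => { console.error(err); process.exit(1); });
```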

asilvas commented 6 years ago

Was hoping to avoid spawning a child process, but it is something I had considered as well. It's manageable at the moment so holding out for now.

lovell commented 6 years ago

@asilvas The sharp tests just revealed a memory leak on one possible libvips error path when using toBuffer (or pipe output) with JPEG output - see https://github.com/jcupitt/libvips/pull/835

asilvas commented 6 years ago

Excellent find (and fix), @lovell! Thanks, these sorts of fixes make a big difference when processing millions of images. Any idea when this fix will be available?

lovell commented 6 years ago

@asilvas The next libvips v8.6.1 patch release should contain this fix, which then allows the release of sharp v0.19.0.

kishorgandham commented 6 years ago

libvips v8.6.1 is out https://github.com/jcupitt/libvips/releases/tag/v8.6.1

lovell commented 6 years ago

@asilvas Are you seeing an improvement with the latest libvips/sharp?

asilvas commented 6 years ago

In testing, will let you know next week.

vinerz commented 6 years ago

I am currently having the same issue.

Stack information: Heroku on Ubuntu Server 16.04.3, Node 9.5.0

Library versions: LIBVIPS=8.6.2, LIBWEBP=0.6.1, LIBTIFF=4.0.9, LIBSVG=2.42.2, LIBGIF=5.1.4

sharp version: 0.19.0

I also use a lot of JPEG toBuffer.

(two screenshots of memory usage, 2018-02-07, omitted)

Even under stress the V8 heap doesn't change, but the non-heap memory grows consistently on every request.

Before the update my memory usage was steady at around 300MB using libvips 7.42.3 and sharp 0.17.1.

asilvas commented 6 years ago

Seeing similar results after ~24 hours in production:

(memory usage graphs omitted)

Overall memory usage patterns seem to be a bit improved, but still far higher than I'd expect (eventually reaching 2GB, perhaps related to prior suggestions). I have noticed some perf improvements overall, though that could be due to the relatively short life of the new containers. I might revisit doing some more memory profiling at some point, but it'll have to wait for now.

I'll let you know if any new data surfaces.

vinerz commented 6 years ago

@asilvas would you mind sharing the throughput of your servers and whether they serve many different images?

asilvas commented 6 years ago

@vinerz We generate over 20 million images per day, from millions of source images. Overall throughput is much higher, but that part is unrelated to this topic. Powered by https://github.com/asilvas/node-image-steam, and of course sharp+libvips.

vinerz commented 6 years ago

Thanks for the answer!

Based on this information, I can see that my leakage is growing much, much faster than yours, even though I'm manipulating only 200 thousand images per day from around 50 thousand different sources.

It might be related to the fact that I use toBuffer a lot in my code due to filters / resizing chains.

I'll try disabling libvips cache to see what happens.

asilvas commented 6 years ago

We use toBuffer for the final result of every image as well: https://github.com/asilvas/node-image-steam/blob/b96c3d39bc7b125f552b1cef0d1dfa05be3b488e/lib/processor/processor.js#L103

Our sharp options include:

cache: false
concurrency: 4
simd: true

I've toyed with options quite a bit in the past, but might be worth revisiting with the recent changes/fixes.
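
For reference, if you're calling sharp directly, those options map to its global settings; a minimal sketch using the values listed above:

```
const sharp = require('sharp');

sharp.cache(false);   // disable libvips' operation cache
sharp.concurrency(4); // cap libvips' thread pool at 4 threads
sharp.simd(true);     // enable SIMD (liborc) vector acceleration where available
```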

vinerz commented 6 years ago

I modified the core app flow to use a single sharp object and changed the whole toBuffer chain to a single stream piped directly to the Express response, but I am getting the same memory results. It might be related to something else.

Currently using cache: false and concurrency: 2
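
For comparison, the piped approach described above has roughly this shape (a sketch assuming Express; the route, source directory and resize parameters are placeholders):

```
const express = require('express');
const fs = require('fs');
const sharp = require('sharp');

const app = express();

app.get('/image/:name', (req, res) => {
  // sharp() with no input acts as a transform stream:
  // source file -> sharp -> HTTP response, with no intermediate toBuffer.
  // (Sketch only: no input validation on req.params.name.)
  const transformer = sharp().resize(800).jpeg();
  res.type('image/jpeg');
  fs.createReadStream(`./images/${req.params.name}`)
    .on('error', () => res.sendStatus(404))
    .pipe(transformer)
    .pipe(res);
});

app.listen(3000);
```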

lovell commented 6 years ago

@asilvas Thank you for the detailed updates!

@vinerz Your comments mention "Node 9.5.0" and "Before the update... using libvips 7.42.3 and sharp 0.17.1". I suspect you were using a different version of Node "before the update" too. If so, does returning to the previous version make any difference?

mrbatista commented 6 years ago

@lovell same problem with version 0.20.2 on Heroku (stack heroku-16), Node 8.11.2. Currently using cache: false and concurrency: 2, and using sharp with pipe. The RSS increases quickly and is never released back. The same application on my Mac releases RSS correctly.

(screenshot of memory usage, 2018-05-18, omitted)

P.S. The decrease in RSS in the attached image is due to an application restart.

asilvas commented 6 years ago

Moving to alpine has pretty much resolved our memory issues.

lovell commented 6 years ago

@mrbatista Please remember RSS includes free memory that has not (yet) been returned to the OS. This explains why different operating systems with different memory allocators report different RSS for the same task.

@asilvas Glad to hear that Alpine's use of the musl memory allocator reduces fragmentation and involves returning free memory to the OS.

nicmosc commented 6 years ago

@mrbatista I'm having the exact same issue as you (RSS is fine on Mac, overflows on Heroku), though I'm not using pipe nor setting concurrency. From what I've been reading it looks like it could be related to the toBuffer method. Are you using that anywhere by any chance? I've tried pretty much everything to stop this from happening on the deployed app, but with no success. I'm using:

node: 9.8.0 sharp: 0.20.3

And I'm also running the node process with these options: "start": "node --expose_gc --max_semi_space_size=2 --max_old_space_size=256 src/index.js"

vinerz commented 6 years ago

Hey guys, I've upgraded my stack to:

The leak is much less intense, but memory usage is still 3 times higher than it was prior to this issue.

(screenshot omitted)

asilvas commented 6 years ago

@vinerz for comparison, this is what image-steam looks like on Alpine. image-steam is a full web API for image processing, much heavier than sharp by itself, and it also depends on toBuffer. Additionally, usage is normally only around 100MB, so the graph below actually reflects a known small memory leak in image-steam (that will be fixed soon), not sharp.

(graph omitted)

And this is a heavy-traffic system: millions of images per day.

lovell commented 6 years ago

@nicmosc @vinerz RSS reaching a plateau but not hitting OOM is indicative of free memory not being released back to the OS. It is not indicative of a leak.

Using a different memory allocator will help, as @asilvas has clearly proven (thank you!).

bradleyayers commented 6 years ago

I too observe no out of control RSS growth on my local macOS computer, but do see it in production on Heroku.

I dug into this more and found that I'm able to reproduce the RSS growth using the heroku/heroku:16 Docker image locally. I've compared it to a node/node image, where RSS is well behaved. I went so far as to use the same pre-compiled version of Node in both images (even though node/node comes with one) and the same node_modules install (mounted /app to my local machine and re-used the same yarn install):

wget https://nodejs.org/dist/v10.7.0/node-v10.7.0-linux-x64.tar.xz
./node-v10.7.0-linux-x64/bin/node src/index.js

I compared the diff of ldd but I see nothing apparent:

heydovetail/node:node-10.3.0-yarn-1.7.0 ``` $ ldd /app/node-v10.7.0-linux-x64/bin/node linux-vdso.so.1 (0x00007fffe61b8000) libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f9958e4e000) librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f9958c46000) libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f995893b000) libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f995863a000) libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f9958424000) libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f9958207000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9957e5c000) /lib64/ld-linux-x86-64.so.2 (0x00007f9959052000) $ ldd /app/node_modules/sharp/build/Release/sharp.node linux-vdso.so.1 (0x00007fffc6d0b000) libvips-cpp.so.42 => /app/node_modules/sharp/build/Release/../../vendor/lib/libvips-cpp.so.42 (0x00007f2bfe341000) libvips.so.42 => /app/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42 (0x00007f2bfdd11000) libglib-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libglib-2.0.so.0 (0x00007f2bfd99c000) libgobject-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0 (0x00007f2bfd747000) libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f2bfd43c000) libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f2bfd13b000) libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f2bfcf25000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2bfcb7a000) libpng16.so.16 => /app/node_modules/sharp/build/Release/../../vendor/lib/libpng16.so.16 (0x00007f2bfc941000) libz.so.1 => /app/node_modules/sharp/build/Release/../../vendor/lib/libz.so.1 (0x00007f2bfc728000) libtiff.so.5 => /app/node_modules/sharp/build/Release/../../vendor/lib/libtiff.so.5 (0x00007f2bfc4b2000) libjpeg.so.8 => /app/node_modules/sharp/build/Release/../../vendor/lib/libjpeg.so.8 (0x00007f2bfc236000) libgmodule-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgmodule-2.0.so.0 (0x00007f2bfc033000) librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f2bfbe2b000) libexpat.so.1 => /app/node_modules/sharp/build/Release/../../vendor/lib/libexpat.so.1 (0x00007f2bfbbff000) libgsf-1.so.114 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgsf-1.so.114 (0x00007f2bfb9b5000) libxml2.so.2 => /app/node_modules/sharp/build/Release/../../vendor/lib/libxml2.so.2 (0x00007f2bfb6cb000) liborc-0.4.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/liborc-0.4.so.0 (0x00007f2bfb434000) liblcms2.so.2 => /app/node_modules/sharp/build/Release/../../vendor/lib/liblcms2.so.2 (0x00007f2bfb1cc000) libgif.so.7 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgif.so.7 (0x00007f2bfafc3000) librsvg-2.so.2 => /app/node_modules/sharp/build/Release/../../vendor/lib/librsvg-2.so.2 (0x00007f2bfaca5000) libgio-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgio-2.0.so.0 (0x00007f2bfa8da000) libgdk_pixbuf-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgdk_pixbuf-2.0.so.0 (0x00007f2bfa6ae000) libcairo.so.2 => /app/node_modules/sharp/build/Release/../../vendor/lib/libcairo.so.2 (0x00007f2bfa3b7000) libwebpmux.so.3 => /app/node_modules/sharp/build/Release/../../vendor/lib/libwebpmux.so.3 (0x00007f2bfa1ac000) libwebp.so.7 => /app/node_modules/sharp/build/Release/../../vendor/lib/libwebp.so.7 (0x00007f2bf9f18000) libexif.so.12 => /app/node_modules/sharp/build/Release/../../vendor/lib/libexif.so.12 (0x00007f2bf9cd2000) libpthread.so.0 => 
/lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f2bf9ab5000) libffi.so.6 => /app/node_modules/sharp/build/Release/../../vendor/lib/libffi.so.6 (0x00007f2bf98ad000) /lib64/ld-linux-x86-64.so.2 (0x00007f2bfe7a5000) libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f2bf96a9000) libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f2bf9492000) libpangocairo-1.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libpangocairo-1.0.so.0 (0x00007f2bf9284000) libpangoft2-1.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libpangoft2-1.0.so.0 (0x00007f2bf906d000) libharfbuzz.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libharfbuzz.so.0 (0x00007f2bf8dc6000) libpango-1.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libpango-1.0.so.0 (0x00007f2bf8b74000) libgthread-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgthread-2.0.so.0 (0x00007f2bf8973000) libpixman-1.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libpixman-1.so.0 (0x00007f2bf86b5000) libfontconfig.so.1 => /app/node_modules/sharp/build/Release/../../vendor/lib/libfontconfig.so.1 (0x00007f2bf8468000) libfreetype.so.6 => /app/node_modules/sharp/build/Release/../../vendor/lib/libfreetype.so.6 (0x00007f2bf81b1000) libcroco-0.6.so.3 => /app/node_modules/sharp/build/Release/../../vendor/lib/libcroco-0.6.so.3 (0x00007f2bf7f75000) ```
heroku/heroku:16 ``` # ldd /app/node-v10.7.0-linux-x64/bin/node linux-vdso.so.1 => (0x00007fff73758000) libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f0b5df86000) librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f0b5dd7e000) libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f0b5d9fc000) libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f0b5d6f3000) libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f0b5d4dd000) libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f0b5d2c0000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0b5cef6000) /lib64/ld-linux-x86-64.so.2 (0x00007f0b5e18a000) # ldd /app/node_modules/sharp/build/Release/sharp.node linux-vdso.so.1 => (0x00007ffeb45b9000) libvips-cpp.so.42 => /app/node_modules/sharp/build/Release/../../vendor/lib/libvips-cpp.so.42 (0x00007f9fb2439000) libvips.so.42 => /app/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42 (0x00007f9fb1e09000) libglib-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libglib-2.0.so.0 (0x00007f9fb1a94000) libgobject-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0 (0x00007f9fb183f000) libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f9fb14bd000) libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f9fb11b4000) libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f9fb0f9e000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9fb0bd4000) libpng16.so.16 => /app/node_modules/sharp/build/Release/../../vendor/lib/libpng16.so.16 (0x00007f9fb099b000) libz.so.1 => /app/node_modules/sharp/build/Release/../../vendor/lib/libz.so.1 (0x00007f9fb0782000) libtiff.so.5 => /app/node_modules/sharp/build/Release/../../vendor/lib/libtiff.so.5 (0x00007f9fb050c000) libjpeg.so.8 => /app/node_modules/sharp/build/Release/../../vendor/lib/libjpeg.so.8 (0x00007f9fb0290000) libgmodule-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgmodule-2.0.so.0 (0x00007f9fb008d000) librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f9fafe85000) libexpat.so.1 => /app/node_modules/sharp/build/Release/../../vendor/lib/libexpat.so.1 (0x00007f9fafc59000) libgsf-1.so.114 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgsf-1.so.114 (0x00007f9fafa0f000) libxml2.so.2 => /app/node_modules/sharp/build/Release/../../vendor/lib/libxml2.so.2 (0x00007f9faf725000) liborc-0.4.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/liborc-0.4.so.0 (0x00007f9faf48e000) liblcms2.so.2 => /app/node_modules/sharp/build/Release/../../vendor/lib/liblcms2.so.2 (0x00007f9faf226000) libgif.so.7 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgif.so.7 (0x00007f9faf01d000) librsvg-2.so.2 => /app/node_modules/sharp/build/Release/../../vendor/lib/librsvg-2.so.2 (0x00007f9faecff000) libgio-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgio-2.0.so.0 (0x00007f9fae934000) libgdk_pixbuf-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgdk_pixbuf-2.0.so.0 (0x00007f9fae708000) libcairo.so.2 => /app/node_modules/sharp/build/Release/../../vendor/lib/libcairo.so.2 (0x00007f9fae411000) libwebpmux.so.3 => /app/node_modules/sharp/build/Release/../../vendor/lib/libwebpmux.so.3 (0x00007f9fae206000) libwebp.so.7 => /app/node_modules/sharp/build/Release/../../vendor/lib/libwebp.so.7 (0x00007f9fadf72000) libexif.so.12 => /app/node_modules/sharp/build/Release/../../vendor/lib/libexif.so.12 (0x00007f9fadd2c000) libpthread.so.0 => 
/lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f9fadb0f000) libffi.so.6 => /app/node_modules/sharp/build/Release/../../vendor/lib/libffi.so.6 (0x00007f9fad907000) /lib64/ld-linux-x86-64.so.2 (0x00007f9fb289d000) libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f9fad703000) libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f9fad4e8000) libpangocairo-1.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libpangocairo-1.0.so.0 (0x00007f9fad2da000) libpangoft2-1.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libpangoft2-1.0.so.0 (0x00007f9fad0c3000) libharfbuzz.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libharfbuzz.so.0 (0x00007f9face1c000) libpango-1.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libpango-1.0.so.0 (0x00007f9facbca000) libgthread-2.0.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libgthread-2.0.so.0 (0x00007f9fac9c9000) libpixman-1.so.0 => /app/node_modules/sharp/build/Release/../../vendor/lib/libpixman-1.so.0 (0x00007f9fac70b000) libfontconfig.so.1 => /app/node_modules/sharp/build/Release/../../vendor/lib/libfontconfig.so.1 (0x00007f9fac4be000) libfreetype.so.6 => /app/node_modules/sharp/build/Release/../../vendor/lib/libfreetype.so.6 (0x00007f9fac207000) libcroco-0.6.so.3 => /app/node_modules/sharp/build/Release/../../vendor/lib/libcroco-0.6.so.3 (0x00007f9fabfcb000) ```

Any suggestions what else I should be looking at to understand the behaviour in the heroku:16 image?

Update: The same RSS growth behaviour occurs in heroku:18 as in heroku:16. Here's the memory information after resizing a 10MB image ~100 times. The space isn't consumed by the heap; instead it's elsewhere in the RSS. Presumably allocations by libvips?

``` { "heapSpaceStatistics": [ { "space_name": "read_only_space", "space_size": 0, "space_size_si": "0 B", "space_used_size": 0, "space_used_size_si": "0 B", "space_available_size": 0, "space_available_size_si": "0 B", "physical_space_size": 0, "physical_space_size_si": "0 B" }, { "space_name": "new_space", "space_size": 1048576, "space_size_si": "1.0 MB", "space_used_size": 946856, "space_used_size_si": "946.9 kB", "space_available_size": 84312, "space_available_size_si": "84.3 kB", "physical_space_size": 964224, "physical_space_size_si": "964.2 kB" }, { "space_name": "old_space", "space_size": 357310464, "space_size_si": "357.3 MB", "space_used_size": 339528624, "space_used_size_si": "339.5 MB", "space_available_size": 11278200, "space_available_size_si": "11.3 MB", "physical_space_size": 356260688, "physical_space_size_si": "356.3 MB" }, { "space_name": "code_space", "space_size": 4194304, "space_size_si": "4.2 MB", "space_used_size": 3620992, "space_used_size_si": "3.6 MB", "space_available_size": 301632, "space_available_size_si": "301.6 kB", "physical_space_size": 4051456, "physical_space_size_si": "4.1 MB" }, { "space_name": "map_space", "space_size": 5267456, "space_size_si": "5.3 MB", "space_used_size": 3022976, "space_used_size_si": "3.0 MB", "space_available_size": 2147856, "space_available_size_si": "2.1 MB", "physical_space_size": 4759832, "physical_space_size_si": "4.8 MB" }, { "space_name": "large_object_space", "space_size": 25337856, "space_size_si": "25.3 MB", "space_used_size": 23878576, "space_used_size_si": "23.9 MB", "space_available_size": 1133559296, "space_available_size_si": "1.1 GB", "physical_space_size": 25337856, "physical_space_size_si": "25.3 MB" } ], "heapStatistics": { "total_heap_size": 393158656, "total_heap_size_si": "393.2 MB", "total_heap_size_executable": 5767168, "total_heap_size_executable_si": "5.8 MB", "total_physical_size": 391382888, "total_physical_size_si": "391.4 MB", "total_available_size": 1147362464, "total_available_size_si": "1.1 GB", "used_heap_size": 371006856, "used_heap_size_si": "371.0 MB", "heap_size_limit": 1526909922, "heap_size_limit_si": "1.5 GB", "malloced_memory": 16384, "malloced_memory_si": "16.4 kB", "peak_malloced_memory": 12772376, "peak_malloced_memory_si": "12.8 MB", "does_zap_garbage": 0, "does_zap_garbage_si": "0 B" }, "memoryUsage": { "rss": 990388224, "rss_si": "990.4 MB", "heapTotal": 393158656, "heapTotal_si": "393.2 MB", "heapUsed": 371010040, "heapUsed_si": "371.0 MB", "external": 408875, "external_si": "408.9 kB" } } ```
lovell commented 6 years ago

@bradleyayers Have you tried different memory allocators that return free RSS back to the OS?

bradleyayers commented 6 years ago

@lovell no I haven’t, I’ve never looked into that before so I’ll need to do some reading and see how I go. If you’ve got any tips that’d be great!

papandreou commented 6 years ago

They’re researching jemalloc in node core now: https://github.com/nodejs/node/issues/21973 🤗

danieler1981 commented 5 years ago

Has anyone solved the memory leak problem (high memory usage) with sharp on Heroku (Node.js)? At the moment I have not figured out how to keep memory from hitting 500MB after a few minutes of sharp usage.

mrbatista commented 5 years ago

@danieler1981 I solved it with a custom Docker build based on the Alpine Linux distro.

polarathene commented 5 years ago

@mrbatista Alpine can have some issues with sharp when using libvips <8.7.1 (sharp presently only provides 8.7.0). Are you building your own libvips/sharp? Or are you using Alpine's package via edge? Otherwise, there is potential for failure due to a bug; I can reproduce it via the using-gatsby-image starter project for Gatsby.js.

I'd be interested in your Dockerfile if it's doing anything special or building its own libvips/sharp.


I also have a project where, at least with the debian stretch Docker image, memory hits 2GB while offloading image processing to sharp and doesn't drop afterwards. It may simply not be released, but as my memory is close to system limits, it's a bit concerning that it could cause a crash (not a fan of the OOM killer's decision making on what to kill). I'd verify on Alpine, but it's segfaulting on this project (it wasn't earlier; I assume it's something to do with /tmp on the host, which it appears to share/write to? That's odd, since I would have thought it'd be isolated from my host /tmp). Presently /tmp has a lot of temporary content/directories from sharp left over from prior runs; I haven't been in a position to restart the host yet, which may stop the Alpine image from segfaulting.

I'm only processing 23 source images (150MB JPEGs), about 300 outputs in total. Is 2GB of RAM reasonable to reserve and hold onto afterwards for that?

asilvas commented 5 years ago

Alpine works great. We've spent quite some time diagnosing the memory issues and it's pretty clear it's not directly related to Sharp. https://github.com/lovell/sharp/issues/955#issuecomment-397734714

mrbatista commented 5 years ago

@polarathene I use Alpine edge.

gmichalec-pandora commented 5 years ago

For anyone else who ends up here - I can concur that changing the memory allocator resolves the leak issue. For whatever reason, I could not reproduce the leaks when developing on my Debian Linux machine, but when running in a Docker container I was seeing memory usage increase significantly with each upload. I fixed it by adding the following lines to my Debian-based Dockerfile:

RUN apt-get update && apt-get install --force-yes -yy \
  libjemalloc1 \
  && rm -rf /var/lib/apt/lists/*

# Change memory allocator to avoid leaks
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1
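
One way to confirm at runtime that the preload actually took effect is to look for jemalloc among the process's mapped libraries (a sketch; Linux-only):

```
const fs = require('fs');

// /proc/self/maps lists every shared object mapped into this process.
const maps = fs.readFileSync('/proc/self/maps', 'utf8');
console.log(maps.includes('jemalloc')
  ? 'jemalloc is loaded'
  : 'default allocator (glibc malloc) in use');
```
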
oxi-p commented 5 years ago

My app runs on Lambda, and I'm running into out-of-memory crashes. #1710

I tried to find a way to choose the memory allocator on Lambda, but no luck.

vinerz commented 5 years ago

UPDATE: One year later, I finally had time to rewrite the whole system core and took the opportunity to inject jemalloc in the same deploy. It really did the job and fixed the memory issues. Hooray!

For anyone wondering how to do that in Heroku, use the jemalloc buildpack.

I disabled jemalloc and the memory usage went through the roof again, so we can be certain that the memory allocator makes all the difference here.

We can finally downgrade our dynos!

wouterds commented 5 years ago

There's definitely still a memory leak. With every image processed by sharp, I see the memory usage of the container (Docker, Node Alpine) increasing until it eventually hits the limit and crashes. The container restarts by itself and the story repeats itself.

lovell commented 5 years ago

@wouterds The memory allocator in musl, upon which Alpine Linux is based, is generally considered very good in terms of lack of fragmentation and returning freed memory.

If you're using Node.js inside a memory-constrained container then please make sure you're using v12.7.0 or later to avoid running into the problems that https://github.com/nodejs/node/pull/27508 fixes in Node.js itself.

For me to investigate any claimed memory leak, I'll need a new issue with a complete, standalone code repo with everything required to consistently reproduce it.