Leak on Linux? - Githubissues

asilvas commented 7 years ago

Been troubleshooting a leak in https://github.com/asilvas/node-image-steam (processes millions of images every day) and originally thought it was in my project but after a number of heap dump checks I determined it wasn't a leak in V8.

In order to break it down the simplest parts I recorded the traffic in a serial form so it can be replayed in a pure sharp script.

https://gist.github.com/asilvas/474112440535051f2608223c8dc2fcdf

npm i sharp request

curl https://gist.githubusercontent.com/asilvas/474112440535051f2608223c8dc2fcdf/raw/be4e593c6820c0246acf2dc9604012653d71c353/sharp.js > sharp.js
curl https://gist.githubusercontent.com/asilvas/474112440535051f2608223c8dc2fcdf/raw/be4e593c6820c0246acf2dc9604012653d71c353/sharp.log > sharp.log

node sharp.js http://img1.wsimg.com/isteam sharp.log

It's downloading these files on the fly, which avoids any FS caching which will bloat memory usage, and forwards the instructions (in sharp.log) directly to sharp, one at a time.

Memory usage gets into 500MB+ within a few mins (at least on Docker+CentOS), and seems to eventually peak. On some systems I've seen over 2GB usage. Only processing a single image at a time should be pretty flat in memory usage. Have you seen this before? Any ideas? I wasn't aware of anything sharp/vips was doing that should be triggering Linux's file caching.

Edit: While memory usage on Mac is still higher than I expect for a single image processed at a time (~160MB) after a couple hundred images, it's nowhere near as high as on Linux.. And it seems to peak quickly. So it appears to be a linux only issue. Docker is also involved, so not ruling that out either.

wouterds commented 5 years ago

For me to investigate any claimed memory leak, I'll need a new issue with a complete, standalone code repo with everything required to consistently reproduce it.

I'm going to double-check this and suggestions above and if problem persist create an example repository that demonstrates the issue. Thanks!

cablespaghetti commented 5 years ago

I switched from Debian to Alpine (and admittedly Node 11 to 12):

Thanks to everyone who suggested this as a fix!

egekhter commented 4 years ago

Sharp is very fast but there's an obvious memory problem. As I send items to be processed to a worker, the memory never gets released at end of processing.

On a C5.x4large, I spawn 30 child processes. CPU ranges from 10%-30%. Memory however keeps growing until I run out of 32GB of ram in about 2 minutes.

I'll investigate and see if I can come back with a solution.

vinerz commented 4 years ago

@egekhter I can confirm that using jemalloc as allocator solves the problem.

egekhter commented 4 years ago

@egekhter I can confirm that using jemalloc as allocator solves the problem.

Thanks, it does seem to help a bit, but the problem is not solved on my end. I installed jemalloc and set the "LD_PRELOAD" environment variable and it kept memory usage lower for longer, however it still keeps climbing. If I stop sending new items for processing to my workers, the memory usage of the entire server does not decrease.

egekhter commented 4 years ago

Another update.

I originally switched from Jimp to Sharp as CPU usage was too high.

I noticed CPU usage went down but memory usage went up.

I switched from Sharp to GM.

With the only change being switching out the modules and changing the function calls (i.e from extract to crop), my used memory decreased from 32 GB to a stable 3GB.

There's not much we can do as a user from here.

lovell commented 4 years ago

@egekhter Whilst Node.js Worker Threads will probably help the single-threaded world of jimp, they won't offer much to help the multi-threaded world of sharp/libvips and can cause greater heap fragmentation faster.

30x worker threads each spawning 4x libuv threads each spawning 16x (c5.4xlarge vCPU) libvips threads is a concurrency of 1920 threads all allocating/freeing memory from the same pool.

If you'd like to "manage" concurrency via Worker Threads, then try setting sharp.concurrency to 1 so libvips doesn't also try to do so.

egekhter commented 4 years ago

@egekhter Whilst Node.js Worker Threads will probably help the single-threaded world of jimp, they won't offer much to help the multi-threaded world of sharp/libvips and can cause greater heap fragmentation faster.

30x worker threads each spawning 4x libuv threads each spawning 16x (c5.4xlarge vCPU) libvips threads is a concurrency of 1920 threads all allocating/freeing memory from the same pool.

If you'd like to "manage" concurrency via Worker Threads, then try setting sharp.concurrency to 1 so libvips doesn't also try to do so.

Solved my problem with your help.

sharp.concurrency(1);

Up to 70x child processes and still have plenty of memory left.

Thanks for all your work!

lovell commented 3 years ago

Please see https://github.com/lovell/sharp/issues/2607 for a change in the default concurrency for glibc-based Linux users that will be in v0.28.0.

FoxxMD commented 2 years ago

not related...but @vinerz what application were you using to monitor memory in this comment? Is that heroku's dashboard?

vinerz commented 2 years ago

not related...but @vinerz what application were you using to monitor memory in this comment? Is that heroku's dashboard?

Hey @FoxxMD, that's New Relic's application monitor for node 😄

alcidesbsilvaneto commented 1 year ago

I switched from Debian to Alpine (and admittedly Node 11 to 12):

Thanks to everyone who suggested this as a fix!

Still the solution for me in 2023.

nandi95 commented 1 year ago

Went from:

to:

It's a definite improvement.

rambo-panda commented 9 months ago

const sharp = require("sharp");
(async () => {
    await Promise.all([...Array(100)].map(() => sharp("./test.webp").rotate(90).toBuffer()));
})();

use libjemalloc.so cache:true concurrency : 1
use libjemalloc.so cache:false concurrency : 1
use libjemalloc.so cache:true concurrency : 16
use libjemalloc.so cache:false concurrency : 16

use glibc cache: false concurrency:1
use glibc cache:true concurrency:1
use glibc cache:false concurrency : 16
use glibc cache: true concurrency: 16

lovell / sharp

Leak on Linux? #955