pierrec / node-lz4

LZ4 fast compression algorithm for NodeJS
MIT License
438 stars 98 forks source link

Memory Leaks #73

Open melancthon opened 5 years ago

melancthon commented 5 years ago

The following code generates memory leaks and eventually crashes.

I've tried it with node v11.10.0 and v10.15.1LTS on both Windows 7 SP1 and CentOS 7.

Any ideas?

const lz4 = require("lz4")

for(let i=0;i< 10000; i++){
    let a = lz4.decode(lz4.encode("0123456789")) // memory leak!!
    if(i%100==0){
        let res = process.memoryUsage()
        console.log("mem:", res.external.toLocaleString(), res.heapTotal.toLocaleString(), res.heapUsed.toLocaleString(), res.rss.toLocaleString())
    }
}
ankon commented 4 years ago

Small data point here: By using the useJS option the leak seems to be considerably smaller.

const lz4 = require(".");

const opts = {useJS: true};

for (let i = 0; i < 10000; i++) {
    let a = lz4.decode(lz4.encode("0123456789", opts), opts); // memory leak!!
    if (i % 100 == 0) {
        let res = process.memoryUsage();
        console.log(
            "mem:",
            res.external.toLocaleString(),
            res.heapTotal.toLocaleString(),
            res.heapUsed.toLocaleString(),
            res.rss.toLocaleString()
        );
    }
}
ankon commented 4 years ago

I extended the test tool a bit as I found the initial output very scary, but after reviewing the binding code here and playing with some theories I think there is no real leak here -- rather what happens is that NodeJS/V8 figure out that running a GC isn't needed in this example.

The extended test code:

try {
    const gc = require('gc-stats')();
    gc.on('stats', stats => {
        const { gctype, pauseMS, diff: {usedHeapSize}} = stats;
        console.log(`gc ${gctype}: ${usedHeapSize}\t(${pauseMS}ms)`);
    });
} catch (err) {
    // Fine.
}

const lz4 = require('.');

const opts = {useJS: process.env.USE_JS === 'true'};
const iterations = Number(process.argv[2]) || 10000;
const forceGc = process.env.FORCE_GC === 'true';
const taskDelay = typeof process.env.TASK_DELAY !== undefined ? Number(process.env.TASK_DELAY) : undefined;

function runGc() {
    // Run with --expose-gc to make this available
    if (global.gc) {
        global.gc();
    }
}

function printStats(prefix) {
    let res = process.memoryUsage();
    console.log(
        `mem (${prefix}):\t`,
        [
            res.external.toLocaleString(),
            res.heapTotal.toLocaleString(),
            res.heapUsed.toLocaleString(),
            res.rss.toLocaleString(),
            res.arrayBuffers.toLocaleString(),
        ].map(s => s.padStart(12)).join('\t')
    );
}

function delay(ms) {
    return new Promise(resolve => {
        setTimeout(resolve, ms);
    });
}

function test() {
    let encoded = lz4.encode('0123456789', opts);
    let decoded = lz4.decode(encoded, opts); // memory leak?
}

async function main(what, iterations) {
    let iteration = 1;
    do {
        console.log(`Running iteration ${iteration}`);
        runGc();
        printStats('initial');
        for (let i = 0; i < iterations; i++) {
            // Run the code-under-test
            what();

            // Do some statistics
            if (i % 100 == 0) {
                printStats('before');
                if (forceGc) {
                    runGc();
                    printStats('after');
                }
                if (typeof taskDelay !== 'undefined') {
                    await delay(taskDelay);
                }
            }
        }
        printStats('final');
        await delay(5000);
    } while (iteration++);
}

main(test, iterations);

Some things to try:

  1. Limit the memory using --max-old-space-size=5. This works both with the native bindings and the JS implementation. Anything lower and NodeJS crashes immediately for me (Linux/x86_64)
  2. Run an event loop iteration every few test iterations by setting TASK_DELAY=0 (or any other integer value): This immediately shows that GCs are running when the gc-stats package is installed, and the memory statistics start behaving very differently.
  3. Try to force a GC every few test iterations (using FORCE_GC=true): This doesn't change much unless there is a delay as well. Probably because the GC is triggered as a part of the event loop?
  4. Just do nothing and let it run multiple iterations: Even just that shows that GCs are happening the moment the test waits for a while, and essentially things continue to run forever.

Basically: the leak here is a leak because the test loop is so tight that NodeJS/V8 simply don't run a GC.