Daninet / hash-wasm

Lightning fast hash functions using hand-tuned WebAssembly binaries
https://npmjs.com/package/hash-wasm

OOM when run benchmark #14

Closed bytemain closed 3 years ago

bytemain commented 3 years ago

benchmark code: https://gist.github.com/lengthmin/86c4cf3b02d4280aafad71e0f1b04376

When I run the above benchmark, it raises an OOM error.

Even a bare while(1) md5(_10k); raises OOM as well.

bytemain commented 3 years ago

This is on macOS 11.

Both Node 12 and Node 14, with both blake3 and md5, raise the OOM error.

Maybe there is a memory leak.

Daninet commented 3 years ago

I traced it back to a memory leak in Node.js's Buffer.from() function: https://github.com/nodejs/node/issues/38300

bytemain commented 3 years ago

One more point:

I wrote some wasm examples in Rust: https://github.com/lengthmin/node-hash-benchmark

They do not hit the OOM problem.

bytemain commented 3 years ago

And the error is also raised in Node 12, so maybe it is not a Buffer.from problem.

Daninet commented 3 years ago

There is an issue with your benchmark code. The md5() function returns a promise and you don't have an await there. If you want to make sync benchmarks then you can use the createMD5() function to get a sync interface. With node v12.22.1 I don't see memory leaks when I use await md5() or createMD5()
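A minimal sketch of what the missing await does, under stated assumptions: it does not use hash-wasm at all. fakeMd5 is a hypothetical stand-in built on Node's crypto module that resolves asynchronously, and run() and its counters are names made up for the illustration. It only counts how many hash promises are pending at the same time, with and without await.

```javascript
const crypto = require('crypto');

// Hypothetical stand-in for an async hash such as md5() from hash-wasm.
// It resolves on a later event loop turn, like a real async call would.
function fakeMd5(data) {
  return new Promise((resolve) => {
    setImmediate(() => {
      resolve(crypto.createHash('md5').update(data).digest('hex'));
    });
  });
}

// Calls fakeMd5 n times and returns the peak number of pending promises.
async function run(n, useAwait) {
  const data = 'x'.repeat(1000);
  let inFlight = 0;
  let peak = 0;
  const all = [];
  for (let i = 0; i < n; i++) {
    inFlight += 1;
    peak = Math.max(peak, inFlight);
    const p = fakeMd5(data).then((hex) => {
      inFlight -= 1;
      return hex;
    });
    all.push(p);
    if (useAwait) {
      await p; // wait for this hash before starting the next one
    }
  }
  await Promise.all(all); // let everything settle before reporting
  return peak;
}

const results = (async () => ({
  withoutAwait: await run(1000, false),
  withAwait: await run(1000, true),
}))();

results.then((r) => console.log(r));
```

Without await, every call is started before any of them can finish, so the number of pending promises grows with the loop count. With await, at most one is pending at a time.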

bytemain commented 3 years ago

How do I use createMD5() as a sync operation? Can you give me an example?

Still OOM:

libs.forEach(([name, lib]) => {
  sizes.forEach(([sizeName, content]) => {
    suite.add(`${name}#${sizeName}`, () => {
      createMD5().then((v) => {
        v.init();
        v.update(contents);
        return v.digest();
      });
    });
  });
});

Daninet commented 3 years ago

createMD5() is doing async compilation and initialization of the WASM binary and it should go outside of the loop.

// before benchmark
const hasher = await createMD5();

// in benchmark loop
hasher.init();
hasher.update(data);
hasher.digest();

bytemain commented 3 years ago

Well, that works.

createMD5() returns a promise...

// run with: node --expose-gc (the global gc() is not defined otherwise)
const { createMD5 } = require('hash-wasm');
console.log('start', process.memoryUsage());
for (let i = 0; i < 1e6; i++) {
  new Promise(async (resolve) => {
    const hasher = await createMD5();
    resolve(hasher);
  });
  if (i % 1e3 === 0) {
    gc();
    console.log('step', i, 'rss', process.memoryUsage().rss);
  }
}
gc();
console.log('end', process.memoryUsage());

start {
  rss: 20869120,
  heapTotal: 5132288,
  heapUsed: 2946944,
  external: 1005669,
  arrayBuffers: 232088
}
step 0 rss 22200320
step 1000 rss 32305152
step 2000 rss 42303488
step 3000 rss 52396032
step 4000 rss 53264384
step 5000 rss 59506688
step 6000 rss 64008192
step 7000 rss 70402048
step 8000 rss 77012992
step 9000 rss 84070400
step 10000 rss 90566656
step 11000 rss 97431552
step 12000 rss 104558592
step 13000 rss 111386624
step 14000 rss 118321152
step 15000 rss 125153280
step 16000 rss 131887104
step 17000 rss 139706368
step 18000 rss 146907136
step 19000 rss 154718208
step 20000 rss 161570816
step 21000 rss 168488960
step 22000 rss 175427584
step 23000 rss 182239232
step 24000 rss 189063168

So is this a memory leak?

Maybe it is related to the promises themselves.

Daninet commented 3 years ago

Creating thousands of new promises synchronously without giving them a chance to resolve is not something you would normally do. I don't understand why you put the createMD5() call inside the loop.
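A small sketch of why a synchronous loop gives promises no chance to resolve, using only plain promises and no hash-wasm. The variable names are made up for the illustration. Promise callbacks run only after the currently running synchronous code returns to the event loop, so nothing settles while the loop is still going.

```javascript
let settled = 0;
const pending = [];

// Synchronous loop: each iteration creates a promise with a callback,
// but no callback can run until this loop has finished.
for (let i = 0; i < 1000; i++) {
  pending.push(
    Promise.resolve(i).then(() => {
      settled += 1;
    })
  );
}

// Still inside the same synchronous run, so no callback has executed yet.
const settledDuringLoop = settled;

// Once the synchronous code is done, the queued callbacks get to run.
const done = Promise.all(pending).then(() => settled);

done.then((settledAfterLoop) => {
  console.log({ settledDuringLoop, settledAfterLoop });
});
```

In the 1e6-iteration loop above, this would mean every promise created in the loop, and whatever it references, stays alive until the loop ends, which is one plausible reason the rss keeps growing even with gc() calls.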

const { createMD5 } = require('hash-wasm');
const str = 'x'.repeat(1000);

console.log('start', process.memoryUsage());

createMD5().then(hasher => {
  for (let i = 0; i < 1e6; i++) {
    hasher.init();
    hasher.update(str);
    hasher.digest();
    if (i % 1e5 === 0) {
      gc();
      console.log('step', i, 'rss', process.memoryUsage().rss);
    }
  }
  gc();
  console.log('end', process.memoryUsage());
});

bytemain commented 3 years ago

Actually, I know I should not put it in the loop; I just wanted to know whether it is a promise-related problem. Now I understand.

Thanks for your help!