facebook / memlab

A framework for finding JavaScript memory leaks and analyzing heap snapshots
https://facebook.github.io/memlab/

"Heap out of memory" when using analyze plugins #125

Closed aelij closed 3 weeks ago

aelij commented 3 weeks ago

When running analyze on a snapshot with ~180 million nodes, I'm getting a "JavaScript heap out of memory" error (full output below). I've tried increasing the heap size (--max-old-space-size=8192), but it didn't help.

I think it's caused by a limit on the number of properties a JS object can hold. This commit https://github.com/aelij/memlab/commit/2049c71319152fa94df1227b977392d621b00a4c seems to fix the issue by changing two objects to arrays, but it causes a lot of tests to fail, and without delving deeply into memlab's code it's hard for me to understand why.

Can you please try to integrate this fix?

Thanks!

```
parsing Heap.20240812.145432.32252.0.001.heapsnapshot ...
calculating basic meta info...
building reference index...
building referrers index...
propagating detachedness state...
building node index...

<--- Last few GCs --->

[8728:0000021125A14EA0] 20800 ms: Scavenge 654.3 (1704.0) -> 471.8 (1521.5) MB, 25.6 / 0.0 ms (average mu = 0.982, current mu = 0.983) allocation failure;

<--- JS stacktrace --->

FATAL ERROR: invalid table size Allocation failed - JavaScript heap out of memory
 1: 00007FF65C4E5F6F node_api_throw_syntax_error+174175
 2: 00007FF65C469356 v8::internal::wasm::WasmCode::safepoint_table_offset+59926
 3: 00007FF65C46AFF0 v8::internal::wasm::WasmCode::safepoint_table_offset+67248
 4: 00007FF65CF16754 v8::Isolate::ReportExternalAllocationLimitReached+116
 5: 00007FF65CF01AE2 v8::Isolate::Exit+674
 6: 00007FF65CD837AC v8::internal::EmbedderStackStateScope::ExplicitScopeForTesting+124
 7: 00007FF65C7BF979 v8::internal::Isolate::FatalProcessOutOfHeapMemory+25
 8: 00007FF65CB7FB5B v8::internal::HashTable::New+187
 9: 00007FF65CC65C63 v8::debug::Script::GetIsolate+62179
10: 00007FF65CC64A0D v8::debug::Script::GetIsolate+57485
11: 00007FF65CBE5447 v8::internal::JSObject::NormalizeElements+247
12: 00007FF65CC341C9 v8::internal::FeedbackNexus::ic_state+34777
13: 00007FF65CC2E122 v8::internal::FeedbackNexus::ic_state+10034
14: 00007FF65CBD96F9 v8::internal::JSObject::AddDataElement+1113
15: 00007FF65CB85923 v8::internal::StringSet::Add+1491
16: 00007FF65CB9C1F4 v8::internal::Object::SetProperty+148
17: 00007FF65CA92013 v8::internal::Runtime::SetObjectProperty+243
18: 00007FF65CA90F70 v8::internal::Runtime::GetObjectProperty+15024
19: 00007FF65CFB4651 v8::internal::SetupIsolateDelegate::SetupHeap+558257
20: 00007FF5DD1CBDB5
```
JacksonGL commented 3 weeks ago

@aelij Thanks for reporting this issue. The change likely failed tests because node IDs are integers bigger than or equal to the size of the Uint32Array. Let me create a commit to address this.

In the meantime, to unblock yourself quickly, consider replacing NumericDictionary with a class that shards the Map (e.g., a two-level indirect map should be able to handle 180 million key-value mappings).
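
A minimal sketch of the kind of two-level sharded map described above, assuming plain Maps for both levels; the class name, shard size, and value type are illustrative and this is not memlab's actual NumericDictionary:

```ts
// Illustrative two-level map: keys are routed to smaller inner Maps so that no
// single Map has to hold the full key space.
const SHARD_SIZE = 10_000_000; // max keys per inner Map (illustrative value)

class ShardedNumericMap {
  private shards = new Map<number, Map<number, number>>();

  set(key: number, value: number): void {
    const shardId = Math.floor(key / SHARD_SIZE);
    let shard = this.shards.get(shardId);
    if (shard === undefined) {
      shard = new Map<number, number>();
      this.shards.set(shardId, shard);
    }
    shard.set(key % SHARD_SIZE, value);
  }

  get(key: number): number | undefined {
    return this.shards.get(Math.floor(key / SHARD_SIZE))?.get(key % SHARD_SIZE);
  }

  has(key: number): boolean {
    return this.shards.get(Math.floor(key / SHARD_SIZE))?.has(key % SHARD_SIZE) ?? false;
  }
}
```

With densely packed keys, 180 million entries at this shard size would spread across only a couple dozen inner Maps, each staying well below the point where a single Map fails.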

JacksonGL commented 3 weeks ago

@aelij Please let me know if the commit has resolved the "invalid table size / Allocation failed" issue (I tried, but didn't manage to capture a heap snapshot big enough to reproduce the issue).

aelij commented 3 weeks ago

I'm getting an error:

```
RangeError: Map maximum size exceeded
    at Map.set (<anonymous>)
    at NumericDictionary.set (/workspaces/memlab/packages/core/dist/lib/heap-data/utils/NumericDictionary.js:61:75)
    at HeapSnapshot._buildNodeIdx (/workspaces/memlab/packages/core/dist/lib/heap-data/HeapSnapshot.js:250:34)
    at HeapSnapshot._buildMetaData (/workspaces/memlab/packages/core/dist/lib/heap-data/HeapSnapshot.js:205:14)
    at new HeapSnapshot (/workspaces/memlab/packages/core/dist/lib/heap-data/HeapSnapshot.js:83:14)
    at Object.<anonymous> (/workspaces/memlab/packages/core/dist/lib/HeapParser.js:120:21)
    at Generator.next (<anonymous>)
    at fulfilled (/workspaces/memlab/packages/core/dist/lib/HeapParser.js:15:58)
```
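
For context (an observation about V8, not something stated in the thread): in current V8 builds a single Map tops out at roughly 2^24 (about 16.7 million) entries, which is the limit this RangeError reports, so a second-level map allowed to grow to 50 million entries can still overflow. A small script to observe the cap; it allocates on the order of a gigabyte before throwing:

```ts
// Fill a Map until V8 refuses to grow it further.
const m = new Map<number, number>();
try {
  for (let i = 0; ; i++) {
    m.set(i, i);
  }
} catch (err) {
  // Throws "RangeError: Map maximum size exceeded" once the cap is hit.
  console.log(`cap reached at ${m.size} entries:`, err);
}
```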

If you want to generate a test snapshot, try this:

```
import { writeHeapSnapshot } from 'v8'

const items = Array.from({ length: 100000 }, (_, i) => ({ s: 'a'.repeat(i), i }));
writeHeapSnapshot();
items.push({ s: 'a'.repeat(10000), i: 10000 });
```

Thanks @JacksonGL

aelij commented 3 weeks ago

BTW wouldn't a BigUint64Array be enough for _nodeId2NodeIdx?

JacksonGL commented 3 weeks ago

@aelij Thanks. That's interesting; do you mind sharing which OS and Node.js version you are using? The current fix uses a two-level indirect map, with a maximum of 50 million nodes in the second-level map. Could you try reducing that number here, recompiling with npm run build, and then testing with node ./packages/memlab/bin/memlab analyze ...? https://github.com/facebook/memlab/blob/main/packages/core/src/lib/heap-data/utils/NumericDictionary.ts#L17

Node.js does not support pushing that many elements into a single array, which is the same reason the memlab heap parser failed here. So I am still not able to get a heap snapshot with 180 million nodes.

Previously, when I tried to capture a big enough heap snapshot, I attempted to create one with 180 million nodes using this script, but it crashed:

```
const arr = [];
for (let i = 0; i < 200; ++i) {
    arr[i] = [];
    for (let j = 0; j < 1000000; j++) {
        arr[i][j] = {i: i, j: j};
    }
}
```

> BTW wouldn't a BigUint64Array be enough for _nodeId2NodeIdx?

BigUint64Array, like other typed arrays, has a fixed length. The index into _nodeId2NodeIdx is the node ID, which is determined by V8 and can have any value bigger than the length of _nodeId2NodeIdx.
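
As a small illustration of that point (the array length and IDs below are made up): a typed array cannot grow after construction, and writes past its end are silently dropped, so indexing directly by node ID would require knowing the largest possible ID up front.

```ts
// Typed arrays have a fixed length; out-of-range writes are silently ignored.
const nodeId2NodeIdx = new BigUint64Array(1_000_000); // illustrative size

nodeId2NodeIdx[123] = 42n;         // in range: stored
nodeId2NodeIdx[987_654_321] = 7n;  // out of range: silently dropped

console.log(nodeId2NodeIdx[123]);         // 42n
console.log(nodeId2NodeIdx[987_654_321]); // undefined
```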

aelij commented 3 weeks ago

Changing the second-level map limit to 10 million worked.

I'm running this on GitHub Codespaces from this repo. You can create one yourself by clicking the green Code button on the repo's main page.

```
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.6 LTS
Release:        20.04
Codename:       focal
$ node -v
v20.15.1
```

aelij commented 3 weeks ago

Perhaps the threshold should be configurable from the CLI?