Closed: aelij closed this issue 3 weeks ago
@aelij Thanks for reporting this issue. The change likely failed tests because node IDs are integers greater than or equal to the size of the Uint32Array. Let me create a commit to address this.
In the meantime, to unblock yourself quickly, consider replacing NumericDictionary with a class that shards the underlying Map (e.g., a two-level indirect map should be able to handle 180 million key-value mappings).
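As a rough sketch of that workaround (the class name, method set, and shard size below are assumptions for illustration, not memlab's actual NumericDictionary interface), the two-level structure could look something like this:

// Hypothetical two-level (sharded) numeric map. Each second-level Map only
// covers a contiguous range of SHARD_SIZE keys, so no single Map has to hold
// more than SHARD_SIZE entries no matter how large the node ids get.
const SHARD_SIZE = 10_000_000; // illustrative value

class ShardedNumericMap {
  private shards = new Map<number, Map<number, number>>();

  set(key: number, value: number): void {
    const shardId = Math.floor(key / SHARD_SIZE);
    let shard = this.shards.get(shardId);
    if (shard == null) {
      shard = new Map<number, number>();
      this.shards.set(shardId, shard);
    }
    shard.set(key, value);
  }

  get(key: number): number | undefined {
    return this.shards.get(Math.floor(key / SHARD_SIZE))?.get(key);
  }

  has(key: number): boolean {
    return this.shards.get(Math.floor(key / SHARD_SIZE))?.has(key) ?? false;
  }
}

Since each shard is keyed by floor(nodeId / SHARD_SIZE), 180 million mappings spread across a few dozen small Maps instead of one huge one.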
@aelij Please let me know if the commit has resolved the "Invalid table size Allocation failed" issue (I tried, but didn't manage to capture a heap snapshot big enough to reproduce it).
I'm getting an error:
RangeError: Map maximum size exceeded
at Map.set (<anonymous>)
at NumericDictionary.set (/workspaces/memlab/packages/core/dist/lib/heap-data/utils/NumericDictionary.js:61:75)
at HeapSnapshot._buildNodeIdx (/workspaces/memlab/packages/core/dist/lib/heap-data/HeapSnapshot.js:250:34)
at HeapSnapshot._buildMetaData (/workspaces/memlab/packages/core/dist/lib/heap-data/HeapSnapshot.js:205:14)
at new HeapSnapshot (/workspaces/memlab/packages/core/dist/lib/heap-data/HeapSnapshot.js:83:14)
at Object.<anonymous> (/workspaces/memlab/packages/core/dist/lib/HeapParser.js:120:21)
at Generator.next (<anonymous>)
at fulfilled (/workspaces/memlab/packages/core/dist/lib/HeapParser.js:15:58)
If you want to generate a test snapshot, try this:
import { writeHeapSnapshot } from 'v8';
// Build enough large strings to inflate the snapshot.
const items = Array.from({ length: 100000 }, (_, i) => ({ s: 'a'.repeat(i), i }));
writeHeapSnapshot();
// Touch the array after the snapshot so it stays reachable and isn't GC'd first.
items.push({ s: 'a'.repeat(10000), i: 10000 });
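A small side note (an addition here, not part of the snippet above): writeHeapSnapshot also accepts an explicit output path and returns the file name it wrote, which makes the snapshot easier to find afterwards:

import { writeHeapSnapshot } from 'v8';
// Hypothetical output path; pick any location with enough free disk space.
const out = writeHeapSnapshot('./test.heapsnapshot');
console.log(`heap snapshot written to ${out}`);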
Thanks @JacksonGL
BTW wouldn't a BigUint64Array be enough for _nodeId2NodeIdx?
@aelij Thanks. That's interesting; do you mind sharing which OS and Node.js version you are using? The current fix uses a two-level indirect map, with a maximum of 50 million nodes in each second-level map. Could you try reducing that number here, recompile with npm run build, and then test with node ./packages/memlab/bin/memlab analyze ...?
https://github.com/facebook/memlab/blob/main/packages/core/src/lib/heap-data/utils/NumericDictionary.ts#L17
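For reference, the change being asked for is just lowering the per-shard capacity constant; the exact identifier at that line is an assumption here, but it would look roughly like:

// Hypothetical name; see line 17 of NumericDictionary.ts for the real constant.
const SHARD_SIZE = 10_000_000; // reduced from 50_000_000

A smaller shard keeps each second-level Map well below V8's per-Map entry limit (around 2^24, i.e. roughly 16.7 million entries, if memory serves), which would also explain why "Map maximum size exceeded" shows up with a 50 million shard.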
Node.js does not support pushing that many elements into a single array, which is the same reason the memlab heap parser failed here. So I am still not able to produce a heap snapshot with 180 million nodes.
Previously, when I said I failed to capture a big enough heap snapshot, I was trying to create one with 180 million nodes, but the process crashed with this script:
const arr = [];
for (let i = 0; i < 200; ++i) {
  arr[i] = [];
  for (let j = 0; j < 1000000; j++) {
    arr[i][j] = {i: i, j: j};
  }
}
BTW wouldn't a BigUint64Array be enough for _nodeId2NodeIdx?
BigUint64Array, like other typed arrays, has a fixed length. The index into _nodeId2NodeIdx is the node id, which is determined by V8 and can be any value greater than the length of _nodeId2NodeIdx.
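A tiny illustration of that point (hypothetical names and made-up numbers):

// Typed arrays have a fixed length and are indexed from 0, so to index by
// node id the array would need (maxNodeId + 1) slots, not nodeCount slots.
const nodeCount = 1_000;
const nodeId2NodeIdx = new BigUint64Array(nodeCount); // sized by node count
const someNodeId = 5_000_000_000; // ids assigned by V8 can far exceed nodeCount
nodeId2NodeIdx[someNodeId] = 42n; // out of bounds: the write is silently dropped
// new BigUint64Array(someNodeId + 1) would need ~40 GB, so sizing by the
// maximum node id is not practical either.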
Changing it to 10 million worked.
I'm running this on GitHub Codespaces from this repo. You can create one yourself by clicking the green Code button on the repo's main page.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
Codename: focal
$ node -v
v20.15.1
Perhaps the threshold should be configurable from the CLI?
When running analyze on a snapshot with ~180 million nodes, I'm getting a "JavaScript heap out of memory" error (full output below). I've tried increasing the heap size (--max-old-space-size=8192) but it didn't help. I think it's caused by a limit on JS object property count. This commit https://github.com/aelij/memlab/commit/2049c71319152fa94df1227b977392d621b00a4c seems to fix the issue by changing two objects to arrays, but it causes a lot of tests to fail, and without delving deeply into memlab's code it's hard for me to understand why.
Can you please try to integrate this fix?
Thanks!