facebook / memlab

A framework for finding JavaScript memory leaks and analyzing heap snapshots
https://facebook.github.io/memlab/
MIT License

Using a basic `retainerReferenceFilter` causes Node OOM exception #117

Closed singh-sp closed 6 months ago

singh-sp commented 7 months ago

I followed the example from here: https://facebook.github.io/memlab/docs/api/interfaces/core_src.ILeakFilter/#-optional-retainerreferencefilter-referencefiltercallback, but I am getting a Node out-of-memory exception.

I tried Node 16, 18, and 21 with 8 GB of memory. The snapshots are approximately 35 MB each, and the problem can be reproduced using the MemLab examples.


It seems to happen during or after this function call: https://github.com/facebook/memlab/blob/3b05576df932b30b491106ffe522bf82d6f62a8f/packages/core/src/lib/Utils.ts#L1818

JacksonGL commented 7 months ago

@singh-sp This is probably because your retainerReferenceFilter callback returned true for edges that shouldn't be used. For example, self-referencing edges can cause an infinite loop when traversing from a node to the GC root. MemLab's default edge filter excludes these kinds of edges, but MemLab's path finder didn't consider the case where an external leak filter bypasses MemLab's internal edge filter.

I will patch MemLab so that its path finder handles external leak filters better. In the meantime, you can use the following code to unblock yourself:

  retainerReferenceFilter(edge, _snapshot, isReferenceUsedByDefault) {
    // MemLab by default removes self-referencing edges
    // and other V8 internal and hidden edges
    if (!isReferenceUsedByDefault) {
      return false;
    }
    // your logic to filter other edges goes here;
    // return true to keep the edge or false to exclude it
    return true; // placeholder: keeps every edge MemLab would use by default
  }

Let me know if this fixes the OOM issue.

singh-sp commented 7 months ago

Thank you! That worked! 🎉

In simple terms:

retainerReferenceFilter(edge, _snapshot, isReferenceUsedByDefault) {
  // default case
  return isReferenceUsedByDefault;
}

Quick question:

By providing a custom retainerReferenceFilter, are users overriding any default behaviors? (other than what you mentioned)

JacksonGL commented 7 months ago

@singh-sp retainerReferenceFilter is exposed as an API so that you can override the edge filter used by the retainer trace generator. Edges for which retainerReferenceFilter returns false won't be used when calculating the retainer trace.

If you just want the default edge filter, there is no need to provide the retainerReferenceFilter callback in the leak filter.
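As a rough sketch of a filter that keeps the default exclusions and then drops a few extra edges: the edge names in `IGNORED_EDGE_NAMES` are hypothetical, and reading the edge's name from `edge.name_or_index` is an assumption about the edge object that you should check against the MemLab API docs for your version.

```javascript
// Hypothetical edge names to exclude from retainer traces; replace with your own.
const IGNORED_EDGE_NAMES = new Set(['__debugCache', '_devtoolsHook']);

const leakFilter = {
  retainerReferenceFilter(edge, _snapshot, isReferenceUsedByDefault) {
    // Keep MemLab's built-in exclusions (self-referencing edges,
    // V8 internal and hidden edges).
    if (!isReferenceUsedByDefault) {
      return false;
    }
    // Assumption: the edge exposes its property name or index as `name_or_index`.
    return !IGNORED_EDGE_NAMES.has(String(edge.name_or_index));
  },
};

// In a leak filter file this object would typically be exported,
// e.g. with `module.exports = leakFilter;`.
```

Checking `isReferenceUsedByDefault` first means the custom logic can only remove edges, never re-enable ones that MemLab already excludes.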

JacksonGL commented 7 months ago

> By providing a custom retainerReferenceFilter, are users overriding any default behaviors? (other than what you mentioned)

It will also affect the retained size calculation and the dominator calculation: https://developer.chrome.com/docs/devtools/memory-problems/get-started

singh-sp commented 7 months ago

Thanks for the explanation! That's helpful. I plan to use this filter to exclude certain edges (false positive leaks).

I ran some tests, and it looks promising.

I couldn't find anything in the docs, but is there a utility function that can parse the ISerializedInfo[] object returned by the findLeaks() function? I plan to display detected leaks in a certain way, and it would be nice to access the leaked objects' names and other metadata. Currently, this object has dynamic keys, which makes it hard to access nested objects.

JacksonGL commented 7 months ago

> I plan to use this filter to exclude certain edges (false positive leaks).

If the edges your filter excludes include an edge that every path from the GC root to the leaked object must go through, then the object (i.e., the false positive) will be removed from the final result.

A bit more context: MemLab first uses the leakFilter callback to identify memory leak objects (if no leakFilter callback is provided then MemLab uses its default object filter); then MemLab uses retainerReferenceFilter to calculate the retainer trace (i.e., path to GC root) for those identified leaks.
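To illustrate the second step, here is a toy model of path finding with an edge filter. It is not MemLab's implementation or data model: the graph, the node names, and `findPathFromRoot` are all made up for this sketch.

```javascript
// Toy heap graph: each edge is {from, to, name}. Purely illustrative.
const edges = [
  {from: 'root', to: 'window', name: 'global'},
  {from: 'window', to: 'cache', name: 'cache'},
  {from: 'cache', to: 'leaked', name: 'entry'},
  {from: 'leaked', to: 'leaked', name: 'self'}, // self-referencing edge
];

// Breadth-first search backwards from `target` toward 'root',
// using only the edges accepted by `edgeFilter`.
function findPathFromRoot(target, edgeFilter) {
  const usable = edges.filter(edgeFilter);
  const visited = new Set([target]); // guards against cycles and self-references
  const queue = [[target]];
  while (queue.length > 0) {
    const path = queue.shift();
    const node = path[0];
    if (node === 'root') {
      return path;
    }
    for (const e of usable) {
      if (e.to === node && !visited.has(e.from)) {
        visited.add(e.from);
        queue.push([e.from, ...path]);
      }
    }
  }
  return null; // no retainer path: the object would not be reported
}

const withAllEdges = findPathFromRoot('leaked', () => true);
const withoutEntry = findPathFromRoot('leaked', e => e.name !== 'entry');
console.log(withAllEdges); // ['root', 'window', 'cache', 'leaked']
console.log(withoutEntry); // null
```

In this toy graph, excluding the `entry` edge leaves `leaked` with no path from the root, which mirrors how filtering out a required edge removes an object from the final result.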

> is there a utility function that can parse the returned object ISerializedInfo[] from the findLeaks() function?

Unfortunately, there is no utility function for parsing ISerializedInfo, which has the following format:

{
  "0: edge0Name object0Name": {object0},
  "1: edge1Name object1Name": {object1},
  "2: edge2Name object2Name": {object2},
  ...
}

If you would like to dive deeper, here is the code that generates ISerializedInfo: https://github.com/facebook/memlab/blob/0f11b29f81d558bb81e5050e68c4a475f6d0978c/packages/core/src/lib/Serializer.ts#L486
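Since there is no built-in utility, a small helper along these lines may be enough for display purposes. This is a sketch that assumes only the key shape shown above (an index, a colon, then a label); `listTraceSteps` is a hypothetical name and not part of MemLab, and the label is left unsplit because edge and object names may themselves contain spaces.

```javascript
// Turn one serialized trace (an object with "N: label" keys) into an
// ordered array of steps. Keys that do not match the pattern are skipped.
function listTraceSteps(serializedTrace) {
  return Object.entries(serializedTrace)
    .map(([key, value]) => {
      const match = /^(\d+):\s*(.*)$/.exec(key);
      if (match === null) {
        return null; // not a step key
      }
      return {index: Number(match[1]), label: match[2], value};
    })
    .filter(step => step !== null)
    .sort((a, b) => a.index - b.index);
}

// Usage with made-up data in the documented shape:
const exampleTrace = {
  '1: edge1Name object1Name': {size: 20},
  '0: edge0Name object0Name': {size: 10},
};
for (const step of listTraceSteps(exampleTrace)) {
  console.log(step.index, step.label);
}
```

Sorting by the numeric index avoids relying on object key order, and each step's `value` is passed through untouched so nested objects can be walked the same way if they use the same key shape.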

JacksonGL commented 6 months ago

Closing this issue as the patch has been released in memlab@1.1.47