Open davepagurek opened 7 months ago
Hi @davepagurek thank you for reporting this and for your patience! I am working adding some documentation to this repo and archiving it. We have moved current ml5.js development to a new repository:
https://github.com/ml5js/ml5-next-gen
The latest examples are here:
https://editor.p5js.org/ml5/collections/pUzWMkdmE
We are planning to launch a new version of the library + website in June.
The new BodyPose
functionality of ml5.js uses updated models behind the scenes (MoveNet
and BlazePose
). @ziyuan-linn have you noticed any signs of a memory leak like this in our newer examples?
Just skimming the code, it looks like the looping pattern here https://github.com/ml5js/ml5-next-gen/blob/main/src%2FBodyPose%2Findex.js#L312 would avoid the issue since it doesn't return the result of the promise, but I can try profiling the examples when I get back to my computer to be sure.
I was testing out this one https://editor.p5js.org/ml5/sketches/OukJYAJAb and taking heap snapshots every few minutes, and the snapshots seem to increase in size over time:
So that might mean there's a leak somewhere still? Here's the first and last snapshot for comparison: BEFORE-Heap-20240504T094341.heapsnapshot.zip AFTER-Heap-20240504T094350.heapsnapshot.zip
I see something kind of similar going on in the added Object
s. They don't appear as promises this time, but the last one in the chain does say "pending."
Thanks @davepagurek I moved this issue into the current repo for further discussion and tracking!
@davepagurek Thank you for reporting this issue!
I did some testing and found memory leaks in a few models. Here are the heap snapshots:
bodyPose
, with MoveNet
as the underlying model:
bodyPose
, with BlazePose
as the underlying model on mediapipe
runtime:
faceMesh
, on tfjs
runtime:
faceMesh
, on mediapipe
runtime:
This problem only seems to occur when using the mediapipe
runtime. I think the problem could be due to MediaPipe using WASM and C++(no automatic garbage collection). Even this tfjs demo of face-landmark-detection on MediaPipe runtime have a memory leak. I could not find much information about this, but here is a potentially related issue with a solution.
For now, switching to tfjs
runtime or MoveNet
on the new version of ml5 should fix the memory leak:
bodyPose = ml5.bodyPose("BlazePose", { runtime: "tfjs" });
or
bodyPose = ml5.bodyPose("MoveNet");
Hi everyone! I realize there's a new version of the library being developed so this bug doesn't actually have to be addressed, but I thought maybe the learnings of the cause might be good to know for future development.
β Step 1: Describe the issue π
So I was running PoseNet in a sketch being used in an art installation, intended to be running for days at a time. After a few hours on, it would mostly still run fast, but would drop a bunch of frames once in a while (at an accelerating pace, after being left on for e.g. 12 hours.) This shows up in a performance profile as a 1.2s(!!) garbage collection:
I compared some heap profiles over time to see what got added, and noticed that some of the added memory includes a promise, referenced by a promise, referenced by a promise, referenced by a promise...:
β Step 2: Screenshots or Relevant Documentation πΌ
I narrowed this down to the usage of promises in
multiPose
: https://github.com/ml5js/ml5-library/blob/f80f95aa6b31191ab7e79ff466e8506f5e48a172/src/PoseNet/index.js#L180-L182β Step 3: Share an example of the issue π¦
So, this code has the leak, using the automatic detection on the next frame, going through the promises above:
Meanwhile, this code does not seem to leak, by not initializing PoseNet with a video, and instead manually calling
multiPose
:So it seems like the problem was in fact the buildup of contexts in the promises. It tries to return the promise from
multiPose
, effectively doing an asynchronous infinite loop on itself, and I guess the browser keeps around the whole chain of promises due to the fact that it's beingreturn
ed instead of just awaited.Anyway, I just wanted to flag this in case the same pattern is being used elsewhere!
β Describe your setup π¦