nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
107.82k stars 29.71k forks source link

`yield*` does not free the latest reference #55585

Open oleasteo opened 2 weeks ago

oleasteo commented 2 weeks ago

Version

v23.1.0

Platform

Linux zerg 6.11.5-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 22 Oct 2024 18:31:38 +0000 x86_64 GNU/Linux

Subsystem

No response

What steps will reproduce the bug?

When using the yield* operator, it seems a reference to the latest yielded value is kept internally, preventing the garbage collector from freeing the memory.

Test example:

const refs = [];
const iterator = outerGen();
// const iterator = innerGen();

await iterator.next();
await iterator.next();
iterator.next();

await sleep(0);
global.gc();

console.assert(refs.length === 2, "should have two refs");
console.assert(refs[0].deref() === undefined, "first ref should be freed");
console.assert(refs[1].deref() === undefined, "second ref should be freed");

async function* outerGen() {
  yield* innerGen();
}

async function* innerGen() {
  let data;
  while (true) {
    await sleep(100);
    data = {}; // some dummy data
    refs.push(new WeakRef(data));
    yield data;
    data = null;
  }
}

function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

When using const iterator = outerGen(); as shown above, the "second ref should be freed" assertion fails. When using const iterator = innerGen(); directly, it passes.

How often does it reproduce? Is there a required condition?

Reliable. Use reproduction snippet above.

What is the expected behavior? Why is that the expected behavior?

I would expect yield* not to hold any internal reference to the values longer than necessary.

Interesting as well, but maybe unrelated: If I use const data in innerGen() as following, the data isn't freed either during the await sleep(100); of the following iteration.

async function* innerGen() {
  while (true) {
    await sleep(100);
    const data = {};
    refs.push(new WeakRef(data));
    yield data;
  }
}

What do you see instead?

yield* prevents the garbage collector from freeing the latest value until the next value is reached.

Additional information

No response

aduh95 commented 2 weeks ago

Did you try to reproduce with d8 to see if it’s a V8 bug or if it’s Node.js specific?

F3n67u commented 2 weeks ago

If we increase the sleep duration before global.gc(), then the "second ref should be freed" assertion passes. This suggests that yield* doesn’t prevent garbage collection of second ref data but simply delays it.

--- a/weakref-repro.mjs
+++ b/weakref-repro.mjs
@@ -6,10 +6,10 @@ await iterator.next();
 await iterator.next();
 iterator.next();

-await sleep(0);
+await sleep(1000);
 global.gc();

-console.assert(refs.length === 2, "should have two refs");
+// console.assert(refs.length === 2, "should have two refs");
 console.assert(refs[0].deref() === undefined, "first ref should be freed");
 console.assert(refs[1].deref() === undefined, "second ref should be freed");
oleasteo commented 2 weeks ago

@aduh95 I have not tested with d8 yet. Will try to do so later today.

oleasteo commented 2 weeks ago

@F3n67u not quite. The threshold is always the same duration as within the innerGen sleep(...) call. That's why I added that assertion refs.length === 2. The test calls iterator.next() three times to make sure the innerGen loop continues after the yield data of the second call and should theoretically free the data reference before getting to the third await sleep. If refs.length === 3 already, the third ref will not have been freed instead, shifting the issue.

oleasteo commented 2 weeks ago

I've created a test case for d8 and verified that it's not specific to node. Thus, reported in v8: https://issues.chromium.org/u/2/issues/376881544 (and https://issues.chromium.org/u/3/issues/378234614 for the mentioned similar loop-scope issue)

Feel free to close the issue or keep it for tracking; I don't know your process ^^