nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
107.67k stars 29.63k forks source link

Flaky `test-stream-readable-unpipe-resume` #54133

Open targos opened 3 months ago

targos commented 3 months ago

Test

test-stream-readable-unpipe-resume

Platform

Linux x64

Console output

14:09:40 not ok 3695 parallel/test-stream-readable-unpipe-resume
14:09:40   ---
14:09:40   duration_ms: 120099.24600
14:09:40   severity: fail
14:09:40   exitcode: -15
14:09:40   stack: |-
14:09:40     timeout
14:09:40   ...

Build links

Additional information

No response

targos commented 3 months ago

Stress test: https://ci.nodejs.org/view/Stress/job/node-stress-single-test/520/

lpinca commented 1 month ago

I can reproduce on Ubuntu.

diff --git a/test/parallel/test-stream-readable-unpipe-resume.js b/test/parallel/test-stream-readable-unpipe-resume.js
index b40f724bcc..3eeae52f5f 100644
--- a/test/parallel/test-stream-readable-unpipe-resume.js
+++ b/test/parallel/test-stream-readable-unpipe-resume.js
@@ -13,7 +13,9 @@ const transformStream = new stream.Transform({
   })
 });

-readStream.on('end', common.mustCall());
+readStream.on('end', common.mustCall(function () {
+  console.log('end');
+}));

 readStream
   .pipe(transformStream)
$ ./tools/test.py --repeat=10000 test/parallel/test-stream-readable-unpipe-resume
=== release test-stream-readable-unpipe-resume ===
Path: parallel/test-stream-readable-unpipe-resume
end
Command: out/Release/node /home/luigi/node/test/parallel/test-stream-readable-unpipe-resume.js
--- TIMEOUT ---

[05:21|% 100|+ 9999|-   1]: Done

Failed tests:
out/Release/node /home/luigi/node/test/parallel/test-stream-readable-unpipe-resume.js

This is yet another case (see https://github.com/nodejs/node/issues/52550#issuecomment-2343204697) where the test correctly finishes (note "end" in the output) but the process does not exit.

targos commented 1 month ago

I wish we were able to reproduce these more reliably, because I don't know how to debug something that only happens once or twice every 10k runs!

lpinca commented 1 month ago

It might be only a coincidence or a reduced failure rate, but even in this case I get no failures with the --jitless flag.

RedYetiDev commented 3 weeks ago

Hypothetically, as a short-term solution, we could call process.exit() from these tests, but that acts like a bandaid, and not a patch.

- readStream.on('end', common.mustCall());
+ readStream.on('end', common.mustCall(() => process.exit()));
lpinca commented 3 weeks ago

@RedYetiDev it does not help, see https://github.com/nodejs/node/issues/52964#issuecomment-2106317993.

RedYetiDev commented 3 weeks ago

Insterestingly, it seems to not fail if you change the line to:

readStream.on('end', common.mustCall(() => queueMicrotask(process.exit)));

I ran 10,000 runs with this change: 0 failures I ran 2,000 runs without this change: 4 failures

Although this could be a coincidence?

lpinca commented 3 weeks ago

Although this could be a coincidence?

Yes, I think so. For example, this

diff --git a/test/parallel/test-net-write-fully-async-hex-string.js b/test/parallel/test-net-write-fully-async-hex-string.js
index 37b5cd75c1..b37f2acefc 100644
--- a/test/parallel/test-net-write-fully-async-hex-string.js
+++ b/test/parallel/test-net-write-fully-async-hex-string.js
@@ -30,3 +30,7 @@ const server = net.createServer(common.mustCall(function(conn) {
     writeLoop();
   }));
 }));
+
+server.on('close', function () {
+  queueMicrotask(process.exit);
+});

makes no difference.

RedYetiDev commented 3 weeks ago

This whole issue is strange. I wonder what task is hanging the process :thinking: