Open mhdawson opened 5 months ago
Hmm, maybe we can add an unref'ed timer to the test that logs the active handles after about a minute, to figure out what's keeping the process alive.
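A minimal sketch of what such a timer could look like. This is not code from the test itself: the helper names `dumpActiveResources` and `scheduleDump` and the 60 second delay are assumptions for illustration. It relies on `process.getActiveResourcesInfo()`, which is still marked experimental, and on `timer.unref()` so that the timer itself does not keep the process alive. Note that the timer can only fire while the event loop is still running; if the process is stuck somewhere after the loop has finished, nothing will be logged.

```javascript
'use strict';

// Hypothetical debugging aid, not part of the actual test.
// Delay before dumping; "around 1 minute" as suggested above.
const DUMP_DELAY_MS = 60 * 1000;

// Logs the types of resources currently keeping the event loop alive
// and returns them so callers can inspect the list.
function dumpActiveResources(log = console.error) {
  const resources = process.getActiveResourcesInfo();
  log(`Active resources: ${JSON.stringify(resources)}`);
  return resources;
}

// Schedules the dump on an unref'ed timer, so the timer alone
// never prevents the process from exiting.
function scheduleDump(delayMs = DUMP_DELAY_MS) {
  const timer = setTimeout(dumpActiveResources, delayMs);
  timer.unref();
  return timer;
}
```

Calling `scheduleDump()` near the top of the test would be enough; when the test exits normally, the timer never fires.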
I'm unable to reproduce locally on macOS:
$ ./tools/test.py --repeat 9999 -t 2 test/parallel/test-http2-large-write-multiple-requests.js
[05:07|% 100|+ 9999|- 0]: Done
All tests passed.
I can reproduce on Ubuntu:
$ ./tools/test.py --repeat=10000 test/parallel/test-http2-large-write-multiple-requests.js
=== release test-http2-large-write-multiple-requests ===
Path: parallel/test-http2-large-write-multiple-requests
...
Command: out/Release/node /home/luigi/node/test/parallel/test-http2-large-write-multiple-requests.js
--- TIMEOUT ---
...
[05:05|% 99|+ 9996|- 3]: Done
Failed tests:
out/Release/node /home/luigi/node/test/parallel/test-http2-large-write-multiple-requests.js
out/Release/node /home/luigi/node/test/parallel/test-http2-large-write-multiple-requests.js
out/Release/node /home/luigi/node/test/parallel/test-http2-large-write-multiple-requests.js
Similarly to parallel/test-net-write-fully-async-hex-string, the test correctly finishes but the process does not exit. The following patch
diff --git a/test/parallel/test-http2-large-write-multiple-requests.js b/test/parallel/test-http2-large-write-multiple-requests.js
index bcbb1434cb..b5a698e47f 100644
--- a/test/parallel/test-http2-large-write-multiple-requests.js
+++ b/test/parallel/test-http2-large-write-multiple-requests.js
@@ -1,4 +1,5 @@
'use strict';
+// Flags: --jitless
const common = require('../common');
if (!common.hasCrypto)
common.skip('missing crypto');
fixes the issue for both tests on my machine.
I'm able to relatively quickly reproduce on https://ci.nodejs.org/computer/test%2Dibm%2Drhel8%2Dx64%2D3/
After attaching gdb to the hanging process, I get this:
Attaching to program: /home/iojs/build/workspace/node-test-commit-linux/node, process 3703811
[New LWP 3703812]
[New LWP 3703813]
[New LWP 3703814]
[New LWP 3703815]
[New LWP 3703816]
[New LWP 3703817]
[New LWP 3703825]
[New LWP 3703826]
[New LWP 3703827]
[New LWP 3703828]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f1be59dc48c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-251.el8_10.2.x86_64 libgcc-8.5.0-22.el8_10.x86_64 libstdc++-8.5.0-22.el8_10.x86_64
(gdb) bt
#0 0x00007f1be59dc48c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000001fea079 in uv_cond_wait (cond=<optimized out>, mutex=<optimized out>) at ../deps/uv/src/unix/thread.c:814
#2 0x000000000105d48b in node::NodePlatform::DrainTasks(v8::Isolate*) ()
#3 0x0000000000eb0dc2 in node::SpinEventLoopInternal(node::Environment*) ()
#4 0x000000000101f0b2 in node::NodeMainInstance::Run() ()
#5 0x0000000000f71782 in node::Start(int, char**) ()
#6 0x00007f1be56327e5 in __libc_start_main () from /lib64/libc.so.6
#7 0x0000000000eade0e in _start ()
I'm starting to believe that all these random timeouts are related to these: https://github.com/nodejs/node/pull/47450 https://github.com/nodejs/node/issues/47297 https://github.com/nodejs/node/pull/47452 https://github.com/nodejs/node/pull/47461
In other words, a revert of e600de93cf443f057bd6d1135d1768ba5a39d110 might make the issues much easier to reproduce on fast CPUs
FWIW, I get no failures (for this test) with the patch from https://github.com/nodejs/node/pull/47452 applied on main.
I confirm. Let's reopen that pull request.
There are a bunch of other tests that fail with that patch, though.
Test: parallel/test-http2-large-write-multiple-requests
Platform: Linux x64