Open anonrig opened 1 week ago
I have raised this one in openjs flaky ci slack channel and @Qard has responded positively ๐ I think he would take a look soon.
Yeah, it appears to be a segfault (access violation) that appears to only happen on windows as far as I can determine. Will need a windows machine to investigate further. I was planning to dig in more next week on this one as I need to break out the Windows machine and update it ;-)
FWIW, this appears to be a test-runner issue, possibly with the interaction with the subtests? It seems to always happen with the misread that... actually, I definitely think it's an issue with the test runner and the way this test is structured. Going to try an experiment that splits these out into individual testslocal-storage-nested
subtest but that test does not exercise the processTicksAndRejections
path where the access violation appears to be happening.
FWIW, this appears to be a test-runner issue
Are you referring to the Python runner or the JS one? I'm assuming you mean Python in this case, but just wanted to verify.
Not sure yet. I think maybe the python runner but can't be certain. I'm running an experiment in #54818 that removes both the node:test runner and the python test runner from the mix to see if we still see flakiness on these. If that looks good, I want to next try with the python runner but no node-test
runner, following the pattern of the original test in spawning a bunch of these processes in parallel to see if we still have flakiness. Unfortunately I was not able to reproduce the failure locally at all on a quick test on my old windows machine so I'm going a bit by trial-an-error using CI.
16:17:18 test at test\parallel\test-async-context-frame.mjs:48:5
16:17:18 รขลโ async-hooks\test-async-local-storage-socket.js (5107.2542ms)
16:17:18 AssertionError [ERR_ASSERTION]: Test async-hooks\test-async-local-storage-socket.js failed with exit code 3221225477
16:17:18 at TestContext.<anonymous> (file:///C:/workspace/node-test-binary-windows-js-suites/node/test/parallel/test-async-context-frame.mjs:58:7)
16:17:18 at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
16:17:18 at async Test.run (node:internal/test_runner/test:888:9)
16:17:18 at async Promise.all (index 14)
16:17:18 at async Suite.run (node:internal/test_runner/test:1268:7)
16:17:18 at async startSubtestAfterBootstrap (node:internal/test_runner/harness:280:3) {
16:17:18 generatedMessage: false,
16:17:18 code: 'ERR_ASSERTION',
16:17:18 actual: 3221225477,
16:17:18 expected: 0,
16:17:18 operator: 'strictEqual'
16:17:18 }
The line throwing that is:
strictEqual(code, 0, `Test ${test} failed with exit code ${code}`);
The code
is an exit code of a child process running the Python runner. I'd be surprised if that was related to node:test.
I would be too! But just going through due diligence! It's also possible this isn't a problem with the tests at all and something wrong with the windows host running the tests! Just need to go through the steps to eliminate possible causes and narrow it down. node:test
is my least likely candidate tho. Yeah, sorry I wasn't clear, when I said test runner I meant the python test runner along with possibly the spawning of multiple instances running in parallel.
I've been exploring all the results of the reported failures and they seem to fail and many different points in the run. It has a slight bias toward some particular tests but that could be a timing thing and not actually indicative of any failure of the code itself. I think I agree with @jasnell here that something seems off with running multiple python test runners in parallel. Might be worth just disabling (or reducing?) the parallelism and see how that goes?
Test
test-async-context-frame
Platform
Windows
Console output
Build links
Additional information
No response