nodejs / node

test-error-serdes times out #52630

Open aduh95 opened 7 months ago

aduh95 commented 7 months ago

Test

test-error-serdes

Platform

Linux x64

Console output

not ok 1683 parallel/test-error-serdes
  ---
  duration_ms: 120121.49300
  severity: fail
  exitcode: -15
  stack: |-
    timeout
  ...

Build links

Additional information

No response

targos commented 7 months ago

It's possible V8 12.3 introduced it.

targos commented 7 months ago

I think I'm able to reproduce on my mac, but I got one timeout out of 1000 runs, which makes it difficult to debug.
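
A rare timeout like this can be counted by running the test many times in child processes. The following is only a minimal sketch, not the tooling used in this thread: `countTimeouts` is a hypothetical helper, and the run count and the 120-second limit in the usage comment are assumptions taken from the numbers mentioned in this issue.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper (not part of the Node.js test tooling): runs a command
// `runs` times and counts how many of those runs were killed by the timeout.
function countTimeouts(command, args, runs, timeoutMs) {
  let timeouts = 0;
  for (let i = 0; i < runs; i++) {
    const result = spawnSync(command, args, { timeout: timeoutMs, stdio: 'ignore' });
    // spawnSync reports an expired timeout as result.error with code 'ETIMEDOUT'.
    if (result.error && result.error.code === 'ETIMEDOUT') timeouts++;
  }
  return timeouts;
}

// Example usage (assumed paths and limits):
// countTimeouts(process.execPath, ['test/parallel/test-error-serdes.js'], 1000, 120000);
```

Running the attempts sequentially keeps the machine load steady, which matters if the flake depends on timing.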

RedYetiDev commented 7 months ago

I've gotten the timeout on several CI runs for #52509, so it's not system specific.

mhdawson commented 7 months ago

Again - https://ci.nodejs.org/job/node-test-commit-arm/52055/

joyeecheung commented 7 months ago

Report from https://github.com/nodejs/reliability/issues/842

Reason parallel/test-error-serdes
Type JS_TEST_FAILURE
Failed PR 26 (https://github.com/nodejs/node/pull/52584/, https://github.com/nodejs/node/pull/52365/, https://github.com/nodejs/node/pull/52545/, https://github.com/nodejs/node/pull/52573/, https://github.com/nodejs/node/pull/49709/, https://github.com/nodejs/node/pull/52592/, https://github.com/nodejs/node/pull/52505/, https://github.com/nodejs/node/pull/51050/, https://github.com/nodejs/node/pull/52611/, https://github.com/nodejs/node/pull/52435/, https://github.com/nodejs/node/pull/51340/, https://github.com/nodejs/node/pull/51657/, https://github.com/nodejs/node/pull/52616/, https://github.com/nodejs/node/pull/52625/, https://github.com/nodejs/node/pull/52347/, https://github.com/nodejs/node/pull/52465/, https://github.com/nodejs/node/pull/52595/, https://github.com/nodejs/node/pull/52108/, https://github.com/nodejs/node/pull/52372/, https://github.com/nodejs/node/pull/52624/, https://github.com/nodejs/node/pull/52164/, https://github.com/nodejs/node/pull/52577/, https://github.com/nodejs/node/pull/52645/, https://github.com/nodejs/node/pull/52047/, https://github.com/nodejs/node/pull/52509/, https://github.com/nodejs/node/pull/52632/)
Appeared test-digitalocean-ubuntu2204_sharedlibs_container-x64-6, test-ibm-ubuntu2204_sharedlibs_container-x64-3, test-digitalocean-ubuntu2204_sharedlibs_container-x64-9, test-digitalocean-ubuntu2204_sharedlibs_container-x64-8, test-digitalocean-ubuntu2204_sharedlibs_container-x64-10, test-digitalocean-ubuntu2204_sharedlibs_container-x64-1, test-ibm-ubuntu2204_sharedlibs_container-x64-5, test-ibm-ubuntu2204_sharedlibs_container-x64-1, test-osuosl-ubuntu2004_container-armv7l-1, test-orka-macos11-x64-2, test-digitalocean-alpine319_container-x64-2, test-azure_msft-win11_vs2022-x64-4, test-digitalocean-alpine319_container-x64-1, test-ibm-ubi81_container-x64-1, test-equinix-debian11_container-armv7l-2, test-equinix-ubuntu2004_container-armv7l-1, test-digitalocean-ubuntu2204_sharedlibs_container-x64-4, test-digitalocean-rhel9-x64-1, test-digitalocean-alpine318_container-x64-1, test-digitalocean-ubuntu2204_sharedlibs_container-x64-3, test-ibm-rhel8-x64-1, test-digitalocean-ubuntu2204_sharedlibs_container-x64-5, test-ibm-ubuntu2204_sharedlibs_container-x64-2, test-ibm-ubuntu2204_sharedlibs_container-x64-4, test-digitalocean-ubuntu2204_sharedlibs_container-x64-7, test-equinix-rhel8_container-arm64-1, test-equinix-ubuntu2004_container-arm64-2, test-orka-macos11-x64-1, test-osuosl-ubuntu2004_sharedlibs_container-arm64-1
First CI https://ci.nodejs.org/job/node-test-pull-request/58526/
Last CI https://ci.nodejs.org/job/node-test-pull-request/58617/
Example
```
not ok 1397 parallel/test-error-serdes
  ---
  duration_ms: 120079.51800
  severity: fail
  exitcode: -15
  stack: |-
    timeout
  ...
```

joyeecheung commented 7 months ago

On container builds where this does pass, the test finishes in 3-5 seconds, so this is likely a bug with V8.

joyeecheung commented 7 months ago

Looking at other recent flakes that are at the top of the list, it seems the V8 update introduced quite a lot of timeouts. It could be some kind of deadlock.

joyeecheung commented 7 months ago

Another clue - I don't see this flaking in the V8 CI (Linux x64 only), so it should be our bug.

LiviaMedeiros commented 7 months ago

Not sure what the root cause is, but it could be that the update made exceeding the max stack size less stable.

https://github.com/nodejs/node/blob/708bffa9992d65fbe063c6a2f30892b8b5c0af57/test/parallel/test-error-serdes.js#L88-L93

When I run this part with Node.js v20.12.1, I observe a pretty much reproducible `depth === 1542`. When I run it with the current main build, the cycle ends with fluctuating numbers like 1717, 1649, etc.

Increasing the stack size leads to an exponential increase in time, e.g. `node --stack-size=4096 test/parallel/test-error-serdes.js` easily exceeds 120 seconds for me (`ulimit -s` is 8192).

Maybe under some "lucky" conditions this cycle can go abnormally deep?
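
To check how stable the stack limit is on a given build, it can help to sample the depth reached before overflow several times. This is a rough sketch under assumptions, not the code from the linked test lines: `measureMaxDepth` is a hypothetical helper that uses plain recursion, so its absolute numbers will differ from the `depth` value quoted above.

```javascript
// Hypothetical helper (not from test-error-serdes.js): counts how many nested
// calls succeed before V8 throws a RangeError for exceeding the stack size.
function measureMaxDepth() {
  let depth = 0;
  function recurse() {
    depth++;
    recurse();
  }
  try {
    recurse();
  } catch (err) {
    // Only swallow the expected stack overflow; rethrow anything else.
    if (!(err instanceof RangeError)) throw err;
  }
  return depth;
}

// Take a few samples; a large spread between them suggests an unstable limit.
const samples = [];
for (let i = 0; i < 5; i++) samples.push(measureMaxDepth());
console.log(samples);
```

Comparing the spread of the samples between two Node.js versions, or with different `--stack-size` values, may show whether the limit fluctuates more on one of them.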