nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
104.76k stars 28.3k forks source link

[Regression in 14.1.0 - Windows] `stdout` is sometimes empty #33166

Closed nicolo-ribaudo closed 2 years ago

nicolo-ribaudo commented 4 years ago

Bug description

After the Node.js 14.1.0 release, Babel's CI tests on Windows started failing. The failures are all related to @babel/cli and @babel/node: we run those CLI programs, capture their output (from stdout) and compare it with the expected output.

Sometimes, the generated stdout is empty: you can see an example here or here. In this CI log you can also see some error messages coming from V8 internals in @babel/node tests, but I don't know if it's the same problem.

How often does it reproduce? Is there a required condition?

My guess is that every test has about 1% chance of failing. However, we have ~100 tests for those 2 packages so something is failing more often than not.

What steps will reproduce the bug?

I couldn't find a small and isolated reproduction example. I'll keep trying to create one, but here is what I have for now. Also, building Babel on Windows is painful and I couldn't run the full test suite. However, I managed to reproduce the bug.

I'm running these commands using Powershell, Nodej.s 14.1.0 and Yarn 1.22

# Clone the babel repository
git clone https://github.com/babel/babel.git .

# Install deps
yarn
yarn lerna bootstrap # This takes a while

# Build Babel (this won't build everything, but enough packages to show the bug)
yarn gulp build-no-bundle

# Run the @babel/cli and @babel/node tests
yarn jest --runInBand babel-node babel-cli

If you don't see the last command failing, try running it 2 or three times. The --runInBand option isn't necessary to reproduce the problem, but it disables Jest's workers so it removes one possible cause.

What is the expected behavior?

Tests should pass

What do you see instead?

An example of output failing locally is this:

PS C:\Users\Nicolò\Downloads\babel-master\babel-master> yarn jest --runInBand babel-node babel-cli
yarn run v1.22.4
$ C:\Users\Nicolò\Downloads\babel-master\babel-master\node_modules\.bin\jest --runInBand babel-node babel-cli
 FAIL  packages/babel-cli/test/index.js (33.099s)
  ● bin/babel › --only glob

    "C:\Users\Nicolò\Downloads\babel-master\babel-master\packages\babel-cli\lib\babel" "--presets" "C:\Users\Nicolò\Downloads\babel-master\babel-master\packages\babel-preset-react" "--plugins" "C:\Users\Nicolò\Downloads\babel-master\babel-master\packages\babel-plugin-transform-arrow-functions,C:\Users\Nicolò\Downloads\babel-master\babel-master\packages\babel-plugin-transform-strict-mode,C:\Users\Nicolò\Downloads\babel-master\babel-master\packages\babel-plugin-transform-modules-commonjs" "src" "--out-dir" "lib" "--only" "**/*.foo.js" "--verbose": expect(received).toBe(expected) // Object.is equality

    Expected: "src/a.foo.js -> lib/a.foo.js
    src/baz/b.foo.js -> lib/baz/b.foo.js
    Successfully compiled 2 files with Babel."
    Received: ""

      85 |       expect(stdout).toContain(expectStdout);
      86 |     } else {
    > 87 |       expect(stdout).toBe(expectStdout);
         |                      ^
      88 |     }
      89 |   } else if (stdout) {
      90 |     throw new Error("stdout:\n" + stdout);

      at assertTest (packages/babel-cli/test/index.js:87:22)
      at ChildProcess.<anonymous> (packages/babel-cli/test/index.js:152:9)

 PASS  packages/babel-node/test/index.js (22.924s)

Test Suites: 1 failed, 1 passed, 2 total
Tests:       1 failed, 89 passed, 90 total
Snapshots:   0 total
Time:        56.088s, estimated 62s
Ran all test suites matching /babel-node|babel-cli/i.
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

Additional information

lpinca commented 4 years ago

cc: @nodejs/streams @ronag

mcollina commented 4 years ago

Can you confirm this happening only on Windows? This is widely unexpected.

ronag commented 4 years ago

I can't reproduce on OSX. @nicolo-ribaudo any chance you could bisect? I don't have a Windows machine at the moment.

ronag commented 4 years ago

I'm unsure where to go with the V8 errors. @addaleax maybe?

nicolo-ribaudo commented 4 years ago

Yes, I can only reproduce this on windows (not the V8 errors), only the empty stdouts. I'll try to bisect.

addaleax commented 4 years ago

@ronag I’m not sure… Stack traces on Windows are basically garbage unless you take some extra steps (that I don’t know how to take), and given that the error message isn’t particularly helpful here either – there are over 2000 instances of UNREACHABLE(); in V8 – I’m not quite sure where to start here.

/cc @nodejs/v8 @nodejs/platform-windows

nicolo-ribaudo commented 4 years ago

Update: this might not be a Node.js bug.

Today I made two local clones of the Babel repository: one ~6 hours ago and I can consistently reproduce the errors there, and one ~2 hours ago where I cannot reproduce the errors. This makes me think that maybe it's a bug in a dependency that didn't support Node.js 14.1.0 on windows, and it was recently updated to fix the bug.

Even if there was a bug in a dependency I don't think that it should trigger the V8 internal error, but I cannot reproduce that one anyway :shrug:

I'll try to re-run the build on Travis CI a bunch of times, and see if I can reproduce it there.

ronag commented 4 years ago

This makes me think that maybe it's a bug in a dependency that didn't support Node.js 14.1.0 on windows, and it was recently updated to fix the bug.

Can you diff the dependencies?

nicolo-ribaudo commented 4 years ago

Yeah I'm trying. I'm on a Windows VM and I haven't used Windows for a while, so it might take some time :stuck_out_tongue:

Also, another thing I noticed: in the folder where I can (consistently!) reproduce the problem, I can only reproduce it when using Node.js 14.1.0. If I use a build form master (4b2d95804b7c064ff94609f28f69098c8f6d4d31) it doesn't fail.

ronag commented 4 years ago

If I use a build form master (4b2d958) it doesn't fail.

Yea, a bisect would be very useful.

nicolo-ribaudo commented 4 years ago

I gave up for today, I will continue tomorrow. Also, I noticed that Jest's own tests have the same problem: https://github.com/facebook/jest/pull/9934/checks?check_run_id=633574558 (cc @SimenB)

dr-js commented 4 years ago

Not sure if it's related, I got win10 + nodejs@14 + babel on CI with flaky ELIFECYCLE/3221225477 exit code in one of my repo: https://github.com/nodejs/help/issues/2660

Update:

I've reduced CI exit test case to some of my source files and babel + 2 babel plugins The code is at: https://github.com/dr-js/dr-dev/tree/8e111ec5194adc3e159db2d3bc3594f8587dc230 The latest CI fail: https://github.com/dr-js/dr-dev/actions/runs/92784081

Update 2:

With more debug log found the exit happen before the @babel/env plugin gets load, after minify-replace and module-resolver loaded. Related CI run: https://github.com/dr-js/dr-dev/actions/runs/92845322

SimenB commented 4 years ago

Yeah, Jest is failing often on Node 14.1.0 on Windows with empty stdout in one test or the other almost every test run. The V8 thing is actually in the latest build on master: https://github.com/facebook/jest/runs/634830924

nicolo-ribaudo commented 4 years ago

Ok, have no idea where to start bisecting. I can still consistently reproduce the failures with the Node.js 14.1.0 installed with the installer, but not if I build https://github.com/nodejs/node/commit/9ffd8e7a7414052834bfedb0daba68d5c8363884.

mcollina commented 4 years ago

Ok, have no idea where to start bisecting. I can still consistently reproduce the failures with the Node.js 14.1.0 installed with the installer, but not if I build 9ffd8e7.

@nodejs/releasers can you check if there could be any differences between what is available via the installer and what was tagged?

targos commented 4 years ago

I think there's no way to be sure it's the promoted one, but I can confirm that the latest release build was done with commit https://github.com/nodejs/node/commit/9ffd8e7a7414052834bfedb0daba68d5c8363884 Refs: https://ci-release.nodejs.org/job/iojs+release/5995/ The times of the file on the server and in the build logs also match.

mcollina commented 4 years ago

@jasnell I know you have a beefy windows machine, could you check this out and run a bisect?

Yeah, Jest is failing often on Node 14.1.0 on Windows with empty stdout in one test or the other almost every test run. The V8 thing is actually in the latest build on master: https://github.com/facebook/jest/runs/634830924

@SimenB This is worrysome. Are you using any wasm inside Jest?

targos commented 4 years ago

I'm building the release commit on my machine (Win10, VS2019)

SimenB commented 4 years ago

@mcollina nope, only experimental stuff we ship is the VM ESM APIs, but that code is guarded behind a check that vm.SyntheticModule is defined, which is flagged.

EDIT: That said, some dependency might? Seems unlikely, though - node 14.0 behaved fine, and both mac and linux on 14.1 behaves fine

targos commented 4 years ago

Well, we'd need some more simple and reliable way to reproduce this. I cloned Jest and tried to run e2e\__tests__\toThrowErrorMatchingInlineSnapshot.test.ts multiple times with my local build and the release but it never errors.

mcollina commented 4 years ago

That said, some dependency might? Seems unlikely, though - node 14.0 behaved fine, and both mac and linux on 14.1 behaves fine

The V8 stacktrace you posted shows a failure within some code in the wasm namespace in V8.

Hopefully @targos can shed some light on this.

targos commented 4 years ago

Commits between 14.0.0 and 14.1.0: https://github.com/nodejs/node/compare/v14.0.0...v14.1.0 3 of them to deps/v8:

nicolo-ribaudo commented 4 years ago

I'm now trying to remove as much code as possible from my local failing clone of Babel to reduce the number of possible error causes.

mcollina commented 4 years ago

This seems to be not babel specific: https://github.com/pinojs/pino/runs/642073186. Again, this is consistently flaky :(.

@targos did you run the tests in Linux WSL or using the normal bin? I have a hunch that Github CI is using WSL.

targos commented 4 years ago

I used the normal bin in PowerShell

mcollina commented 4 years ago

Would you mind to try out if it works in WSL?

targos commented 4 years ago

@mcollina what exactly do you want me to try? BTW I cloned pino, npm i, npm t with Node.js 14.1.0 on Windows and all tests pass 😕

mcollina commented 4 years ago

Can you do the same inside WSL?

targos commented 4 years ago

No problem inside WSL either.

I think this may be only reproducible if stdout is not a terminal. Is there a simple way to test this on Windows?

SimenB commented 4 years ago

Note that which test fails on Jest's CI varies, and not all test runs even have a failure. So I think this might be super hard to nail down a consistent reproduction for 😟

mcollina commented 4 years ago

@nicolo-ribaudo given that you are the only one with a repro on your development machine, could you try to isolate which commit is causing the problem (e.g. bisect)?

nicolo-ribaudo commented 4 years ago

I can try again, but if I build the release commit with .\vcbuild I can't reproduce the issue.

mcollina commented 4 years ago

I can try again, but if I build the release commit with .\vcbuild I can't reproduce the issue.

Oh sorry, then don't bother :(. So this only happens with the binary we have built.

nicolo-ribaudo commented 4 years ago

Maybe in 14.1.1, with a new binary, it will magically fix itself 🤷‍♂️

mcollina commented 4 years ago

@nicolo-ribaudo are you running the tests in WSL or cmd or powershell? Neither @targos nor myself can repro them.

nicolo-ribaudo commented 4 years ago

I'm using powershell. I don't know if it's possible/legal, but I can try to share my virtualbox image with you.

BethGriggs commented 4 years ago

I ran a test rebuild of the v14.1.0 release commit 9ffd8e7a74 on the Node.js infrastructure - this could rule out whether anything obscure went wrong with the published release build.

https://nodejs.org/download/test/v14.1.0-test202005049ffd8e7a74/

mcollina commented 4 years ago

@nicolo-ribaudo do you know how I can run jest with custom Node runtime on Windows?

nicolo-ribaudo commented 4 years ago

I used path/to/node.exe ./node_modules/jest/bin/jest babel-node babel-cli.

@BethGriggs I'll check it later, thanks!

jasnell commented 4 years ago

I don't know if it's relevant to this specific issue, but I did notice that between 13.11 and and 13.12, when any stream piping to process.stdout or process.stderr closes, process.stdout/process.stderr end up closing allowing no further output. This behavior is carried over into 14.x and is on all platforms...

To give it a quick test, nvm use 13.11 and try...

$ node
Welcome to Node.js v13.11.0
Type ".help" for more information.
> const w = new stream.PassThrough()
undefined
> stream.pipeline(w, process.stdout, () => {})
<truncated>
> w.end('hello')
<truncated>

Then nvm use 13.12 and do the same thing.

You'll see that on 13.11, process.stdout is still usable while on anything beyond 13.12, it is not.

/cc @ronag

nicolo-ribaudo commented 4 years ago

@jasnell I don't think that it is related, because the problem reported in this issue started appearing in 14.1.0

nicolo-ribaudo commented 4 years ago

I tested the test release and I could still reproduce the bug. @BethGriggs is there any flag I should use to produce a release-like build locally?


Also, I tried to think about what my env has in common with CI servers, to understand why I'm the only one that can reproduce the bug. The only thing I noticed is that in this VM I have a very low available disk space (1.5GB), and it might similar to CI.

mcollina commented 4 years ago

I can replicate the error by running babel tests with the shipped release. I cannot replicate it by building manually the release commit.

I can confirm there is something odd happening here.

BethGriggs commented 4 years ago

For Windows builds our Release CI calls vcbuild.bat build-release x64 upload. This is building on Windows 2012 R2 (Rackspace) with Visual Studio 2019.

11:46:03 c:\ws>vcbuild.bat build-release x64 upload 
11:46:03 Looking for Python
11:46:03 Python found in C:\Python27\\python.exe
11:46:03 Looking for NASM
11:46:03 Looking for Visual Studio 2019
11:46:03 Looking for WiX installation for Visual Studio 2019...
11:46:03 calling: "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\\Auxiliary\Build\vcvarsall.bat" amd64
11:46:04 **********************************************************************
11:46:04 ** Visual Studio 2019 Developer Command Prompt v16.5.4
11:46:04 ** Copyright (c) 2019 Microsoft Corporation
11:46:04 **********************************************************************
11:46:04 [vcvarsall.bat] Environment initialized for: 'x64'
11:46:04 Found MSVS version 16.0
11:46:04 configure  --with-ltcg --release-urlbase=https://nodejs.org/download/test/ "--download=all" --with-intl=full-icu --dest-cpu=x64 --tag=test202005049ffd8e7a74
11:46:11 Node.js configure: Found Python 2.7.18...
11:46:11 Warning: Missing input files:
11:46:11 tools\msvs\genfiles\node_etw_provider.rc
11:46:11 tools\msvs\genfiles\node_etw_provider.h
11:46:11 INFO: configure completed successfully
JLHwung commented 4 years ago

@jasnell FYI the babel test error is not reproducible on both node.js 13.12 and 13.13: https://travis-ci.com/github/JLHwung/babel/builds/163828049

jasnell commented 4 years ago

Unfortunately that's what I suspected :-(

nimit95 commented 4 years ago

@mcollina can you please try moving the commit built binary to a different folder and try running the tests again?

mcollina commented 4 years ago

@mcollina can you please try moving the commit built binary to a different folder and try running the tests again?

Same


@BethGriggs I am using Visual Studio 2017. I'm trying to run vcbuild.bat build-release x64 but it cannot find Wix. I do not know how to install that :/, there are no guides linked in BUILDING.md.

targos commented 4 years ago

I don't think build-release creates a different node.exe but I might be wrong. Wix is needed to create the .msi installer but that's separate from creating node.exe

bzoz commented 4 years ago

build-release enables LTCG