nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Assertion failed in 12.16.0 #31752

Closed by arcanis 4 years ago

arcanis commented 4 years ago

What steps will reproduce the bug?

git clone git@github.com:yarnpkg/berry
yarn build:cli
yarn test:integration pnp.test.js

How often does it reproduce? Is there a required condition?

It seems to crash consistently across all three systems; the same tests always fail.

Command failed: /Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node /Users/runner/runners/2.164.0/work/berry/berry/packages/yarnpkg-cli/bundles/yarn.js install
/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node[2428]: ../src/node_platform.cc:243:virtual void node::PerIsolatePlatformData::PostTask(std::unique_ptr<Task>): Assertion `(flush_tasks_) != nullptr' failed.
 1: 0x100080dde node::Abort() [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
 2: 0x100080b86 node::AppendExceptionLine(node::Environment*, v8::Local<v8::Value>, v8::Local<v8::Message>, node::ErrorHandlingMode) [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
 3: 0x1000d4815 node::PerIsolatePlatformData::PostTask(std::__1::unique_ptr<v8::Task, std::__1::default_delete<v8::Task> >) [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
 4: 0x10072a464 v8::internal::wasm::WasmEngine::TriggerGC(signed char) [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
 5: 0x100729e66 v8::internal::wasm::WasmEngine::AddPotentiallyDeadCode(v8::internal::wasm::WasmCode*) [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
 6: 0x10071becc v8::internal::wasm::WasmCode::DecRefOnPotentiallyDeadCode() [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
 7: 0x10071bf63 v8::internal::wasm::WasmCode::DecrementRefCount(v8::internal::Vector<v8::internal::wasm::WasmCode* const>) [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
 8: 0x10051e057 std::__1::__shared_ptr_emplace<v8::internal::wasm::GlobalWasmCodeRef, std::__1::allocator<v8::internal::wasm::GlobalWasmCodeRef> >::__on_zero_shared() [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
 9: 0x7fff65344c22 std::__1::__shared_weak_count::__release_shared() [/usr/lib/libc++.1.dylib]
10: 0x10051df7c v8::internal::Managed<v8::internal::wasm::GlobalWasmCodeRef>::Destructor(void*) [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
11: 0x1002baa57 v8::internal::Isolate::Deinit() [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
12: 0x1002ba84d v8::internal::Isolate::Delete(v8::internal::Isolate*) [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
13: 0x1000b669f node::NodeMainInstance::~NodeMainInstance() [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
14: 0x10005dd9f node::Start(int, char**) [/Users/runner/hostedtoolcache/node/12.16.0/x64/bin/node]
15: 0x7fff6821e7fd start [/usr/lib/system/libdyld.dylib]

What is the expected behavior?

Tests should pass, as in this earlier run (built yesterday with Node 12.15.0): https://github.com/yarnpkg/berry/runs/438612900

What do you see instead?

Tests are failing (built just now with Node 12.16.0): https://github.com/yarnpkg/berry/runs/441525249

Hakerh400 commented 4 years ago

Bisected to https://github.com/nodejs/node/commit/65e5a8a90cd85fc4d9036bc4b54e0905edd8b51a

addaleax commented 4 years ago

Ugh … I’m pretty sure that’s a V8 bug; I’ll see if I can find anything on it. In the meantime, it would probably be fine to flip the order of the calls back, only for the main instance, as a temporary workaround.

addaleax commented 4 years ago

@arcanis It’s hard to actually get access to a failing process, and since you’re most likely more familiar with yarn’s internals than I am, I figured I’d ask:

As part of the output of your reproduction, each crash backtrace is preceded by information like this:

Temporary fixture folder: /tmp/tmp-20302k5Rq7QEVOdFX

Command failed: /home/xxxx/.nvm/versions/node/v12.16.0/bin/node /tmp/berry/packages/yarnpkg-cli/bundles/yarn.js install

However, running that command in that directory doesn’t seem to lead to a crash… any suggestions?

arcanis commented 4 years ago

I think I managed to make a basic repro! Clone the repository, then in one tab:

yarn build:cli
yarn test:integration pnp.test.js -t 'it should not cache the postinstall artifacts'
yarn node ./packages/acceptance-tests/test-server.js

This should give you two things: a failing test directory (similar to the one you noticed), and the URL of a mock registry (http://localhost:<port>). Keeping the server running, open another tab, cd into the failing test directory, then run the following (updating the paths to Node and Yarn as needed):

export YARN_NPM_REGISTRY_SERVER=http://localhost:53994
export YARN_GLOBAL_FOLDER=/tmp/.yarn/global
export YARN_UNSAFE_HTTP_WHITELIST=localhost

rm -rf .yarn yarn.lock && /Users/mael.nison/.fnm/node-versions/v12.16.0/installation/bin/node /Users/mael.nison/berry/packages/yarnpkg-cli/bundles/yarn.js install

This should reproduce the problem. You can run the rm -rf ... && node part as many times as you need; it should keep reproducing.

addaleax commented 4 years ago

@arcanis I’m afraid that doesn’t quite work for me (x64 Linux, Ubuntu 19.10), partly because this seems flaky and running yarn test:integration pnp.test.js doesn’t always fail with the same tests. :/

This is definitely related to WASM usage, which might make it a bit easier to figure out what is causing this. I don’t think fully understanding the issue would be required for a fix, but it would be really nice to have a regression test for this…
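Since the backtrace shows the crash firing when live WASM code is released during isolate teardown (`WasmEngine::TriggerGC` under `Isolate::Deinit`), one shape a regression test could take is simply compiling and running a tiny WebAssembly module and then letting the process exit while that code is still alive. This is only a sketch, not the test that actually landed; the module bytes are the standard "add" example:

```javascript
// Sketch of a possible regression test: instantiate a minimal WASM
// module, call it, and let the process exit so isolate teardown has
// live WasmCode to release (the code path seen in the backtrace).
// The module exports a single function: add(a, b) -> a + b.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // \0asm magic + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one func, type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section header
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

const module = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(module);
console.log(instance.exports.add(2, 3)); // prints 5
// Normal process exit here is what exercised the buggy teardown path.
```

On an affected 12.16.0 build the interesting part would be whether the process exits cleanly after the call, rather than the call's result itself.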

addaleax commented 4 years ago

Fwiw, I’ve opened a V8 CL in https://chromium-review.googlesource.com/c/v8/v8/+/2061548 and it currently looks like there’s actually no correct ordering of disposing an Isolate and detaching it from the platform, either one will have a race condition. Let’s see how the conversation there goes. :grimacing: