node-ffi-napi / weak-napi

Make weak references to JavaScript Objects.
ISC License
45 stars 4 forks source link

Segfault on process shutdown with vm.createContext #31

Open FauxFaux opened 3 years ago

FauxFaux commented 3 years ago

This code causes node to segfault on exit.

function run() {
  let vm = require('vm');
  let ctx = vm.createContext();
  let scr = new vm.Script('({})');
  require('weak-napi')(ctx, () => console.log('freed'));
  scr.runInContext(ctx);
}

run();
ketchup% npm i weak-napi && node alpha.js 
zsh: segmentation fault (core dumped)  node alpha.js

Sometimes(:tm:) you can fix this by adding global.gc() before the script exits, but that's defeatable, e.g. with this, which segfaults again:

let vm = require('vm');
let ctx = vm.createContext();
let scr = new vm.Script('({"bar":function(require){require("https").get = () => {};}})');
scr.runInContext(ctx).bar(require);
require('weak-napi')(ctx, () => console.log('freed'))
vm = null; ctx = null; scr = null;
global.gc();

The backtrace is meaningless to me; it's trying to clean up the environment and it's trying a double-free. Ooh, I wonder if ASAN will catch this? Note that the backtrace is way worse on older nodes.

ketchup% gdb --args ~/clone/node/out/Debug/node alpha.js
...
Thread 1 "node" received signal SIGSEGV, Segmentation fault.
0x000055837b4de4b2 in v8impl::(anonymous namespace)::RefBase::Delete (reference=0x558381065540) at ../src/js_native_api_v8.cc:248
248       delete reference;
(gdb) bt
#0  0x000055837b4de4b2 in v8impl::(anonymous namespace)::RefBase::Delete (reference=0x558381065540) at ../src/js_native_api_v8.cc:248
#1  v8impl::(anonymous namespace)::RefBase::Finalize (this=0x558381065540, is_env_teardown=<optimised out>)
    at ../src/js_native_api_v8.cc:281
#2  0x000055837b4feb4c in v8impl::RefTracker::FinalizeAll (list=0x558381109d08) at ../src/js_native_api_v8.h:43
#3  napi_env__::~napi_env__ (this=0x558381109cd0, __in_chrg=<optimised out>) at ../src/js_native_api_v8.h:66
#4  node_napi_env__::~node_napi_env__ (this=0x558381109cd0, __in_chrg=<optimised out>) at ../src/node_api.cc:17
#5  node_napi_env__::~node_napi_env__ (this=0x558381109cd0, __in_chrg=<optimised out>) at ../src/node_api.cc:17
#6  0x000055837b4fa0c2 in napi_env__::Unref (this=<optimised out>) at ../src/js_native_api_v8.h:77
#7  operator() (arg=<optimised out>, __closure=0x0) at ../src/node_api.cc:103
#8  _FUN () at ../src/node_api.cc:104
#9  0x000055837b4d18ec in node::Environment::RunCleanup (this=this@entry=0x558380ff2830) at ../src/env.cc:661
#10 0x000055837b4797ef in node::FreeEnvironment (env=0x558380ff2830) at ../src/api/environment.cc:371
#11 0x000055837b57deb1 in node::FunctionDeleter<node::Environment, &node::FreeEnvironment>::operator() (pointer=<optimised out>, 
    this=0x7ffed4b26e30) at ../src/util.h:636
#12 std::unique_ptr<node::Environment, node::FunctionDeleter<node::Environment, &node::FreeEnvironment> >::~unique_ptr (
    this=0x7ffed4b26e30, __in_chrg=<optimised out>) at /usr/include/c++/10/bits/unique_ptr.h:361
#13 node::NodeMainInstance::Run (this=this@entry=0x7ffed4b26f80, env_info=env_info@entry=0x55837fa57780 <node::env_info>)
    at ../src/node_main_instance.cc:135
#14 0x000055837b4f7908 in node::Start (argc=<optimised out>, argv=<optimised out>) at ../src/node.cc:1078
#15 0x000055837cb7cb63 in main (argc=2, argv=0x7ffed4b271b8) at ../src/node_main.cc:127
ketchup% node --version
v12.20.0

Debug node built is from HEAD today (d90fa196c5540109bf9c5063f8c51673340ad9e3). Ubuntu 20.10, amd64.


I found this trying to diagnose https://github.com/facebook/jest/issues/10289 ; this createContext / runInContext dance is how Jest works. However, I assume Jest works for most people most of the time, so it can't always segfault. As far as I can see, Jest always use Script to load user code, with most core modules require'd like in the second bit of code.

FauxFaux commented 3 years ago

Ohhh yeah.

==2068517==ERROR: AddressSanitizer: heap-use-after-free on address 0x607000005fc0 at pc 0x562d141221c2 bp 0x7fff137d7c30 sp 0x7fff137d7c20
READ of size 1 at 0x607000005fc0 thread T0
    #0 0x562d141221c1 in Finalize ../src/js_native_api_v8.cc:280
    #1 0x562d141d3fed in v8impl::RefTracker::FinalizeAll(v8impl::RefTracker*) ../src/js_native_api_v8.h:43
    #2 0x562d141d3fed in napi_env__::~napi_env__() ../src/js_native_api_v8.h:66
    #3 0x562d141d3fed in node_napi_env__::~node_napi_env__() ../src/node_api.cc:17
    #4 0x562d141d3fed in node_napi_env__::~node_napi_env__() ../src/node_api.cc:17
    #5 0x562d141bf63a in napi_env__::Unref() ../src/js_native_api_v8.h:77
    #6 0x562d141bf63a in operator() ../src/node_api.cc:103
    #7 0x562d141bf63a in _FUN ../src/node_api.cc:104
    #8 0x562d140e2562 in node::Environment::RunCleanup() ../src/env.cc:661
    #9 0x562d13f0cd14 in node::FreeEnvironment(node::Environment*) ../src/api/environment.cc:371
    #10 0x562d1449424f in std::unique_ptr<node::Environment, node::FunctionDeleter<node::Environment, &node::FreeEnvironment> >::~unique_ptr() ../src/util.h:636
    #11 0x562d1449424f in node::NodeMainInstance::Run(node::EnvSerializeInfo const*) ../src/node_main_instance.cc:135
    #12 0x562d141b34ca in node::Start(int, char**) ../src/node.cc:1078
    #13 0x562d18d8ebb6 in main ../src/node_main.cc:127
    #14 0x7f058f7bacb1 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x28cb1)
    #15 0x562d13ef402d in _start (/home/faux/clone/node/out/Debug/node+0x11d202d)
0x607000005fc0 is located 64 bytes inside of 80-byte region [0x607000005f80,0x607000005fd0)
freed by thread T0 here:
    #0 0x7f058fda3f7f in operator delete(void*, unsigned long) (/lib/x86_64-linux-gnu/libasan.so.6+0xb2f7f)
    #1 0x562d14121f8a in ~Reference ../src/js_native_api_v8.cc:292
    #2 0x562d1414f9f3 in Delete ../src/js_native_api_v8.cc:248
    #3 0x562d1414f9f3 in napi_delete_reference ../src/js_native_api_v8.cc:2391
    #4 0x7f05895529df in Napi::Reference<Napi::Object>::~Reference() /home/faux/code/testcase-jest-leak/node_modules/node-addon-api/napi-inl.h:2470
    #5 0x7f058954ab9c in ~ObjectWrap /home/faux/code/testcase-jest-leak/node_modules/node-addon-api/napi-inl.h:3604
    #6 0x7f058954b731 in ~ObjectInfo ../src/weakref.cc:8
    #7 0x7f058954b751 in ~ObjectInfo ../src/weakref.cc:8
    #8 0x7f058954b807 in FinalizeCallback /home/faux/code/testcase-jest-leak/node_modules/node-addon-api/napi-inl.h:4022
    #9 0x562d141d4465 in napi_env__::CallFinalizer(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(napi_env__*)#1}::operator()(napi_env__*) const ../src/js_native_api_v8.h:107
    #10 0x562d141d4465 in void napi_env__::CallIntoModule<napi_env__::CallFinalizer(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(napi_env__*)#1}, void (napi_env__*, v8::Local<napi_env__::CallFinalizer(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(napi_env__*)#1}::Value>)>(napi_env__::CallFinalizer(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(napi_env__*)#1}&&, void (&&)(napi_env__*, v8::Local<napi_env__::CallFinalizer(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(napi_env__*)#1}::Value>)) ../src/js_native_api_v8.h:95
    #11 0x562d141d4465 in napi_env__::CallFinalizer(void (*)(napi_env__*, void*, void*), void*, void*) ../src/js_native_api_v8.h:106
    #12 0x562d14122117 in Finalize ../src/js_native_api_v8.cc:274
    #13 0x562d141d3fed in v8impl::RefTracker::FinalizeAll(v8impl::RefTracker*) ../src/js_native_api_v8.h:43
    #14 0x562d141d3fed in napi_env__::~napi_env__() ../src/js_native_api_v8.h:66
    #15 0x562d141d3fed in node_napi_env__::~node_napi_env__() ../src/node_api.cc:17
    #16 0x562d141d3fed in node_napi_env__::~node_napi_env__() ../src/node_api.cc:17
    #17 0x562d141bf63a in napi_env__::Unref() ../src/js_native_api_v8.h:77
    #18 0x562d141bf63a in operator() ../src/node_api.cc:103
    #19 0x562d141bf63a in _FUN ../src/node_api.cc:104
    #20 0x562d140e2562 in node::Environment::RunCleanup() ../src/env.cc:661
    #21 0x562d13f0cd14 in node::FreeEnvironment(node::Environment*) ../src/api/environment.cc:371
    #22 0x562d1449424f in std::unique_ptr<node::Environment, node::FunctionDeleter<node::Environment, &node::FreeEnvironment> >::~unique_ptr() ../src/util.h:636
    #23 0x562d1449424f in node::NodeMainInstance::Run(node::EnvSerializeInfo const*) ../src/node_main_instance.cc:135
    #24 0x562d141b34ca in node::Start(int, char**) ../src/node.cc:1078
    #25 0x562d18d8ebb6 in main ../src/node_main.cc:127
    #26 0x7f058f7bacb1 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x28cb1)
previously allocated by thread T0 here:
    #0 0x7f058fda2f17 in operator new(unsigned long) (/lib/x86_64-linux-gnu/libasan.so.6+0xb1f17)
    #1 0x562d14126536 in New ../src/js_native_api_v8.cc:320
    #2 0x562d14126536 in Wrap<(v8impl::<unnamed>::WrapType)0> ../src/js_native_api_v8.cc:616
    #3 0x562d1414a9bf in napi_wrap ../src/js_native_api_v8.cc:2234
    #4 0x7f058954a9c1 in ObjectWrap /home/faux/code/testcase-jest-leak/node_modules/node-addon-api/napi-inl.h:3585
    #5 0x7f058954a042 in ObjectInfo ../src/weakref.cc:10
    #6 0x7f058954cd7f in operator() /home/faux/code/testcase-jest-leak/node_modules/node-addon-api/napi-inl.h:3944
    #7 0x7f058954e344 in WrapCallback<Napi::ObjectWrap<T>::ConstructorCallbackWrapper<(anonymous namespace)::ObjectInfo>::<lambda()> > /home/faux/code/testcase-jest-leak/node_modules/node-addon-api/napi-inl.h:73
    #8 0x7f058954cec9 in ConstructorCallbackWrapper /home/faux/code/testcase-jest-leak/node_modules/node-addon-api/napi-inl.h:3942
    #9 0x562d14123337 in operator() ../src/js_native_api_v8.cc:493
    #10 0x562d14123337 in CallIntoModule<v8impl::(anonymous namespace)::CallbackWrapperBase::InvokeCallback()::<lambda(napi_env)> > ../src/js_native_api_v8.h:95
    #11 0x562d14123337 in InvokeCallback ../src/js_native_api_v8.cc:492
    #12 0x562d14123337 in Invoke ../src/js_native_api_v8.cc:510
    #13 0x562d150842f2 in v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) ../deps/v8/src/api/api-arguments-inl.h:158
    #14 0x562d15096586 in HandleApiCallHelper<true> ../deps/v8/src/builtins/builtins-api.cc:111
    #15 0x562d15098660 in Builtin_Impl_HandleApiCall ../deps/v8/src/builtins/builtins-api.cc:137
    #16 0x562d1509a993 in v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) ../deps/v8/src/builtins/builtins-api.cc:129
    #17 0x562d18594fbf in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit (/home/faux/clone/node/out/Debug/node+0x5872fbf)
    #18 0x562d18332920 in Builtins_JSBuiltinsConstructStub (/home/faux/clone/node/out/Debug/node+0x5610920)
    #19 0x562d188a47a9 in Builtins_ConstructHandler (/home/faux/clone/node/out/Debug/node+0x5b827a9)
    #20 0x562d1833e723 in Builtins_InterpreterEntryTrampoline (/home/faux/clone/node/out/Debug/node+0x561c723)
    #21 0x562d1833e723 in Builtins_InterpreterEntryTrampoline (/home/faux/clone/node/out/Debug/node+0x561c723)
    #22 0x562d1833e723 in Builtins_InterpreterEntryTrampoline (/home/faux/clone/node/out/Debug/node+0x561c723)
    #23 0x562d1833e723 in Builtins_InterpreterEntryTrampoline (/home/faux/clone/node/out/Debug/node+0x561c723)
    #24 0x562d1833e723 in Builtins_InterpreterEntryTrampoline (/home/faux/clone/node/out/Debug/node+0x561c723)
    #25 0x562d1833e723 in Builtins_InterpreterEntryTrampoline (/home/faux/clone/node/out/Debug/node+0x561c723)
    #26 0x562d1833e723 in Builtins_InterpreterEntryTrampoline (/home/faux/clone/node/out/Debug/node+0x561c723)
    #27 0x562d1833e723 in Builtins_InterpreterEntryTrampoline (/home/faux/clone/node/out/Debug/node+0x561c723)
    #28 0x562d1833e723 in Builtins_InterpreterEntryTrampoline (/home/faux/clone/node/out/Debug/node+0x561c723)
    #29 0x562d18335179 in Builtins_JSEntryTrampoline (/home/faux/clone/node/out/Debug/node+0x5613179)
    #30 0x562d18334f57 in Builtins_JSEntry (/home/faux/clone/node/out/Debug/node+0x5612f57)
    #31 0x562d1581b6d1 in v8::internal::GeneratedCode<unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, long, unsigned long**>::Call(unsigned long, unsigned long, unsigned long, unsigned long, long, unsigned long**) ../deps/v8/src/execution/simulator.h:142
    #32 0x562d1581b6d1 in Invoke ../deps/v8/src/execution/execution.cc:369
    #33 0x562d1581c6b4 in v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) ../deps/v8/src/execution/execution.cc:463
    #34 0x562d14e76258 in v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) (/home/faux/clone/node/out/Debug/node+0x2154258)
SUMMARY: AddressSanitizer: heap-use-after-free ../src/js_native_api_v8.cc:280 in Finalize

...which I still don't understand, but I read as... the fiddling with refs is making us try and finalize something twice, and then we would be triggering the double free? I don't see where n-api is deciding the refcounts are wrong, and I also don't understand ObjectInfo's fiddling of refcounts in the constructor, so I'm quite stuck.

addaleax commented 3 years ago

Sigh … yeah, at first look this seems like a bug in N-API to me. The reference counting semantics for it are generally very unclear – https://github.com/nodejs/node/pull/35777 may or may not be related to this, so it might be worth trying to see what happens for different Node.js versions.

FauxFaux commented 3 years ago

I bisected this down to https://github.com/nodejs/node/commit/53ca0b9ae145c430842bf78e553e3b6cbd2823aa / https://github.com/nodejs/node/pull/28428

Merged just before 12 went LTS, which explains why I thought this had happened "forever"; we only run LTSes.

% g bisect log | egrep '# segfault|# clean' segfault: [9622fed3fb2cffcea9efff6c8cb4cc2def99d75d] 2020-01-07, Version 12.14.1 'Erbium' (LTS) clean: [351c02d1318668027968e4daebdbe1c39c9ce33e] 2020-10-27, Version 10.23.0 'Dubnium' (LTS) segfault: [948f5152d7e90451439cb8040fd15237decbc229] 2020-11-16, Version 14.15.1 'Fermium' (LTS) clean: [dfea13a1688f8fda6a7132714019067e7da68d1b] test: verify inspector help url works clean: [3414bc7b25a00556af80c6ba946e28464e7e8d36] tools: update capitalized-comments rule segfault: [d68f78f10b91d9f9bfd7a4017fa2b9b01f6df99b] 2020-05-05, Version 14.2.0 (Current) clean: [88e815649bbc9529e23bef22dda17570bd972683] doc: add brackets to implicit markdown links segfault: [b4f745ec5ae515c35aa57c576b997bb1d7f96437] doc: fix syntax in N-API documentation segfault: [0e3d774ed21375675b024f555a67773821e56a9d] fs: fix existsSync for invalid symlink at win32 segfault: [9f271b8f38f7104e031938c17149a48273a435f3] doc: replace const / var with let segfault: [bdee976d541696122eaa5e9b877c0d52e26df010] tools: fix Python 3 deprecation warning in test.py segfault: [639085e410e805303b4511edee4c2a315006a7f8] build: log the found compiler version if too old segfault: [d247a8e1dc398b76558b7eb8cceace911047e912] http: emit close on socket re-use clean: [fce1a5198a89e8f62ea2d093e01971db46da9bf5] domain: do not import util for a simple type check segfault: [545f7282d126ee43cf9cfeb66c83d0cbd2c60614] src: implement v8 host weakref hooks segfault: [7de5a557103dd689bd9a757720eca72459d86e66] deps: patch V8 to 7.8.279.17 segfault: [53ca0b9ae145c430842bf78e553e3b6cbd2823aa] src: render N-API weak callbacks as cleanup hooks

I also accidentally found that ASAN isn't happy with this, which seems like it might be related.

require('weak-napi');
throw new Error();
==71934==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f4f07da4f17 in operator new(unsigned long) (/lib/x86_64-linux-gnu/libasan.so.6+0xb1f17)
    #1 0x558ee24c79cd in napi_module_register ../src/node_api.cc:497
    #2 0x7f4f01479902 in _register_weakref ../src/weakref.cc:71
    #3 0x7f4f086fbdbd  (/lib64/ld-linux-x86-64.so.2+0x11dbd)