mongodb-js / boxednode

📦 boxednode – Ship a JS file with Node.js in a box
Apache License 2.0
599 stars 10 forks source link

Native addon promises or callbacks appear broken #36

Closed chuzarski closed 1 year ago

chuzarski commented 1 year ago

Boxednode version: 1.12.0 Target node version: 16.18.0 OS: Windows 10

Hello,

I have been dealing with an issue for a few weeks where it appears that promises or callbacks with native addons are broken. With specific libraries, awaited function calls never return running under boxednode, but work as expected under stock Node JS. The libraries in question where I’m running into this behavior are ibm_db and edge-js. However, the oracledb library uses a native addon with promise returning functions and works as expected. Interestingly, ibm_db and edge-js both directly use nan and libuv, where oracledb uses napi, so anecdotally, I think that may be where the problem lies. I do not have deep knowledge of how nan, libuv or napi work from a node addon perspective, so take that with a grain of salt.

I was unable to come up with a standalone proof of concept of this occurring, however I was able to write a simple proof of concept of this issue occurring with ibm_db. Here is the repo: https://github.com/chuzarski/ibm_db-async . IBM DB2 is not required to run the proof of concept.

My last anecdotal point, I attached a msvc debugger to my running project which uses edge-js, and was able to trace execution to this function call, after which, the stack for the native module unwinds and the awaited function in the nodejs script never returns. Lastly, it appears ibm_db fails to return a promise or call a callback after this function runs. Both functions appear to use functions from libuv.

I would gladly submit a P/R for a fix if I knew the correct direction to look, however, I’m a little short on knowledge of low-level nodejs. In the meantime, I’ll work on updating this issue with a clearer example.

Thank you, Cody

addaleax commented 1 year ago

Interestingly, ibm_db and edge-js both directly use nan and libuv, where oracledb uses napi, so anecdotally, I think that may be where the problem lies.

Yeah, this isn’t quite the cause but it’s definitely strongly correlated :slightly_smiling_face: The issue is that ibm_db and edge-js are old addons, which means that a) they use legacy technologies like nan and direct access to libuv b) they weren’t written with modern Node.js versions (>= 8) in mind, which means that they are incompatible with Worker threads, embedding scenarios, etc.

In particular, these addons assume that Node.js uses the global singleton “default” event loop, and only that. I’ve opened https://github.com/mongodb-js/boxednode/pull/37 to add the option to use that, if so desired. I’d still be careful about using this type of “legacy” addon in production environments, but I understand that you may not always have a choice about this. :slightly_smiling_face: If you do, it might be worth filing bugs about making these addons look up the correct event loop.

(Newer addons which use Node-API typically don’t have this type of issue, not just because they don’t often use libuv directly, but mainly because they don’t make assumptions that were being invalidated around the time when Node-API was introduced.)

chuzarski commented 1 year ago

That makes a lot of sense, thank you for helping me understand. I tested your changes, and they completely fix the ibm_db package. Edge-js causes a segfault when loaded, with the following error:

Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at uv_default_loop()
   at V8SynchronizationContext.Initialize()
Segmentation fault

However, I understand that this may be an undesirable approach. I’ll investigate filing issues or PRs with both projects.

Thank you very much!

addaleax commented 1 year ago

@chuzarski It's still a bit surprising if that happens with boxednode but not regular Node.js. But yeah, I wouldn't be surprised if these legacy addons just aren't written to expect modern Node.js behavior. Feel free to open a new issue if you run into something!