hmenyus / node-calls-python

Call Python from NodeJS directly in-process without spawning processes
MIT License
252 stars, 26 forks

Hard crashes Node.js when a worker thread is terminated #69

Closed: smoores-dev closed this issue 7 months ago

smoores-dev commented 7 months ago

Howdy! This is a great project! I've been using it to integrate Storyteller with the whisperx and fuzzysearch Python libraries. It works very well in the general case, but:

Storyteller runs node-calls-python from a Worker thread, via piscina. Piscina supports cancelling workers, and does so by calling worker.terminate(). If the Worker is running Python code when that call is made, it doesn't terminate right away. Instead, the current Python call stack seems to run to completion (which can take a while, especially if it's in the middle of a whisperx transcription), followed by one of a few different C++ errors:

terminate called after throwing an instance of 'std::runtime_error'
  what():  napi_set_property(env, object, convert(env, key), convert(env, value)) returned with an error: 10

OR

double free or corruption (out)

After which the entire Node.js runtime crashes, up through the parent process that kicked off the Piscina worker.

For what it's worth, https://github.com/mmomtchev/pymport, a similar project, has very similar issues.

Anyway, I don't even know if this is something you can resolve here, or if it's actually an issue with terminating worker threads while Node.js is running any addon code, but I figured I would give it a shot!

Here's a minimal reproduction: https://github.com/smoores-dev/node-python-worker-repro/tree/node-calls-python. It's not as minimal as I would have liked; in order to actually reproduce the issue, I had to:

This is all to say, the primary issue seems to be that there isn't a great way to signal a Python call to stop, and terminating the parent Worker won't stop the Python code from running.

Secondarily, there seems to be an issue, perhaps specific to running PyTorch code, that results in runtime-crashing C++ exceptions if the Worker is terminated.

hmenyus commented 7 months ago

Could you try 1.8.5?

smoores-dev commented 7 months ago

Same issue on 1.8.5! Seems like this is related to a complex interaction between Node.js worker threads, NAPI calls being uninterruptible, and timing of long-running Python functions. The pymport maintainer explained and helped me out here: https://github.com/mmomtchev/pymport/issues/193#issuecomment-2067778815. I don't think this is actually an issue with node-calls-python, and I don't think either of you should add that header flag to your libraries, so I'm happy to close this!