clobberos / xpc-connect

Mac connection bindings for node.js
https://npmjs.com/package/xpc-connect
MIT License
15 stars 18 forks source link

fix(XpcConnect.cpp): Fix multiple crashing bugs and memory leaks #34

Closed azsn closed 3 years ago

azsn commented 3 years ago

Hi there! I encountered a crashing bug while using xpc-connect, and while trying to fix it I discovered a few other crashing bugs and memory leaks. They're mostly all related, so I went ahead and fixed all the bugs I found in this one PR. I tried my best to not add any breaking changes, although there are two things that might loosly be considered breaking, which I describe below.

The original problem I encountered was a race condition where sometimes xpc-connect would crash when calling shutdown at the same time a message was being sent from the remote. In my particular case, the client would send a message to the service which would then immediately close the connection and result in a "connection interrupted" message sent to the client, and then the client would shutdown at the same time. Sometimes the shutdown call would happen in the middle of the "connection interrupted" event being asynchronously queued, while eventQueueMutex or asyncHandle were being used, which would result in segmentation faults or aborts when they were used after destroy.

Overall, here are the bugs I fixed / changes I made. Sorry it's so many in one PR, but most of them all had to happen together / were fixed with the same changes.

  1. Crashing bug: Fix the original bug above by only destroying everything after the connection is fully closed. So all objects are now destroyed after the XPC_ERROR_CONNECTION_INVALID is received and handled (this is important, to prevent this or eventQueueMutex from becoming destroyed too early). "connection invalid" is always the last message sent on an XPC connection when it is closed (cancelled), so it is guaranteed to be handled.
  2. Crashing bug: Calling setup twice, or shutdown before/without setup, would crash the program. Fixed with checks and throwing a JS error if setup is called twice. I made shutdown silently fail, since there's not really any harm in letting it be called multiple times, and I imagine it could be convenient in some situations.
  3. Resource leak: Use xpc_connection_cancel instead of xpc_connection_suspend. _suspend pauses a connection, but _cancel actually closes it and results in the aforementioned "connection invalid" error.
  4. Memory leak: Call xpc_release on the XPC connection after it closes to free it.
  5. Memory leak: Call dispatch_release on the dispatch queue after the connection is freed.
  6. Memory leak: The XpcConnect class instance was never destroyed because it held a persistent reference to itself through the This variable. As far as I could tell, there is no point to that. So I removed This and replaced its usage Nan::New<Object>(this->This) with this->handle() which, as far as I can tell, is identical.
  7. Crashing bug: This bug was actually introduced by allowing XpcConnect instances to be deallocated in [6] above -- XpcConnect could be deallocated while the async XPC event handler is running, resulting in a segmentation fault with using this. So I used this->Ref() on setup and this->Unref() on connection invalid to prevent the class instance from being deallocated while in use.
  8. Crashing bug: Also introduced by allowing XpcConnect instances to be deallocated -- it seems there is a bug with Nan::AsyncResource where it cannot be freed in an ObjectWrap's destructor. I couldn't figure out why, so I just made it construct in setup and destruct when destroying everything else after "connection invalid". This is technically a breaking change, because it means the async_hooks init and destroy events will happen at slightly different times than before. There's not really anything that can be done about that, though. And I doubt anyone is actually using async_hooks with this anyway.
  9. Optional: Throw error if sendMessage is called before setup. This is kind of a breaking change, you can remove it if you want and just make it be silent like before. I didn't also throw an error for using it after shutdown because xpc_connection_send_message intentionally never acknowledges errors, and with the asynchronous nature of connections possibly being closed at any time, it made sense to me to just leave it failing silently.
  10. Typo: xpcConnnection with three ns.

I also added tests to index.spec.js to test some of the fixes. I don't know how to test the memory leaks though, and the original crash I encountered was pretty specific to my custom XPC service, so I couldn't really write tests for that. But I did test a bunch with my own service and haven't been able to cause any crashes now.

Thanks for reading this essay of a PR description haha. Let me know if there's anything you want me to change.

azsn commented 3 years ago

Hi @jongear , any word on this?

jongear commented 3 years ago

thanks for doing this @azsn 🙏🏻

jongear commented 2 years ago

:tada: This PR is included in version 2.0.1 :tada:

The release is available on:

Your semantic-release bot :package::rocket: