revelrylabs / elixir-nodejs

An Elixir API for calling Node.js functions
MIT License
217 stars 31 forks source link

Implementation leaks #45

Closed Zinggi closed 4 years ago

Zinggi commented 4 years ago

Hi there :wave:

As for my other issue, I created a failing test for this one.

The way this library uses Task.async / Task.await leaks into the caller which makes using this library inside a GenServer very error prone and difficult.

There are two issues here that leak to the caller: 1) Worker process crashes 2) Stray messages after timeouts

Worker process crashes

When the worker process crashes (which atm happens when using console.log), the caller process also crashes.

This happens because the library uses Task.await(), which links the caller with the spawned process, so on a crash, the caller crashes too. This is expected behavior for a Task, as per the documentation ("In case the task process dies, the current process will exit with the same reason as the task.")

Even if the issue with console.log is fixed, there might be other crashes that shouldn't leak to the caller.

In https://github.com/revelrylabs/elixir-nodejs/pull/45/commits/484c152052b76ea4e9a2450e87aa37ada81629a6 there is a possible fix for this.

Timeouts

When using NodeJS.call inside a GenServer, the calling process receives an unintended message when a timeout happens. If the GenServer doesn't handle this, it will crash.

This happens because of the use of Task.await(). The process started by Task.async can still send a message back once it's done, even after running into the timeout specified by Task.await.

This leads to a stray message being sent to the calling GenServer process which needs to be handled.

This is probably why the docs do not recommend to use Task.await inside a GenServer: "It is not recommended to await a long-running task inside an OTP behaviour such as GenServer. Instead, you should match on the message coming from a task inside your c:GenServer.handle_info/2 callback"

https://github.com/revelrylabs/elixir-nodejs/pull/45/commits/572579bcdd3576e09524920885bbc82c9f1437d3 has a possible fix. I think this makes the code much cleaner and also avoids the race condition between the Task.await timeout and the GenServer.call timeout.

But I'm not sure if this has any other unintended consequences.