nodejs / nan

Native Abstractions for Node.js
MIT License
3.29k stars 506 forks source link

RFC: A new kind of async #715

Open agnat opened 7 years ago

agnat commented 7 years ago

I always felt that the AsyncWorkers are suffering from a design defect. Right at its very core is a very subtle anti-pattern. I call it "composition by inheritance". Does that sound oxymoronic? Good. It always starts like this:

class AbstractSomething {
  virtual void doThing() = 0;
};

If you think about adding a feature by inheritance and about exposing both AbstractSomething and FeatureSomething to the user of your library, stop right there. You are almost certainly about to commit "composition by inheritance".

Another symptom of the anti-pattern is daisy-chaining. These class hierarchies never branch. Nobody wants to cut off a feature. Instead both sides, the library developers and the users, start accepting overhead. It becomes acceptable to tap into the class daisy chain at a point that gives you the features you need. If you inherit a couple of things you don't need, so be it.

It becomes acceptable because the alternative is a proliferation of Somethings covering more feature permutations. This is so undesirable that we accept the overhead. Note, how the combinatory nature of our features has "toolbox" written all over it. Instead, "composition by inheritance" always creates a monolith. While the problem at hand requires loose coupling, our design uses the tightest coupling possible: inheritance.

So, what now? A few weeks back i started hacking on a little sketch exploring new ways to write asynchronous node addons. Beside loose coupling and a more toolboxy approach i was looking for a design that is familiar to node developers.

In javascript it would seem natural to pass work functions and done handlers. So let's do this in C++. C++ already has a concept named callable. Relying on a language concept provides just the kind of loose coupling we are seeking. No inheritance, not even a real API. If it is callable, we're good.

Thinking about designs that take an arbitrary callable and execute it on the thread pool, another API comes to mind: Apple's grand central dispatch. Both environments have the notion of a main thread and they both have a thread pool. This thought gave the sketch its current direction and a name: ncd – not central dispatch. Please take a look and let me know what you think.

From the (current) README:

void
eventEmittingWorker(Nan::FunctionCallbackInfo<Value> const& args) {
  using namespace std::string_literals;
  unsigned delay      = args[0]->Uint32Value();
  unsigned iterations = args[1]->Uint32Value();

  ncd::AsyncEventEmitter emitter(args[2].As<v8::Object>());  // ①

  ncd::defaultWorkQueue().dispatch([=](){                    // ②
    for (unsigned i = 0; i < iterations; ++i) {
      emitter.emit("progress"s, i);                          // ③
      usleep(delay);
    }
  }, std::bind(emitter.emit, "done"s));                      // ④
}

After grabbing some arguments the code creates an AsyncEventEmitter in (1). It wraps a javascript EventEmitter for use on a different thread. In (2) a lambda expression is dipatched to the thread pool. The expression captures copies of delay, iterations and the emitter. This is an ncd pattern: An AsyncSomething is first allocated on the main thread and then copied around to different threads. The call to dispatch(...) returns immediately and the lambda is launched on the thread pool. In (3) the async event emitter is invoked, sending progress events back to javascript. After the execution finishes the callback passed in (4) is invoked on the main thread. Here we just emit a done event.

The javascript code looks like this:

const workers = require('workers')
  , EventEmitter = require('events')
  , ee = new EventEmitter()
    . on('progress', (p) => { console.log('progress', p) })
    . on('done',     ()  => { console.log('done') })

workers.eventEmittingWorker(10000, 10, ee)
kkoopa commented 7 years ago

I like the general idea. NAN itself is more of a toolbox than a monolithic framework. I have at some point referred to it as pick-and-mix. The AsyncWorker classes have a long legacy (and C++03 limitations) and are intended as a simple way of doing the most common tasks as well as a showcase of how to (correctly) combine libuv and node in user code.

I will start the discussion with a question: Why not use the std-like way of having free functions with an argument for an Executor, instead of the other way around? That is, why Executor::dispatch instead of dispatch(..., Executor = defaultWorkQueue)?

agnat commented 7 years ago

Hehe. That's an excellent question! Let me answer them both.

The implicit one first:

Why isn't the default queue the default queue?

One of the guiding principles is to use node conventions and patterns. In node the final done callback is always the last argument. The argument order of dispatch(...) follows this convention. If you look at the non-default case you'll probably agree that a trailing queue parameter is odd ... if not outright ugly.

Free vs. member function:

Indeed the choice here is arbitrary. In fact I looked at both variants and liked the free functions, too. I decided to keep the member functions because I think it feels slightly more natural to javascript developers.

agnat commented 7 years ago

The AsyncWorker classes [...] are intended as a simple way of doing the most common tasks as well as a showcase of how to (correctly) combine libuv and node in user code.

I agree... However, if I'm perfectly honest, whenever I invoke the statement, it feels a bit like an excuse. The first part acknowledges that the design doesn't scale, the second part encourages people to roll their own. The second part however does not work. Everything out there is a AsyncWorker.

I believe this is holding us back. I believe removing the tight coupling will enable us to develop a new generation of tools. Take a look at the (experimental) stream example. It features a full fledged node writable stream with back-pressure and all, that has a readable endpoint on a different thread. It's not finished but it looks very promising. Since node writable streams are event emitters this already is very powerful. Now think about thread transformers...

kkoopa commented 7 years ago

One could have the Executor / Queue as the first argument, yet still have it optional through an overload set.

/* satisfies concept Executor */
struct DefaultWorkQueue = {
  void execute() { ... }
};

template <typename Executor, typename F, typename H>
void dispatch(Executor &&executor, F callback, H emit);

template <typename F, typename H>
void dispatch(F &&callback, H &&emit) { dispatch(DefaultWorkQueue(), std::forward(callback), std::forward(emit)); }

Free functions increase encapsulation.

agnat commented 7 years ago

lol. No. I'm not against free functions. On the contrary. But I'd rather not add overloads, creating oddball signatures just to fold an argument. Explicit is good and NAN has a certain history of folding the wrong things... ;)

I also believe it makes sense to talk about this on a higher level first. Like, maybe I'm wrong. Maybe the reason why we don't see more asynchronous primitives has nothing to do with the design of AsyncWorker. Maybe it's because AsyncWorker actually covers 99% of the use cases and ncd is just a huge pile of wannabe C++ mumbo-jumbo that nobody needs.

I'd also love to hear from @mkrufky on this. You've been in there just recently and maybe could speak about how it is to maintain them workers?

Another question is how to release it. We don't have to merge it into NAN. It also works as a companion library. One argument against a merger could be that ncd is very different from NAN proper. Not just style-wise, that too, but also on a deeper level.

If you look at line (3) of the OP, there clearly is some automatic conversion going on. Besides that the current implementation will not cut it and more work is needed, this is a change in paradigm. Maybe it's to confusing if parts of NAN offer automatic conversions and others don't. It also is quite a commitment...

These are just some ideas... Curious about your thoughts, questions and concerns regarding the bigger picture...

kkoopa commented 7 years ago

Since it uses modern features and such, I presume it could not be merged into NAN 2 as-is, but would have to wait for a future major release (if that happens). I always thought there would be more intermediate libraries built on top of NAN, since it primarily papers over version differences, but that never became too popular.

I speculate that the majority use case is: Copy boilerplate from some existing project, add own stuff, see that it runs and move on with other matters.

mkrufky commented 7 years ago

I haven't yet had a chance to fully review the ncd repository but I love the idea, and what I've seen so far looks really nice.

I'm not sure if it actually belongs within nan, however. But, for that matter, the AsyncWorkers in nan today hardly feel like they belong to the abstraction layer that most people think about when they think about nan. With that said, we should probably think about inclusion within nan as a separate discussion when ncd reaches some milestone point.

As for the current AsyncWorker: I have always found the AsyncWorker to be a little bit awkward to work with, but the AsyncProgressWorker is especially difficult to wrap one's head around.

For years, there have been cases popping up here and there on issue trackers and StackOverflow questions asking how to reliably wake the main thread from a c/c++ function running in another thread. I'll spare everybody the details of that specific issue right now, but the takeaway is the fact that the question has been open for so long and nothing was ever done about providing a generic solution until recently. One example is here: https://github.com/nodejs/nan/issues/669

I got involved in the AsyncWorker code because I wanted to replace my own uv_async_t-based c++ class in node-dvbtee to use something more standard. I had originally written my own uv_async_t-based class because I simply didn't understand the AsyncProgressWorker at the time, and I already understood that it couldn't promise to deliver every event (without some changes). node-dvbtee is a c++ node addon module that uses native c/c++ libraries such as libdvbpsi and libdvbtee to parse an mpeg2 transport stream (perhaps continuously) and wake the main thread with JSON object events as each mpeg2 PSIP (program service information) table is received and fully parsed.

As I was working on the AsyncProgressQueueWorker, I found myself wondering whether this was even going to be the right class to solve the problem for my own needs. At first I felt that it was wrong, because I wouldn't be able to create any workers from outside the main thread. I wanted to create a smaller class that had a uv_async_t but not a uv_work_t because I wanted to be able to handle a full worker lifecycle from a side-thread, while its job execution occurred in the main thread, only. Eventually I realized that no mater how you look at it, one should be creating these workers in the main thread and during execution the running c/c++ functions will have access to the main thread waking ExecutionProgress classes as needed. The problem is that this is not clear until after you've spent lots of time analyzing the worker and examples of its use. To be honest, I might not have even described it properly here, but I'm sure you understand what I'm saying.

By the time I had gotten the AsyncProgressQueueWorker out the door, mocha released their newest release version that hangs at the end of the tests if resources haven't been cleaned up properly. See here: https://github.com/mochajs/mocha/issues/3044 . It turns out that all of the uv_async_t-based worker classes that I have written had resource issues that got fully resolved by a conversion to inherit AsyncProgressQueueWorker.

However, AsyncProgressQueueWorker does a copy that I want to avoid, so I created yet another helper class called AsyncFactoryWorker which I plan to propose in a new PR one of these days, especially now that we've released nan version 2.8.0

Why don't these workers get the attention they might deserve? Why aren't there more example use cases to look at? I think it may be because Node.js is more widely used by engineers that work in the server space as opposed to engineers that work in the embedded space. Engineers that work in the embedded space tend to have a totally different coding style than those working in the server space, and it may require more flexible helper classes in order to make these bindings feel more natural for developers that need to bind to native c/c++ code.

There are tons of c / c++ libraries that engineers may need to link to for server applications, and AsyncWorker in general may quite possibly be enough to enable developers to write those bindings. However, when it comes to writing bindings for more embedded style applications, it may be much more useful to have the added flexibility that ncd might offer.

Can it all be done with the AsyncWorker that we have today? Yes, probably it can be used to solve most problems, but you raise a good point: Are bindings developers simply using AsyncWorker because it's the only thing out there and they don't want to roll their own, perhaps more appropriate solution?

This comment is turning into a run-on... I think that most problems can indeed be solved with the workers available today, but the idea of ncd may allow these same problems to be solved in a more efficient or maybe even more of an elegant way without possible additional overhead.

mkrufky commented 7 years ago

Also we should keep in mind that ncd looks like it will require a minimum of c++17, while nan itself is compatible with c++03. When we drop support in nan for node 0.10 & 0.12, then it only pushes up the requirement to c++11. Better to keep nan as backwards compatible as possible for as long as it makes sense. Then again, I don't think it hurts to add new functionality to nan that requires a newer compiler (and only build it conditionally, if supported by the compiler). It just means that these new features wont necessarily be available without all of the proper requirements.

Back to ncd - It really does look like something that might feel more natural to developers that want to create native bindings in newer, more flexible and interesting ways. I think it's a great idea and you should definitely move forward with it.

kkoopa commented 7 years ago

Regarding C++ versions available: The worst compiler which can successfully compile any given version of Node (including V8) sets the bar. C++11 (g++ 4.8, VS 2013, clang whatever) for Node 4 through 8. Node 9 (because of V8) requires at least VS 2015 (V8 started using constexpr constructs that VS2013 cannot do). All features of NAN should work on all supported versions of Node.

agnat commented 7 years ago

Regarding the C++ standard: It's designed to only require C++11. If it does use anything else that's either an accident or me being lazy or childish. It evolved from a sketch...

Even a C++03 port seems feasible. Not that I would suggest that, but it is possible quite easily...

I will comment on the other subjects this weekend (I hope)...