nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
107.61k stars 29.59k forks source link

N-API: no ability to schedule work on different thread? (via uv_async_t) #13512

Closed gpean closed 6 years ago

gpean commented 7 years ago

Correct me if I'm wrong, but I didn't see a way to schedule work on a different thread using N-API, and/or a way to notify the main loop that some work is done from some other thread and that we need to wake up with a callback call (via uv_async_t and a persistent callback).

I think that's a major use case of native modules (doing stuff in // of the JS thread), so it's kind of surprising. What was the rationale during N-API design?

I was able to hack something using v8.h and uv.h, of course. But it kinda defeats the purpose of having a portable API.

mscdex commented 7 years ago

/cc @nodejs/n-api

gpean commented 7 years ago

(AFAIK napi_create_async_work allows to queue uv work to the default loop - it does not really offload actual work to a separate thread - it just fills the JS thread with more work, although it's native work)

mscdex commented 7 years ago

Looking at the code now, napi_create_async_work()/napi_queue_async_work() utilize uv_work_t and uv_queue_work() behind the scenes. uv_queue_work() enqueues a task to the libuv thread pool, so the napi_async_execute_callback value you pass to napi_create_async_work() will be executed in a separate thread when a thread becomes available. The napi_async_complete_callback value will be called from the main thread after the work thread callback is finished.

jasongin commented 7 years ago

At this time, it is not a goal for N-API to cover all of the functionality offered by libuv. We're certainly open to feedback about this; it's possible we don't have quite the right balance.

We (the N-API working group) discussed extensively how much of libuv should be covered, if any at all. Since libuv is a C API and has infrequent major-version updates, it is already a very stable API (unlike V8 which tends to make breaking changes often). And any forks of Node are likely to still use libuv even if they swap out the JS engine. So, there is clearly less need for the N-API abstraction there.

What we settled on for now was to just include basic async functionality in N-API, because it can be a little tricky to get right without some higher-level abstractions. The N-API async APIs (and the Napi::AsyncWorker class that uses those APIs) are meant to provide a migration path from NAN's AsyncWorker. The NAN APIs also do not allow scheduling work on a different thread. We are assuming that the functionality offered by NAN is good enough for most native addons. Addons that need more advanced threading capabilities can use the libuv APIs directly, as they always have.

I was able to hack something using v8.h and uv.h, of course. But it kinda defeats the purpose of having a portable API.

Why did you need v8.h? As explained above, the expectation is that you should be able to mix in just libuv APIs for that purpose, while still using N-API instead of V8 for everything else.

mscdex commented 7 years ago

The NAN APIs also do not allow scheduling work on a different thread.

Doesn't Nan::AsyncQueueWorker do this?

jasongin commented 7 years ago

Nan::AsyncQueueWorker() uses the uv default loop. napi_queue_async_work() does also.

mscdex commented 7 years ago

Ok, I guess I'm confused about what's being asked then. I thought @gpean was looking to do work on another thread, which those methods do (in the libuv thread pool).

jasongin commented 7 years ago

I think the issue is that Node also uses the uv default loop as its main event loop.

gpean commented 7 years ago

Looking at the code now, napi_create_async_work()/napi_queue_async_work() utilize uv_work_t and uv_queue_work() behind the scenes. uv_queue_work() enqueues a task to the libuv thread pool, so the napi_async_execute_callback value you pass to napi_create_async_work() will be executed in a separate thread when a thread becomes available.

I was not very familiar with libuv and the builtin thread pool you mention - I just read http://docs.libuv.org/en/v1.x/threadpool.html. One issue I see is that you cannot control the pool size other than via an environment variable? It also still competes with the JS work and operations, since this is the same loop. But, I agree it covers most of the use cases for async work.

There can still be use cases where a native library you want to integrate already has its threading done, or you don't want to use the libuv thread pool, and you might just want to notify JS when stuff is available, using uv_async_t. Not the majority of use cases, but still a valid and broad use case IMO.

RE why you need to include v8.h: you won't be able to get a napi_env from the uv_async callback- You have to use v8 API to get the current Isolate, and cast that into a napi_env, basically. We would need a napi_get_current_env or similar?

gpean commented 7 years ago

Said otherwise, the issue I see is that, napi_create_reference allows to persist a value, and I can use that to persist the JS callback function that the uv_async callback should call later. But, it's not possible persist the napi_env alongside with the persisted JS callback, and there is no function to get the current env (isolate), either.

mhdawson commented 7 years ago

@gpean is the need to schedule work on a separate thread for an existing module or something new ?

mhdawson commented 7 years ago

Adding to @jasongin comments we only included coverage for the use case provided by AsyncWorker as it seemed to be broadly used in existing modules. If there is another pattern that is also broadly used in the existing modules we can take a look. Best way to start would be a list of modules that use the additional pattern.

gpean commented 7 years ago

@mhdawson it's for new stuff

I propose that we focus on the following, non major, actionable issues:

jasongin commented 7 years ago

No way to get current isolate/current napi_env (see my last comment) prevents extending functionality through libuv

You can stash the napi_env in the data field of uv_work_t, either directly or as part of a context structure you might already need for other purposes. That uv_work_t is passed back to the complete callback, where you would then use it when invoking a JS callback. So, I don't think you're actually blocked.

We had a napi_get_current_env() API for a time, but we removed it because we were concerned we might have to deprecate it eventually, because use of v8::Isolate::GetCurrent() is discouraged within Node now and may be deprecated later. And we didn't see any scenarios where it is really needed; as explained above, there should always be some way to pass the napi_env around as part of some context.

jasongin commented 7 years ago

use v8 API to get the current Isolate, and cast that into a napi_env

Don't do this! A napi_env is more than just a v8::Isolate. While that cast might work for some things because a v8::Isolate happens to be the first field in the structure, you shouldn't rely on that internal implementation detail, and it will lead to crashes as soon as access to some other field is attempted, such as when getting the last error info.

gpean commented 7 years ago

You can stash the napi_env in the data field of uv_work_t

You probably mean the uv_async_t data field. I thought it was "unsafe" to keep napi_env's around? Can I really just use the same isolate to perform the callback call from the uv_async callback? Is it always safe to reuse an isolate from one context to another?

& What about this statement from the doc:

Caching the napi_env for the purpose of general reuse is not allowed.

?

Don't do this! A napi_env is more than just a v8::Isolate

I know, I copied pasted the __napi_env struct decl from node_api.cpp to do that "safely". Casting was just a language shortcut. I'm adventurous but not crazy :-)

sam-github commented 7 years ago

No way to use napi_queue_async_work with something else than the default uv loop prevents using the existing async API really asynchronously from JS initiated work (so, not so async)

Please elaborate on how is this "not so async". Its what all the async operations in node do, so its as async as fs, dns.loopup, etc. Its possible to create your own uv loop, and there are a very, very few occaisons when it is necessary. execSync and friends do it, for example, so they can use uv to do the exec, but they use a different loop so as to prevent any concurrent processing or progress on the main loop. Inspector does it as well, because it needs a way to interact with a remote debugger when the main thread is blocked in the debugger. Its pretty rare, and note that both of them involve wanting to do I/O while causing the default loop to be completely blocked.

gpean commented 7 years ago

Please elaborate on how is this "not so async"

You justly mentioned two use cases where blocking syscalls, or libc or other system library calls, or potentially long computations can impact the main loop thread pool availability and that for this reason, you create a separate loop/pool for them. This is the "not so async" i'm talking about - Having everybody on the same loop means potential resource fighting (and so, implicit synchronization) for an available slot on the (fixed 4 threads wide!) thread pool, which we could make it possible to avoid easily in N-API by just letting people be able to request a new thread pool.

kkoopa commented 7 years ago

V8 isolates have completely separate states. Objects from one isolate must not be used in other isolates.

That is, you cannot call a persisted Function with another Isolate than its owner -- which implies that you must cache the Isolate in a libuv callback which calls back to JavaScript.

gpean commented 7 years ago

@kkoopa isn't that what napi_get_reference_value is for? getting a valid value (persisted function instance) from another isolate (napi_env)?

gpean commented 7 years ago

(But, I get the point, I'm happy to use the initial isolate to perform the call back- Makes things simpler!)

So, effectively, one issue remains: that you're forcing N-API users to use 4 threads to perform async work which can be potentially already used by Javascript IO work. When you have a 32 cores server, I'd think you might want to make it scale better- if you don't want to expose loop control, at least the default pool size should match the vcpu count.

kkoopa commented 7 years ago

Without looking up the function in question: No, that is not how it works. IIRC, napi_get_reference_value is the equivalent of v8::Local<T>(v8::Persistent<T>), which converts a Persistent to a Local. IIRC, this conversion does not even take or use an Isolate. The napi variant might use one for some reason, maybe it was for setting error codes. Persistent (reference) and Local (value) are handles (actually pointers in structs) to v8::Values. When you do v8::Function::Call on the handle, you pass the Isolate so the handle can be dereferenced and operated on correctly.

gpean commented 7 years ago

@kkoopa how does it work wrt reference tracking though? how do I guarantee that my JS function value won't have been collected, by the time my uv_async callback fires? I still need the reference creation to have things protected from GC no?

gpean commented 7 years ago

Also. I need to create a HandleScope in the uv_async_t callback right? If I want to create result values to pass to my JS callback. That's a v8 API not surfaced via N-API. Or am I missing something?

EDIT: yeah, visibly I missed napi_handle_scope!

kkoopa commented 7 years ago

A v8::Persistent is not reference tracked (unless made weak). It always exists. When you take a v8::Local<v8::Function> and make a v8::Persistent<v8::Function>, the Persistent lives until you call v8::Persistent::Reset. On the other hand, a v8::Local<v8::Function> is destroyed when its v8::HandleScope is destroyed. Then this reference is gone and the v8::Function (which is the JavaScript function) might be garbage collected if there are no other live references to it.

void foo(v8::FunctionCallbackInfo<v8::Value> info) {
v8::HandleScope scope;

v8::Local<v8::Function> my_callback = info[0].As<v8::Function>();

schedule_async_stuff(my_callback);
} //<--- All (local) C++ variables, HandleScope dies, along with the local handle, now there might be a risk of garbage collection
void foo(v8::FunctionCallbackInfo<v8::Value> info) {
v8::HandleScope scope;

v8::Local<v8::Function> my_callback = info[0].As<v8::Function>();

v8::Persistent<v8::Function> leaky_reference(info.GetIsolate(), my_callback);
schedule_async_stuff(my_callback);
}  //<-- There is a persistent reference stored to the function referenced by my_callback, so the function lives. However, the v8::Persistent is destroyed by C++ scoping rules, but its underlying storage cell in V8 is not, now you have no way of clearing up the storage cell
kkoopa commented 7 years ago

Yes you must have a scope in libuv callbacks, since they are outside Node (which sets up a scope before going from V8 to a C++ callback in an addon). There is both a napi_handle_scope and napi_escapable_handle_scope or something similarly named.

gpean commented 7 years ago

Thanks. At least I can remove the v8.h dependency. w00t!

kkoopa commented 7 years ago

To clear things up a bit further: A v8::Local<T> is similar to a T**. A v8::Persistent<T> is also similar to a T**, but the T* (T** derferenced once, which is the "actual JavaScript reference") is stored outside the v8::Persistent<T>. Thus, when a v8::Local<T> goes out of C++ scope, the T** also does. When the surrounding v8::HandleScope goes out of C++ scope, the T* dies. When a v8::Persistent<T> goes out of C++ scope, the T** also does, but now the T* is stored in a "global storage cell" for the v8::Isolate and lives on until you explicitly get rid of it by calling v8::Persistent<T>::Reset().

jasongin commented 7 years ago

What about this statement from the doc:

Caching the napi_env for the purpose of general reuse is not allowed.

We should remove or clarify that statement in the doc. It is OK to save the napi_env for use after returning from async work.

N-API already establishes a handle scope before invoking the napi_async_complete_callback, so you don't have to. (Though if you're not actually using the N-API async APIs, then that doesn't apply.)

mhdawson commented 7 years ago

@jasongin do you want to submit a PR to make that clarification or should I go ahead and do that ?

jasongin commented 7 years ago

I won't have time today, but if you submit a PR I can review it. Or I can get to it next week.

mika-fischer commented 7 years ago

@gpean I've put together a little library for async callbacks using uv_async_t. Maybe it's useful to you.

gpean commented 7 years ago

Thanks @mika-fisher. I have myself written a library, but its scope goes a bit beyond that, as it will generate all N-API code needed to wrap any native function, class or method (synchronously or asynchronously) using C++ templates. No need to know about N-API or libuv to write native modules anymore. The compiler does the job of figuring out the boilerplate code for you, given the types of your functions.

Very quick example:

#include <js>

std::string foo(int blah, std::string const& x) { return "hello"; }
Js::Value foo2() { return Js::makeObject("abc", 1234)("xxx", "cux"); } // Js::Value is he native-side mirror of napi_value (but it does not wrap napi_value)
void foo3(Js::Callback cb) { cb(Js::makeObject("abc", 1234)); } // callbacks can also be received and called manually via Js::Callback (from any thread)

JS_MODULE(([](Js::Exporter& exporter) {
    exporter.addAsyncFunction("foo", foo); // Then from JS: foo(123, "abc", (err, res) => { /* res contains "hello" */ })
    exporter.addFunction("foo2", foo2); // Then from JS: foo2() will return { abc: 1234, xxx: "cux" }
    exporter.addFunction("foo3", foo3); // Then from JS: foo3(console.log) will print { abc: 1234 }
}));

Note that the library schedules async work on its own thread pool. It supports objects/classes too, (no need to modify original class). It supports passing objects as arguments (you'd use std::shared_ptr<MyObject> as argument), handles refcounting of objects so that the GC will not destroy native objects if native side still uses it and vice versa. And more.

I intend to open source it when/if I get approval from Netflix. I think it could benefit the community a lot. It has cut the time it takes me to write native modules drastically.

jasongin commented 7 years ago

@gpean Very interesting! I'm looking forward to seeing the source. It sounds like at least some of those ideas might be generally useful enough to be included as part of the node-addon-api package.

gpean commented 7 years ago

Thanks, I'll do my best to open source it. I have a question related to using uv_async_t the same way @mika-fischer is doing in his library (and I am doing I think the same) - When calling back the JS side using napi_make_callback, if I resolve/reject a promise from this JS callback, sometimes it will stall/hang for a while- i.e. JS will do nothing, and resume after a random while. As if the event loop was not awaken following the resolve/reject. If I do setImmediate on the resolve or reject, then I don't see this problem anymore. Any idea what this could be/how to fix it properly? Thanks.

jasongin commented 7 years ago

Make sure you're calling napi_make_callback() from the main event loop, not from another thread / event loop.

If it's still a problem, please share sample code to reproduce it.

gpean commented 7 years ago

since I/we are using uv_async_t inited with uv_default_loop() to napi_make_callback, it should be the proper loop (/thread) right? also I would expect more dramatic failures if calling from foreign threads.

I'll work on a repro case.

gabrielschulhof commented 6 years ago

@mika-fischer @nodejs/addon-api It sounds like we may want to add an API like the following (first pass):

NAPI_EXTERN napi_status
napi_create_async_function(napi_env env,
                           napi_value js_function,
                           napi_async_function* result);

typedef napi_status
(*napi_async_function_arguments_to_js)(void* native_data,
                                       napi_value* recv_result,
                                       napi_value** argv_result);

typedef napi_status
(*napi_async_function_return_to_native)(napi_value js_exception,
                                        napi_value js_return,
                                        void** native_return_value);

NAPI_EXTERN napi_status
napi_call_async_function(napi_async_function func,
                         napi_async_function_arguments_to_js to_js_cb,
                         napi_async_function_return_to_native to_native_cb,
                         size_t argc,
                         void** native_return_value);

This would cover the use case of creating a uv_async_t on the main event loop for the purposes of calling back into JS.

Notes:

Re https://github.com/nodejs/abi-stable-node/issues/281 Re https://github.com/nodejs/abi-stable-node/issues/279#issuecomment-336061708

mika-fischer commented 6 years ago

Yes, I think something like this is needed. This goes into the right direction, but I have some comments:

gabrielschulhof commented 6 years ago
  • I'm not sure if this API causes issues with the management of the lifetimes of the arguments. Maybe it would be nicer to manage lifetimes of the arguments and return values if we just had a context void* that gets passed to napi_call_async_function and then forwarded to napi_async_function_arguments_to_js and also napi_async_function_return_to_native. I see there's native_data but I don't get where it comes from...

Adding a void* for context makes sense.

native_data is the data to be converted to napi_values for recv and argv of napi_call_function().

  • Why does the number of arguments have to be established a priori? Is that the purpose of argc? I don't see a good reason for this limitation. argv in napi_async_function_arguments_to_js should be set to an array anyway, right? So we might as well leave it up to napi_async_function_arguments_to_js how many arguments are put in the array, no?

The number of arguments does not have to be established a priori. I offered that as an expansion to this API. We can add that from the start.

  • I see no need for native_return_value, I think we can leave it completely up to napi_async_function_return_to_native to decide what to do with the result, if anything. However, it would be useful to have a context void* in napi_async_function_return_to_native as well.

I forgot to mention that NULL can be passed for native_return_value if you don't care. napi_async_function_return_to_native can indeed ignore the return value.

  • I assume this will also use an internal queue of the calls? E.g. in the case that napi_call_async_function is called multiple times before the event loop gets around to executing the callbacks.

uv_async_t documents that calls from the native thread to the loop thread are not one-to-one. The native thread may call multiple times, but only one call will happen on the loop thread. Since napi_async_function is implemented using uv_async_t it will follow these semantics.

  • What happens to the calls already queued, when napi_delete_async_function is invoked?

I don't know. Probably, whatever uv_async_t does. We'll have to make sure there are no leaks.

  • Does the existence of a napi_async_function keep the event loop alive? Can it be unrefed?

It does keep the event loop alive. Whether this should be unrefed or not is something we'll have to decide.

gabrielschulhof commented 6 years ago

... or something for which we'll have to add a bool parameter to napi_create_async_function().

gabrielschulhof commented 6 years ago

@mika-fischer you're right about the return value. It's not even one-to-one, so the return value can't really be passed back to the thread, but it can at least be processed on the loop thread. So, here's the next version:

typedef napi_status (*napi_async_function_args_to_js_cb)(napi_env env,
                                                         void* data,
                                                         napi_value* recv,
                                                         napi_value** argv,
                                                         size_t* argc,
                                                         void* context);

typedef napi_status (*napi_async_function_process_return_cb)(napi_env env,
                                                             napi_value js_e,
                                                             napi_value js_ret,
                                                             void** native_ret,
                                                             void* context);

NAPI_EXTERN napi_status
napi_create_async_function(napi_env env,
                           napi_value function,
                           napi_async_function* result);

NAPI_EXTERN napi_status napi_delete_async_function(napi_env env,
                                                   napi_async_function async);

NAPI_EXTERN napi_status
napi_call_async_function(napi_async_function async_cb,
                         napi_async_function_args_to_js_cb to_js_cb,
                         napi_async_function_process_return_cb return_cb,
                         void* data,
                         void* context);
mika-fischer commented 6 years ago

uv_async_t documents that calls from the native thread to the loop thread are not one-to-one. The native thread may call multiple times, but only one call will happen on the loop thread. Since napi_async_function is implemented using uv_async_t it will follow these semantics.

Well then I'm afraid this thing does not make too much sense in my opinion...

So either provide the expected behavior of a function with napi_async_function or provide something more low-level, like napi_async, and make people implement the function semantics they need on top of that. At least that would then be possible to do without resorting to libuv.

Another possibility would be to make napi_async_function callable at most once. But this would be a significant limitation and it would be inconvenient to implement a proper function on top of that...

What happens to the calls already queued, when napi_delete_async_function is invoked?

I don't know. Probably, whatever uv_async_t does. We'll have to make sure there are no leaks.

As mentioned above, uv_async_t is very low-level and does not provide function call semantics. If we want function call semantics for napi_async_function, we need to specify what we do. In my implementation the queued calls are processed and then the async function is destroyed, which is probably what users would expect.

It would also be useful if the function could be deleted from the foreign thread. Deletion would be deferred until after all calls of the function have been processed. At the minimum it must be possible to call napi_delete_async_function from within napi_async_function_process_return_cb.

Other comments:

sam-github commented 6 years ago

I don't understand why we need napi API wrappers around uv. napi wraps the JS engine, to allow portability between node versions, and across node variants built against alternate JS engines, but it doesn't wrap the networking, async, dns, file system, threadpool, etc., etc. APIs from uv, and its not clear why it should.

The higher level C++ API has more convenience functions built above the n-api C API, and includes async support already: https://github.com/nodejs/node-addon-api/blob/5543c01f2d7739beb2ac9ae817d1c55e0c170e61/napi.h#L1438

addaleax commented 6 years ago

Also, since it was brought up here earlier: https://github.com/node-ffi-napi/get-uv-event-loop-napi-h provides a way to get the current uv_loop_t* for a given napi_env.

mika-fischer commented 6 years ago

@sam-github AsyncWorker is not applicable because it schedules work on the Node/libuv thread pool. Sometimes you need to call back into JS from a completely different thread.

For instance imagine wrapping a C function that blocks for a really long time. Sure you can start sacrificing threads from the thread pool for this, but what if you need to do this multiple times? Better to have a separate thread do the blocking call. But then how to call back into JS?

Whether that's out-of-scope for napi is not for me to decide. I'm just responding to questions about what things addon writers need to do, which can't be done with napi, yet. And this is one of them.

@addaleax I fail to see the connection, can you elaborate?

addaleax commented 6 years ago

@mika-fischer There’s no immediate connection to what you said, but the lack of facilities to get the right uv_loop_t* instance was brought up earlier in this thread (by the OP @gpean).

I do agree with @sam-github however; N-API has nothing to do with threading, it just provides access to Node and JS land. The napi_delete_*_work functions are utilities that should ideally be implementable in userland, on top of the rest of N-API, if that is desired.

But then how to call back into JS?

Using an uv_async_t + napi_makecallback sounds about right – is there any inherent issue with that?

mika-fischer commented 6 years ago

@addaleax There are no issues, I've got this running exactly like you suggest.

It's just that in my understanding one of the intended goals (if not the goal) of n-api is to "allow modules compiled for one version to run on later versions of Node.js without recompilation". And this is not possible if module writers need to call into libuv directly for basic stuff. And yes, I consider signaling back into JS from a foreign thread (and without blocking a worker thread or even worse the event loop) to be essential functionality. But maybe my reasoning is incorrect at some step. In that case, please do clarify things for me.

As mentioned above, maybe a more lightweight alternative could be to create a napi_async which just wraps uv_async_t. That way the more elaborate things could be built on top of that in the higher level C++ API or in third party modules. But the goal stated above would be achievable with napi_async, whereas it's not with uv_async_t.

addaleax commented 6 years ago

@mika-fischer So, essentially the point here is: Is the libuv API/ABI going to be stable enough to support addons using them for basic async stuff forever, or can we make it that? Does that sound about right to you?

And yes, I consider signaling back into JS from a foreign thread (and without blocking a worker thread or even worse the event loop) to be essential functionality.

Yeah, I agree.

As mentioned above, maybe a more lightweight alternative could be to create a napi_async which just wraps uv_async_t.

I think it would be nice if we could go with essentially having a locked uv_async_t API/ABI… for the API that should be doable, for the ABI I do see changes between libuv 1.x and libuv master.

Maybe what we really want is a set of libuv functions that work around that?

uv_async_t* uv_async_alloc(uv_loop_t*, uv_async_cb);
void uv_async_free(uv_async_t*);
/* similar for other handle/request types, or a reasonable subset of them */
void* uv_handle_get_data(uv_handle*);
void uv_handle_set_data(uv_handle*, void*);
/* similar for other fields */

I’d love to hear thoughts from @nodejs/libuv (aka @bnoordhuis @cjihrig @saghul @santigimeno ?) on this.

sam-github commented 6 years ago

The uv API&ABI has been kept very stable, and 1.x has been waiting for a while, but I'm not sure if/when it scheduled for merging. @cjihrig @bnoordhuis @saghul do you have any idea?

this is not possible if module writers need to call into libuv directly for basic stuff

It has been possible for a long time, uv 1.x will be the first break, as far as I know.

Creating a compat layer that would give API&ABI independency between current libuv and libuv 1.x seems to me a justifiable reason to add n-api support for uv APIs, but it would have to cover more than just async.