nodejs / abi-stable-node

Repository used by the Node-API team to manage work related to Node-API and node-addon-api
239 stars 47 forks source link

Weak/strong reference improvements #93

Closed jasongin closed 7 years ago

jasongin commented 7 years ago

The NAPI APIs for referencing values currently consist of the following:

napi_status napi_create_weakref(napi_env e, napi_value v, napi_weakref* result);
napi_status napi_release_weakref(napi_env e, napi_weakref w, napi_value* result);
napi_status napi_get_weakref_value(api_env e, napi_weakref w);

napi_status napi_create_persistent(napi_env e, napi_value v, napi_persistent* result);
napi_status napi_release_persistent(napi_env e, napi_persistent p);
napi_status napi_get_persistent_value(napi_env e, napi_persistent p, napi_value* result);

These APIs are missing two important capabilities:

  1. The ability to promote a weak reference to a strong reference, or demote a strong reference to a weak reference. (Currently this is possible in a roundabout way by retrieving the value and creating the other kind of reference, but it's very inconvenient.)
  2. The ability to attach a finalization callback to any (weak) reference, so that a custom cleanup procedure can be performed if/when the value gets GC'd. (Currently NAPI supports a finalizer callback only for "wrapped" native objects.)

I'm proposing a new unified API for NAPI value references, something like this:

// Set initial_refcount to 0 for a weak reference, >0 for a strong reference.
// The finalizer callback is optional.
napi_status napi_create_reference(napi_env e, napi_value value, int initial_refcount, napi_finalize cb, napi_ref* result);

// Adds a reference, optionally returning the resulting refcount. After this call the reference
// will be a strong reference because its refcount is >0, and the object is effectively "pinned".
// Calling this when the refcount is 0 and the object is unavailable results in an error.
napi_status napi_addref(napi_env e, napi_ref ref, int* result);

// Removes a reference, optionally returning the resulting refcount. If the result refcount is 0
// the reference is now weak and the object may be GC'd at any time if there are no other references.
// Calling this when the refcount is already 0 results in an error.
napi_status napi_release(napi_env e, napi_ref ref, int* result);

// Attempts to get a referenced value. If the reference is weak, the value might no longer
// be available, in that case the call is still successful but the result is NULL.
napi_status napi_get_reference(napi_env e, napi_ref ref, napi_value* result);

In V8 this proposed API should be straightforward to implement using a v8::Persistent and its SetWeak() and ClearWeak() methods. JSRT does not have exactly the same thing, but the equivalent can be built using a combination of JsCreateWeakReference() and JsAddRef()/JsRelease().

We could also consider an API that doesn't use a ref count, but just enables switching references between weak and strong. Then the C++ wrapper could manage a refcount on top of that. (Like Node's and Nan's ObjectWrap classes currently do.) But I think I favor having a refcount at the lower level.

Also, since this API provides a more general way to register a finalizer callback, we could remove the finalizer parameter from napi_wrap(). (Since you could achieve the same thing by creating a weak reference+finalizer on the wrapper object.) Or we might leave it there for convenience. Or we could replace that API with a simpler napi_set_external_data() (as previously suggested in comments in node_jsvmapi.h) and keep the "wrapping" logic at a higher layer.

Another common scenario when a finalizer callback might be used is for array-buffers of "external" data. I did not include a finalizer callback parameter in the napi_create_typedarray() API in my typed-arrays PRs because (in V8 at least) it's not possible to attach a finalizer callback directly to an object, only to a weak reference to it.

mhdawson commented 7 years ago

Given the assertion that the newly proposed API can be implemented in both V8 and Chakra and its needed for an existing module it looks good to me.

I think we should leave the finalizer in napi_wrap as I think you'll almost always want to do some cleanup when the object associated with the wrap goes out of scope.

mhdawson commented 7 years ago

My only worry is unrelated to the change itself. Its that we are still discovering new function that we have not covered yet. Might mean we need to port more modules before we believe the API is close to complete.

jasongin commented 7 years ago

I think we should leave the finalizer in napi_wrap as I think you'll almost always want to do some cleanup when the object associated with the wrap goes out of scope.

Actually I think now I'm favoring the 3rd alternative I mentioned above:

Or we could replace napi_wrap with a simpler napi_set_external_data() (as previously suggested in comments in node_jsvmapi.h) and keep the "wrapping" logic at a higher layer.

@mhdawson What do you think about that design?

mhdawson commented 7 years ago

I'll still thinking we want the equivalent of the wrap functionality. It should be possible to port a module using the C API so we still want to make that reasonable easy as well. The existing wrap also uses the internal Node.js implementation for wrap and I think we should avoid having 2 implementations (which of course could end up differing) of that.

jasongin commented 7 years ago

NAPI's object wrappers must have ref/unref functionality, which NAPI doesn't currently expose. I don't think it makes sense to create napi_wrap_ref() and napi_wrap_unref() APIs when really they are duplicative of the more general reference APIs. An "ObjectWrap" is basically just a ref-counted object with an external data pointer.

jasongin commented 7 years ago

I realized JSRT doesn't support weak reference callbacks. So that has to be removed from the proposed API above:

napi_status napi_create_reference(napi_env e,
                                  napi_value value,
                                  int initial_refcount,
                                  napi_ref* result);

But that's OK, because to complement the reference APIs, I'd suggest a more general concept of an "external" value with optional finalizer callback:

napi_status napi_create_external(napi_env e,
                                 void* data,
                                 napi_finalize finalize_cb,
                                 napi_value* result);
napi_status napi_is_external(napi_env e, napi_value v, bool* result);
napi_status napi_get_value_external(napi_env e, napi_value v, void** result);

Note a NAPI external value is not necessarily an object, so it cannot hold any additional properties. It's purpose is merely to be a wrapper around a void*.

Both V8 and ChakraCore support creating external values. In V8, the external pointer value is immutable; ChakraCore allows changing the value, but I don't think we need that. V8 reference: v8::External::Value() ChakraCore references: JsCreateExternalObject(), JsGetExternalData(), JsSetExternalData()

ChakraCore supports finalizer callbacks on external values. While V8 doesn't directly support it, that can still be easily achieved within NAPI by automatically creating a weak reference with callback if a finalizer callback is requested for an external.

Then, it is simple to build "ObjectWrap" functionality on top of these proposed reference and external APIs:

  1. To "wrap" an instance, create an external whose data points to the instance, and whose finalizer callback calls the instance destructor. Then create a reference to the external.
  2. Adjust the refcount on the reference as needed.
  3. To "unwrap" an instance, retrieve the external from the reference, then get the external's value.

Steps 1 and 3 would actually require two NAPI calls each using the APIs proposed above. To optimize for the object-wrap scenario, we additionally could keep napi_unwrap and napi_unwrap, or perhaps name them something like napi_create_external_reference and napi_get_reference_external to make it clear they are combinations of the other APIs.

mhdawson commented 7 years ago

As mentioned in the PR, I like that they are still called wrap/unwrap and it should be clear how to ref/unref due to the types of the return values.

pmuellr commented 7 years ago

Making weak vs strong a mutable aspect seems a bit confusing - I actually like the fixed aspect of the existing APIs - simpler for me to think about. In addition, the add/release style API seems a little out-of-place in terms of toggling the state. I can't quite think of a use case where that would make sense to me, but I'm sure I'm missing something.

I can easily believe that we might learn that we there are other patterns that very useful, over time, but it seems nicer to keep the API simple for now, and build other patterns on top of those. How complicated would a wrapper be for the new suggested APIs implemented using the existing napi ones?

jasongin commented 7 years ago

the add/release style API seems a little out-of-place in terms of toggling the state. I can't quite think of a use case where that would make sense to me, but I'm sure I'm missing something.

The node::ObjectWrap class is basically just a ref-counting wrapper around a v8::Persistent. Its Ref() and Unref() methods will make a strong reference (v8::Persistent::ClearWeak()) when the refcount is >0 and a weak reference (v8::Persistent::SetWeak()) when the refcount is 0. So, none of this is introducing any new concepts for native module developers.

I did suggest above that maybe the ref-counting part could be kept out of the low-level API. However, given that one of the most common uses of references (object wrapping) does require actual ref-counting, I think it probably does make sense to have it in this API.

Also, given that object-wrapping requires references that can be either weak or strong, if not for this convergence then NAPI would end up having three reference types: weak, strong, and mutable, all with similar APIs. I don't think that would be preferable over the single (mutable) reference type that is proposed here.

jasongin commented 7 years ago

As for the use case for refcounting, it is often encountered with native async callbacks. When native code goes off to do some async work, it may need to increment the refcount of the JS object that the work is being done on, to ensure the object is still alive when the async work is complete. If there are multiple async callbacks on the same object, the refcounting ensures the object is released only after all the work is complete. Alternatively, every async task could grab a separate persistent reference, but I'd imagine that would be less efficient.

ianwjhalliday commented 7 years ago

I don't know if the current APIs will port easily or at all to SpiderMonkey or duktape, but we were keeping them in mind when designing APIs, looking for the lowest common denominator possible. This would be a good time to consider impact on implementation with other VMs of the reference API model, now that you're looking at possibly changing the API.

jasongin commented 7 years ago

I understand, but this redesign isn't adding any new requirements on the VM. It is more of a refactoring of how the existing capabilities are exposed, to consolidate the common functionality into a single reference type. Object wrapping scenarios already required the VM to support weak and strong references. "Mutable" references aren't strictly required, since if necessary they could be simulated by changing the type of reference underneath.

jasongin commented 7 years ago

Closing since this work is done in all branches.

ceztko commented 1 year ago

Sorry for resuming this old discussion. Is it by any chance possible to attach a custom weak callback to a napi_value in Node? I am trying to simulate a full finalizer method that can access the object being GC'd but so far I just ended in napi_finalize callbacks where the object that I was trying to access was already inaccessible. I understand this is necessary in JS side to avoid messing with lifecycle of objects during collection, but using the native API I hope to being able to handle some more advanced scenarios. I found this reference that say that in V8 is possible to resurrect objects being GC'd but I haven't found actual code doing so nor I know if this would be doable with stable ABI NAPI.

legendecas commented 1 year ago

I don't think the link you referenced is suggesting that you can resurrect objects being GC'ed. V8's API contract requires a weak callback to reset the persistent handle immediately (https://source.chromium.org/chromium/chromium/src/+/main:v8/include/v8-persistent-handle.h;l=150?q=SetWeak) and the object referenced by the handle is already inaccessible.

ceztko commented 1 year ago

@legendecas ok, let's clarify what I am trying to do: I'm in a transpilation scenario where I want to transpile C# into JavaScript. C# has fully fledged finalizers that can access the instance of the object being finalized. It can also resurrect objects that are being finalized, but that's something I an not interested in supporting. The question here is if I can somehow run a JS defined "finalizer" method on the instance of the object being selected for collection, using the NodeJS native public/internal API, without recurring to a separate object state and allowing me to seamlessly transpile C# code, or if I'm completely out of luck.

legendecas commented 1 year ago

If I understand the problem correctly, I'm afraid that the semantics difference is hard to be aligned as the C# finalizer allows reference to the object being finalized, whilst neither ECMAScript FinalizationRegistry nor V8's embedded API allows accessing the finalized object in the finalizer.

Node-API requires the feature to be widely supported by various JS engines, so I don't think such behavior would be supported by Node-API too.

ceztko commented 1 year ago

@legendecas: I understand. While I have not tried it I believe the V8 API leaves room to register a callback to a v8::global<v8::value> that fetches a v8::local and then register a second pass callback that will run JS code on that local before structures are actually freed, or at least I could not find anything to prevent that by just reading the code, but I may be wrong. Does Node really works with anything other than V8 or you are assuming it may happen in the future? I also corrected the reference to the guy that states that object resurrection is possible in V8. Of course if it's not substantiated by code it's not a solid one.

ceztko commented 1 year ago

Answering myself: resurrecting finalizers were deprecated and have been already removed. I'm definitely out of luck.