dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.21k stars 1.57k forks source link

[vm/ffi] `TypedData`, `Pointer`, multiple isolates, and releasing native resources #54885

Open dcharkes opened 8 months ago

dcharkes commented 8 months ago

Pointers can be converted into external TypedData with the asTypedList.

When sending messages to other isolates:

This means that if a user calls asTypedData first, and then sends to another isolate, all native resources are cleaned up correctly both in the case the message is delivered as well as when the message is not delivered. We're assuming Dart takes ownership of the native memory and attaches a finalizer to free the buffer once it's done using it by calling asTypedList with a finalizer immediately on receiving the pointer from native code.

If the user sends the Pointer to another isolate and converts it to a typed-data on the other end, attaching a finalizer there, the native resource does not get finalized if the message never arrives. But it is faster, because no copies of the backing buffer are made. (The user could start sending messages back and forth between both isolates to acknowledge ownership is transferred from one isolate to the other, but that leads to the byzantine generals problem.)

A third strategy is to use a reference counter in native code and let every isolate call the finalizer. However, this is racy, because the source isolate could GC it's reference to the native resource before a reference on the receiving isolate is created increasing the reference count. Then you'd want to increase the reference account before sending, but that would lead to leaked resources if the message never arrives.

I think we have multiple things we should address here.

  1. We should document the difference between sending Pointers to other isolates and sending TypedDatas created from pointers to other isolates.
  2. We could consider adding NativeFinalizers to sending messages that get run if the message is not delivered. That would enable users to implement the refcounter properly.
  3. We could consider introducing something like a IsolateGroupFinalizable which allows one to attach a NativeFinalizer which will run once every copy of the IsolateGroupFinalizable object has been GCed. This would avoid the need for users to implement their own refcounter. (I feel like we discussed this before, but I can't find the relevant GitHub issue.)

cc @mkustermann @mraleph

I think this is a rather common question that has to be dealt with as soon as somewhat longer FFIcalls are involved that users would like to run on a helper isolate. @craiglabenz

dcharkes commented 8 months ago

Notes from discussion with @mkustermann:

@mraleph This sounds related to Sharables. But likely not all Sharables need NativeFinalizers attached. So, I don't think we should conflate the two concepts. Hence I will go with a different name for now.

dcharkes commented 8 months ago

Some more observations together with @mkustermann:

dcharkes commented 8 months ago

Looking at what the VM currently marks as Immutable, some of the types are not marked sealed.

abstract interface class RegExp implements Pattern {

abstract interface class SendPort implements Capability {

abstract interface class Capability {

This means that objects of these types are immutable, but any subtypes are not. Consequently, this means transitive immutability cannot be inferred by looking at field-types statically.

@lrhn Do we have some documentation somewhere about RegExp~, SendPort, and Capability~ of why we allow implementing them? ~Should we consider marking them final so that only the dart: libraries can produce instances of these types?~ Edit: found some documentation https://dart-review.googlesource.com/c/sdk/+/288201, SendPort is implemented in some places.

lrhn commented 8 months ago

It's probably that SendPort is implemented in precisely one place, package:isolate, written by me, and which is no longer officially maintained.

If we really want to make SendPort a final class, it's not impossible that we could.

RegExp is likely the same story, there might be implementations. Not absolutely sure, but I know I have written something like ConstRegExp as an example, and someone might have chosen to use it.

I don't think a "deeply immutable" marker interface, which subclasses must then satisfy, is a good idea in general. Having it for specific native-twisted types is probably fine, and having it on final classes doesn't affect anybody else.

Being deeply immutable is an implementation detail, not a property of a type. It's very likely that RegExp is not actually immutable, it can easily have done caching. I know the web implementation does.

dcharkes commented 8 months ago

Thanks @lrhn !

It's probably that SendPort is implemented in precisely one place, package:isolate, written by me, and which is no longer officially maintained.

If we really want to make SendPort a final class, it's not impossible that we could.

RegExp is likely the same story, there might be implementations. Not absolutely sure, but I know I have written something like ConstRegExp as an example, and someone might have chosen to use it.

Okay, that's good to know for the future if we come up with use cases.

For now I will just disallow fields typed Capability, SendPort, and RegExp in deeply immutable objects.

I don't think a "deeply immutable" marker interface, which subclasses must then satisfy, is a good idea in general. Having it for specific native-twisted types is probably fine, and having it on final classes doesn't affect anybody else.

Being deeply immutable is an implementation detail, not a property of a type. It's very likely that RegExp is not actually immutable, it can easily have done caching. I know the web implementation does.

I have gone for a pragma so far. And indeed it would be quite ugly to have to add it to double/int/bool/null etc. So let's stick to a pragma.

I presume we say that behavior only observable through Finalizers or NativeFinalizers is considered an "implementation detail" due to the fact that GC is considered an implementation detail from a Dart-language-POV. (Note that such implementation details are very important when it comes to interop with native code and releasing native resources. Which is an interop-POV and the reason for this GitHub issue.)