dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.23k stars 1.57k forks source link

[vm/ffi] `Pointer.asArray` or `Array.fromPointer` #54526

Open dcharkes opened 10 months ago

dcharkes commented 10 months ago

In some cases it might be useful to create an Array out of a Pointer.

We could add extension methods to Pointer:

extension StructPointer<T extends Struct> on Pointer<T> {
  external Array<T> asArray<T>(int dimension1,
      [int dimension2,
      int dimension3,
      int dimension4,
      int dimension5]);

  external Array<T> asArrayMulti<T>(List<int> dimensions);
}

// Ditto other extensions.

This would be kind of similar to the asTypedList methods, but would work on anything that is sized, including structs, unions, nested arrays.

Alternatively, we could add a factories to Array. That would make it more similar to constructors that take a typed data:

(Note that if the above issue is addressed, a workaround for struct pointers could be Array<...>.fromTypedData(pointer.cast<Uint8>().asTypedList(sizeOf<MyStruct>() * length), dimensions...)

Workaround for compile-time length arrays

Define a wrapper struct, cast the pointer to that, and read the array out:

main() {
  final Pointer<MyStruct> p = ...; // which has 10 elements at compile-time
  final Array<MyStruct> a = p.cast<HelperStruct>().ref.inlineArray;
}

class HelperStruct extends Struct {
  @Array(10)
  external Array<Struct> inlineArray;
}
mkustermann commented 10 months ago

Any pointer can be seen as an array (array is just a pointer and a known size). Why is this not

extension ArrayPointer<T extends NativeType> on Pointer<T> {

  external Array<T> asArray(int dim);

  // Requires H to have `dims.length` nested `Array`s with innermost argument being `T`
  external H asMultiArray<H extends Array>(List<int> dims);
}

to allow

Pointer<Float32> floats;

Array<Float32> vector = floats.asArray(4);
Array<Array<Float32>> matrix = floats.asMultiArray<Array<Array<Float32>>>([4, 4]);

The n-dimensional part is a little weird, because one has to explicitly write the array types as the static type system doesn't look at the List<int> and make the return type have automatically that many nested Arrays.

One could also consider making this something like this instead:

extension ArrayPointer<T extends NativeType> on Pointer<T> {
  external Array<T> asArray(int n);
}
extension ArrayOfArray<T extends NativeType> on Array<T> {
  external Array<Array<T>> split(int n);
}

to allow

Pointer<Float32> floats;

Array<Float32> vector = floats.asArray(4);
Array<Array<Float32>> matrix = floats.asArray(16).split(4); // 4x4 matrix
Array<Array<Array<Float32>>> matrix = floats.asArray(16).split(4).split(2); // 4x2x2
dcharkes commented 10 months ago

Any pointer can be seen as an array (array is just a pointer and a known size). Why is this not

extension ArrayPointer<T extends NativeType> on Pointer<T> {

  external Array<T> asArray(int dim);

  // Requires H to have `dims.length` nested `Array`s with innermost argument being `T`
  external H asMultiArray<H extends Array>(List<int> dims);
}

to allow

Pointer<Float32> floats;

Array<Float32> vector = floats.asArray(4);
Array<Array<Float32>> matrix = floats.asMultiArray<Array<Array<Float32>>>([4, 4]);

Ah right, yes I mixed up the types for multi-dimensional arrays.

I think we want to add extension methods in the extensions that are for fixed-size types. To avoid having to write special static checks for Pointer<Void>.asArray() and Pointer<T extends Opaque>.asArray().

The n-dimensional part is a little weird, because one has to explicitly write the array types as the static type system doesn't look at the List<int> and make the return type have automatically that many nested Arrays.

One could also consider making this something like this instead:

extension ArrayPointer<T extends NativeType> on Pointer<T> {
  external Array<T> asArray(int n);
}
extension ArrayOfArray<T extends NativeType> on Array<T> {
  external Array<Array<T>> split(int n);
}

to allow

Pointer<Float32> floats;

Array<Float32> vector = floats.asArray(4);
Array<Array<Float32>> matrix = floats.asArray(16).split(4); // 4x4 matrix
Array<Array<Array<Float32>>> matrix = floats.asArray(16).split(4).split(2); // 4x2x2

That could work, but does split split the innermost or outermost dimension? 😅

I think it would make slightly more sense to split the outer dimension, because the outermost Array type represents the outermost dimension.

Pointer<Float32> floats;

Array<Float32> vector = floats.asArray(4);
Array<Array<Float32>> matrix = floats.asArray(16).split(4); // 4x4 matrix
Array<Array<Array<Float32>>> matrix2 = floats.asArray(16).split(4).split(2); // 2x2x4
final innerMatrix = matrix[0].split(2); // 2x2 this would split the innermost dimension (after dereferencing the outer dimension)

But then doing splits in reverse order because you're always acting on the outermost dimension feels somewhat weird.

Maybe the documentation should just mention "innermost" in bold, idk what the right solution is here. @lrhn to the rescue!

It would definitely help not having to retype type arguments.

mkustermann commented 10 months ago

I think we want to add extension methods in the extensions that are for fixed-size types. To avoid having to write special static checks for Pointer.asArray() and Pointer.asArray().

Good point. Would an intermediary super type (that Opaque and Void don't inherit from), e.g. _AllocatableNativeType avoid repeating those for all the other subclasses?

dcharkes commented 10 months ago

I think we want to add extension methods in the extensions that are for fixed-size types. To avoid having to write special static checks for Pointer.asArray() and Pointer.asArray().

Good point. Would an intermediary super type (that Opaque and Void don't inherit from), e.g. _AllocatableNativeType avoid repeating those for all the other subclasses?

Maybe a marker type instead of a super type, that could also serve as documentation.

lrhn commented 9 months ago

A type that Void and Opaque doesn't implement also sounds like the argument type of sizeof, so maybe it should be public. ConcreteNative? NativeValue? (Don't know what Opaque means, so can't really say what it's not.) FixedSizedNative.

As for split, I'd split the outer array. That's the one that the extension is matching on, so it's the only one mentioned in the signature.

I know Array<Array<Int32>> split on the inner and outer arrays will have the same Dart type, but it's more consistent to treat the inner tie and value generically (treat it the same, no matter what it is), and only work on the outer value. That avoids needing to destructure the type parameter type, to see if it's an Array, which isn't something you can do on current Dart.

dcharkes commented 9 months ago

A type that Void and Opaque doesn't implement also sounds like the argument type of sizeof, so maybe it should be public. ConcreteNative? NativeValue? (Don't know what Opaque means, so can't really say what it's not.) FixedSizedNative.

Yes, exactly, it should be public. And yes, we should use it for sizeOf.

I was thinking of SizedNative. It's not really fixed size, because it's a known size per ABI. It always has a size, so "sized".

lrhn commented 9 months ago

SizedNative is positive, precise, works. Also allows restricting to the type.

The only thing that worries me is that almost every type is Sized, so we might want to have a name for the concept of not being sized. (But it's still better to have a SizedNative marker interface, so you can do sizeOf<T extends SizedNative>().

dcharkes commented 9 months ago

UnsizedNative, but we wouldn't use it for anything.

dcharkes commented 9 months ago

(Maybe it need to be SizeNativeType instead. A native refers to @Natives already.)

This means we will need to a breaking change for Allocator.allocate, it's bound needs to be updated from T extends NativeType to T extends SizedNativeType.

We'd first need to land SizedNativeType, and later update the bound on Allocator.allocate. Edit: We can't seem to tighten the type argument bound, overrides are invalid. Neither can we update call sites first, because then the call sites would be more strict than the super type.

third_party/pkg/ffi/lib/src/arena.dart:50:14: Error: Declared bound 'NativeType' of type variable 'T' of 'Arena.allocate' doesn't match the bound 'SizedNative' on overridden method 'Allocator.allocate'.
 - 'NativeType' is from 'dart:ffi'.
 - 'SizedNative' is from 'dart:ffi'.
  Pointer<T> allocate<T extends NativeType>(int byteCount, {int? alignment}) {
             ^
org-dartlang-sdk:///sdk/lib/ffi/allocation.dart:86:14: Context: This is the overridden method ('allocate').
  Pointer<T> allocate<T extends SizedNativeType>(int byteCount, {int? alignment});
             ^
third_party/pkg/ffi/lib/src/arena.dart:50:14: Error: Declared bound 'SizedNativeType' of type variable 'T' of 'Arena.allocate' doesn't match the bound 'NativeType' on overridden method 'Allocator.allocate'.
 - 'SizedNativeType' is from 'dart:ffi'.
 - 'NativeType' is from 'dart:ffi'.
  Pointer<T> allocate<T extends SizedNativeType>(int byteCount,
             ^
sdk/lib/ffi/allocation.dart:89:14: Context: This is the overridden method ('allocate').
  Pointer<T> allocate<T extends NativeType>(int byteCount, {int? alignment});
             ^

I don't believe we have a way to indicate that we want to tighten a type argument bound on a method with a @Deprecated message. @Deprecated only really enables us to introduce a new method with a different name.

dcharkes commented 9 months ago

Quick mock up of changes for SizedNativeType https://dart-review.googlesource.com/c/sdk/+/345221

@lrhn any suggestions for doing the breaking change to Allocator.allocate?

I'm assuming we have users implementing the Allocator interface out there. It would be quite ugly to introduce an allocate2 with the new bound and deprecate the old allocate, then wait a Dart stable release, and have packages do an upper bound Dart SDK constraint, then introduce allocate with the new type bound and deprecate the allocate2 and then wait another stable release and then delete allocate2. Hopefully users of Allocator implementations just use the .call method. Another cheat way to migrate is to rewrite the type argument bound in the CFE. We already give an error anyway on calling it with the wrong NativeType.

mkustermann commented 9 months ago

This means we will need to a breaking change for Allocator.allocate, it's bound needs to be updated from T extends NativeType to T extends SizedNativeType.

Why? One cannot allocate<Opaque>() today either, can we? We'd replace some special checks in FFI type checker with a static type check by the CFE/language.

dcharkes commented 9 months ago

This means we will need to a breaking change for Allocator.allocate, it's bound needs to be updated from T extends NativeType to T extends SizedNativeType.

Why? One cannot allocate<Opaque>() today either, can we? We'd replace some special checks in FFI type checker with a static type check by the CFE/language.

The implementors of Allocator need to update the bound on allocate. (Including our own package:ffi.)

mkustermann commented 9 months ago

The implementors of Allocator need to update the bound on allocate. (Including our own package:ffi.)

Strictly speaking, mainly the higher level abstractions that work on the types need to be updated, i.e. extension AllocatorAlloc on Allocator to make allocate<Foo>() work and disallow allocate<Utf8>().

The Allocator.allocate methods are allocating bytes, one could still allow allocator.allocate<Utf8>(10).

lrhn commented 9 months ago

Ack. Type parameter bounds are invariant, so any change to only one of an override-pair will be breaking. You'll need to update all of them at the same time.

Allocator is in dart:ffi, right? If it had been in package:ffi, you could technically make a breaking change update, major version increment, where the bound becomes SizedNativeType, and then everybody will have to change their code if they want to use the new version.

Keeping NativeType as being the sized type, and introducing an unsized supertype, say AbstractNativeType, is bad naming, and requires you to change every other places where you currently use NativeType and want to accept Void, to AbstractNativeType. That'll probably be at least as breaking (in parameter position, using NativeType where the superclass uses the supertype AbstractNativeType is an error, so that code would have to change instead).

You could introduce a typedef AllocatableNativeType = NativeType; now, force everybody to change their Allocator.allocate to Allocator.allocate<T extends AllocatableNativeType> (somehow), and then, as a schduled and duely notified breaking change, change AllocatableNativeType to SizedNativeType,

It's a breaking change to split a type into two, and start restricting to only one of them in places where you previously accepted all. The best approach is to do it as a proper breaking change. Not necessarily the easiest, but it'll likely have the best final outcome, where everything is again coherent.

EDIT: Or if @mkustermann is right, don't necessarily change Allocator, if it doesn't rely on sizeOf.