dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License

[vm/ffi] Generic accessors for custom container types #52481

Open ds84182 opened 1 year ago

ds84182 commented 1 year ago

After spending hours trying and failing to implement an array type across thousands of auto-generated structs, I give up! Instead, I'm writing yet another feature request (sorry).

// Some marker classes that provide capabilities.

// Marker class for an instance that contains a pointer.
abstract interface class AsPointer<T extends NativeType> {}

// Marker annotation declaring the getter or field for the pointer.
// Can be private, preventing library users from getting access to the underlying pointer.
// Classes that implement AsPointer<T> must have a SINGLE getter annotated with this.
// The getter must return a Pointer<T> or a (Pointer<T>, int).
// The latter denotes the starting element offset and is equivalent to `ptr.elementAt(record.$2).<accessor>`
const innerPointer = /* ... */;

// Marker annotation used on a `(int index) -> void` method to verify the given index before
// accessing the pointer. Can be used for bounds checking.
const verifyIndex = /* ... */;

// Marker interfaces that denote the allowed capabilities.
abstract interface class ValuePointer<T extends NativeType> implements AsPointer<T> {}
abstract interface class ValueSetPointer<T extends NativeType> implements AsPointer<T> {}
abstract interface class IndexPointer<T extends NativeType> implements AsPointer<T> {}
abstract interface class IndexSetPointer<T extends NativeType> implements AsPointer<T> {}

// Getters and setters for the exposed capabilities on each NativeType.
// Internally desugars to an accessor on `instance.<innerPointer getter>`.
// Cannot be used dynamically, but the innerPointer getter can be dynamic.
extension Int8ValuePointer on ValuePointer<Int8> {
  external int get value;
}

extension Int8ValueSetPointer on ValueSetPointer<Int8> {
  external void set value(int value);
}

extension Int8IndexPointer on IndexPointer<Int8> {
  external int operator[](int index);
}

extension Int8IndexSetPointer on IndexSetPointer<Int8> {
  external void operator []=(int index, int value);
}

// ...

This is designed to avoid exposing the underlying pointer, which may be dangerous to do. It also separates indexing and mutability.

As an example, here's a vector type.

final class Vec<T extends NativeType> implements IndexPointer<T>, IndexSetPointer<T> {
  @innerPointer
  Pointer<T> _ptr = nullptr;
  int _length = 0;
  int _capacity = 0;

  Vec([int capacity = 0]) {
    if (capacity > 0) _reallocate(capacity);
  }

  int get length => _length;

  @verifyIndex
  void _checkIndex(int index) => RangeError.checkIndex(index, length);

  // Still no way to implement pushBack. But we can do something weirder
  Emplacer<T> get push => /* returns a write-only pointer that pushes to the back of the vector */;

  void _reallocate(int capacity) { /* unsolved problem: how do we get the size and alignment of T? */ }

  // operator[] and operator[]= are implemented via dart:ffi thanks to marker interfaces
}

final class Emplacer<T extends NativeType> implements ValueSetPointer<T> {
  final Vec<T> _vec;
  @innerPointer
  (Pointer<T>, int) get _ptr => /* return index of next value in vector storage, reallocating if needed */;

  // Value setter implemented via dart:ffi
}

final myNumbers = Vec<Int32>();
myNumbers.push
  ..value = 0
  ..value = 1
  ..value = 2
  ..value = 3;
expect(myNumbers.length, 4);
expect(myNumbers[1], 1);

It's certainly better than nothing. Because right now, pain.

mkustermann commented 1 year ago

Dart's FFI has been intentionally designed in a way that allows us to optimize the code to make it as efficient as in C (haven't done all optimizations yet, but we can do them).

More specifically, it means code has to specify the concrete (non-generic) type it's operating on: for reading memory, writing memory, calling sizeOf<...>() - so those can be directly translated into memory loads/stores (or constants in case of sizeOf<...>()).
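As a minimal sketch of what "concrete type" means here (assuming `package:ffi` is available for its `calloc` allocator; the values are made up for illustration), every access below names `Int32` directly, so the element size and the loads/stores are known statically:

```dart
import 'dart:ffi';

import 'package:ffi/ffi.dart';

void main() {
  // Int32 is a concrete native type, so the allocation size is known
  // without any runtime type lookup.
  final p = calloc<Int32>(4);

  // Typed loads and stores through the Int32 pointer extension.
  p[0] = 7;
  p[3] = p[0] + 1;

  print(sizeOf<Int32>()); // 4
  print(p[3]); // 8

  calloc.free(p);
}

// By contrast, a generic helper such as the hypothetical
//   int sizeOfGeneric<T extends NativeType>() => sizeOf<T>();
// is expected to be rejected, because T is not a compile-time constant type.
```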

What you're trying to do, making a generic growable array/vector implementation over arbitrary native types, would not be possible in C. (It would be possible in C++ when utilizing templates - which a C++ compiler will just code-duplicate for the template parameters. Though Dart generics work very differently from C++ templates - they don't use code duplication, so they cannot be used.)

So with the current FFI, we can think about how we would implement a generic container in C and then translate that to Dart. The C code may look like this:

// Define `vector_t` as our vector type

vector_t* vector_new(intptr_t element_size);
void vector_push(vector_t* v, void* value);
void* vector_get(vector_t* v, intptr_t index);
void vector_set(vector_t* v, intptr_t index, void* value);
void vector_free(vector_t* v);

// Use it like this

typedef struct { ... } MyStruct;
vector_t* v = vector_new(sizeof(MyStruct));
MyStruct value;
vector_push(v, &value);
MyStruct* value2 = (MyStruct*)vector_get(v, 0);
...
vector_free(v);

One could translate this into Dart:

class Vector<T extends NativeType> {
  final int elementSize;

  // Dynamically allocated & grown.
  Pointer<Void> memory;
  int length;
  int allocatedSize;

  Vector(this.elementSize) { ... }

  Pointer<T> operator [](int index) => Pointer<T>.fromAddress(memory.address + elementSize * index);

  // `memcpy` is assumed to be a binding to the C function of the same name.
  void operator []=(int index, Pointer<T> value) {
    memcpy(memory.address + elementSize * index, value.address, elementSize);
  }
}

main() {
  final vector = Vector<MyStruct>(sizeOf<MyStruct>());
  vector.push(...);
  vector[0];
}

The downside is that it takes and returns pointers to elements rather than having value/copy semantics. That makes insertions in particular less ideal, as one would need to have, e.g., a global Pointer<MyStruct> that one can initialize and then give to Vector:


final box = allocate<MyStruct>(sizeOf<MyStruct>());

main() {
  final vector = Vector<MyStruct>(sizeOf<MyStruct>());
  initialize(vector);
  read(vector);
}

void initialize(Vector<MyStruct> vector) {
  box.ref
    ..fieldA = ...
    ..fieldB = ...;
  vector.push(box);
}
void read(Vector<MyStruct> vector) {
  MyStruct struct = vector[0].ref;
  ...
}

To avoid this for built-in types / primitives one could make specialized versions e.g. VectorInt8 instead of Vector<Int8>.
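A specialized version could look roughly like the sketch below. This is only an illustration under assumptions: `VectorInt8` is a hypothetical class (not part of dart:ffi), the growth policy is arbitrary, and `malloc` comes from `package:ffi`. Because the element type is fixed to `Int8`, the class can offer value semantics without exposing the pointer.

```dart
import 'dart:ffi';

import 'package:ffi/ffi.dart';

/// Hypothetical growable vector specialized for Int8 elements.
final class VectorInt8 {
  Pointer<Int8> _memory = nullptr;
  int _length = 0;
  int _capacity = 0;

  int get length => _length;

  /// Reads the element at [index] by value, with bounds checking.
  int operator [](int index) {
    RangeError.checkIndex(index, _length);
    return _memory[index];
  }

  /// Writes [value] at [index], with bounds checking.
  void operator []=(int index, int value) {
    RangeError.checkIndex(index, _length);
    _memory[index] = value;
  }

  /// Appends [value], growing the native buffer when it is full.
  void push(int value) {
    if (_length == _capacity) _grow();
    _memory[_length++] = value;
  }

  // Doubles the capacity (starting at 4) and copies existing elements over.
  void _grow() {
    final newCapacity = _capacity == 0 ? 4 : _capacity * 2;
    final newMemory = malloc<Int8>(newCapacity);
    for (var i = 0; i < _length; i++) {
      newMemory[i] = _memory[i];
    }
    if (_memory != nullptr) malloc.free(_memory);
    _memory = newMemory;
    _capacity = newCapacity;
  }

  /// Releases the native buffer; the vector is empty afterwards.
  void free() {
    if (_memory != nullptr) malloc.free(_memory);
    _memory = nullptr;
    _length = 0;
    _capacity = 0;
  }
}

void main() {
  final numbers = VectorInt8();
  for (var i = 0; i < 10; i++) {
    numbers.push(i);
  }
  numbers[1] = 42;
  print(numbers.length); // 10
  print(numbers[1]); // 42
  numbers.free();
}
```

The cost of this approach is one such class per primitive type, which is the trade-off against a single generic `Vector<T>`.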

Would that work for you (even if it's a little less convenient than in C++)?

ds84182 commented 1 year ago

While it works, I'd really like to avoid exposing the underlying pointer to elements in the container. The part I'm struggling with is making WinRT & Win2D bindings appear as the high-level API they already are without accruing heaps of simple objects in memory. In C#, the runtime's arrays of structs are C-compatible, which avoids this dreaded overhead for otherwise simple types like Vector2.

For example, take CanvasDrawingSession.DrawGlyphRun. CanvasGlyph has 3 float fields and 1 int field. In an ideal world I'd be able to wrap this with a List interface that optimizes down to writing struct fields without actually exposing a pointer to native memory. But to even get something close to this, I have to generate multiple classes & extensions per struct. And even then, the ergonomics still aren't great.

At its root, I'm struggling to make myListOfVector2[i].X += 3.0 feel like you're interacting with a Dart class without also incurring the overhead of a List<Vector2>. And without generating tons of top-level members & exposing raw pointers so mixins can do the translation between native and Dart.