dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.19k stars 1.57k forks source link

[vm] Allocate-copy-fill primitive for `_List` allocation #44389

Open rakudrama opened 3 years ago

rakudrama commented 3 years ago

The VM CreateArray instruction allocates a null-filled array. The elements of the array are then often overwritten. This is what happens for the List.of path copying from a system list, or in List.filled.

It would be more efficient to combine these into an allocation primitive that can initialize via copy and fill.

class _List<E> ... {
12345678901234567890123456789012345678901234567890123456789012345678901234567890
  // Allocate a `_List` object with length [length], initialized with elements
  // from [source] between [sourceStart] (inclusive) and [sourceEnd], exclusive.
  // Elements not initialized from [source] are initialized with [fillValue].
  //
  // There is minimal argument validation. If [sourceEnd] is zero, [source] and
  // [sourceStart] are ignored.
  _List.allocateCopyFill(int length, _List? source, int sourceStart, int sourceEnd, fillValue);
}

We might want fillValue to be untyped to allow null- or zero- initialization of a backing store.

This primitive can be used for many things:

   _List.filled(int length, E fill) => _List<E>.alloacteCopyFill(length, null, 0, 0, fill);

   List<E> _toFixedList() => _List<E>.allocateCopyFill(this.length, this, 0, this.length, 0);

The implementation of the primitive can do the initialization loop without write barriers and safe-points. I think this could be 5x faster than the current code. It can use hinted write instructions when copying / filling large arrays to control cache pollution.

It might be nice to expose a checked version at the SDK level to assist with growing backing stores in non-system libraries.

rakudrama commented 2 years ago

I played around a little with a similar primitive. It is definitely faster for long lists, but it is hard to get as-good performance for small lists, which are quite common.