dart-lang / site-www

Source for Dart website
https://dart.dev
Other
969 stars 700 forks source link

Discuss performance implications of generics in language documentation #4998

Open modulovalue opened 1 year ago

modulovalue commented 1 year ago

Page URL

https://dart.dev/language/generics.html

Page source

https://github.com/dart-lang/site-www/tree/main/src/language/generics.md

Describe the problem

The "Note" under https://dart.dev/language/generics#generic-collections-and-the-types-they-contain explains that generics are reified, but it doesn't mention anything about the impact that this has on the runtime performance of generics. Rust, for example, appears to monomorphize everything and Java appears to monomorphize nothing.

Expected fix

It might be useful to mention how well generics perform during runtime. That is, is there always some overhead associated with generics? Are there ways to (automatically) get rid of any overhead for the cost of larger binaries? (Cf. https://github.com/dart-lang/sdk/issues/52722)

Additional context

No response

parlough commented 1 year ago

Thanks for opening an issue! I'm not too familiar with this, so I would like insight and thoughts from a few others.

\cc @eernstg @mraleph For some insight. Is there something that's worth documenting here? I'm assuming each platform/compiler is potentially different, limiting the scope of what is worth documenting, not sure though.

eernstg commented 1 year ago

One thing to note is that there is no way a Dart implementation could compute all type arguments at compile time:

Object peanoList<X>(int i) {
  if (i <= 1) return <X>[];
  return peanoList<List<X>>(i - 1);
}

void main() {
  print(peanoList<Never>(3).runtimeType); // 'List<List<List<Never>>>'.
}

This example shows that the actual type argument of the value returned by peanoList(i) depends on the value of the int value argument i which is passed to the invocation, and that's obviously not a statically decidable property.

The fact that Rust performs monomorphization of all generic types implies that this is indeed possible in Rust, presumably because of the very restricted notion of subtyping that Rust uses.

In Dart, we need to detect the subset of program locations where compile-time monomorphization is even possible. The example above shows that we need to detect the existence of peanoList in order to know that the class List cannot be monomorphized completely. It is certainly still possible to create a number of monomorphic subtypes of List<T> for specific T (we already have classes like Int8List with that property).

But even those classes will still have to satisfy the requirements for Dart run-time entities. For instance Int8List.fromList([1]) is List<num> must still evaluate to true. This implies that the code which is generated for is List<num> needs to be able to treat an Int8List as if it had been an instance of a generic class, which again probably implies that the Int8List must have a representation in memory where the type argument value int can be looked up, just like it would be looked up in the value of <int>[] for <int>[] is List<num>.

In other words, some parts of the code for such "monomorphic types" might be generated as if there was never any type parameters in that class, but other parts would have to preserve the behaviors of a generic class instance, presumably preserving such things as the representation of the actual type argument in the object layout.

I suspect that this difference in the ability to monomorphize Dart programs compared to Rust programs is the other side of a specific coin: Dart supports run-time type tests like switch (myList) { List<num>() => something, ... }. This page explains how an attempt to recover the type of an object which isn't known at compile time needs to rely on the type Any and/or various pieces of unsafe code.

I think it's fair to say that Dart is inherently more dynamic than Rust (and that doesn't mean 'is less type safe', it means 'supports a wider range of operations at run time'), and that's the reason why it is not practical (or even desirable) to ask for Dart to be a lot more like Rust in this respect.

modulovalue commented 1 year ago

Thank you, Erik.

I did not mean to imply that I want Dart to be more like Rust (I enjoy Dart much more than Rust :) and, apparently, Rust borrowed this feature from C++!).

My primary motivation here is to find out if there is any overhead associated with generics (when compared to cloning a generic construct, and replacing the generic types with concrete types), and if there is any, whether it is possible to avoid it or not.

Given your explanation, it seems pretty clear to me, that it is not safe to assume that monomorphization is generally being applied as of right now.


In a linked issue, a suggestion was made that mixins may be able to be used to monomorphize generic classes. But this, of course, wouldn't apply to the more general case that this issue deals with.


For future reference:

So if the first point in this list could work, then maybe there is a distant future where monomorphization is a thing.

eernstg commented 1 year ago

Thanks for coming up with topics like this! Monomorphization is a complex topic, and it's great if we can shed some light on it!

find out if there is any overhead associated with generics

You mentioned C++, and that is an early and important example of using a compile-time-only approach to type parameterization (and more — C++ templates can accept value arguments as well as type arguments). In the C++ community the most prominent issue with this approach has probably been code bloat: If every usage of a type parameterized entity E gives rise to a expansion (some E1 which is a copy of E where every occurrence of each type parameter has been replaced by the actual type argument) then you need to use a smart linker in order to avoid having dozens of copies of the same code (because more than one location in the code uses a sequence of int values). I think this problem has largely been dealt with today, but it did take decades to get there.

Still, if you generate a few hundred copies of a List class in order to enable lists of a few hundred kinds of objects (and similarly for any other generic class), the performance of the program may still go down: Startup times could be longer if we need to download a much larger program than we would have with the original single generic List class, and we might need to swap memory regions in and out of the main memory when the amount of code is much larger.

Hence, I don't think there is a global truth which goes like "monomorphized is faster". It would be more like a delicate optimization game for any particular program, and possibly even for any particular usage pattern for each such program.

The runtimeType example is actually a hint with the opposite polarity: The main reason why an invocation of runtimeType can be globally costly is that it forces more type information to be available at run time, in particular: the value of actual type arguments. The important point here is that a monomorphized copy of a generic declaration includes the type arguments in their "inlined" form all over the declaration, so we don't eliminate the type parameter, instead it's baked in with each monomorphic copy of the generic declarations.

In contrast, the improved speed/size properties of the program where runtimeType has not been used is the improved size/speed properties of the generic entity: If no part of the program actually needs to use the value of T at run-time then class C<T> ... can be compiled as class C ..., and this "stripped" version of C can then be reused with all the different values of T that the program uses.

I think this makes a particularly big difference for dart2js because it is possible to compile programs such that they do not perform dynamic type checks for non-covariant occurrences of their type parameters. (For example, if myList.add(e) is type correct at compile time then it will succeed at run time, no matter whether the dynamic type of e is actually a subtype of the type argument of myList. The point is that if there is a type problem then we'll discover it later on when that object is actually used, and we don't want to pay for all those early type checks in deployed applications.)

We could create a monomorphic subclass of any given generic class, and that could be a way to perform experiments with a very simple kind of monomorphization:

mixin NoOp {}

class C<X> {...}

// We want a monomorphic copy of `C` at `int`, and we even want to know
// that no members have been overridden.
final class C_int = C<int> with NoOp;
modulovalue commented 1 year ago

Thank you for pointing this out, because it really is not obvious: the assumption that zero-cost abstractions are completely zero-cost made some C++ files (at Google) take over 15 minutes to compile.

The way that I phrased this issue could seem like Dart is worse, but, as you said, and I completely agree, it's a trade-off, and even Rust doesn't seem to have solved all the code bloat problems.

I did include a little more nuance in https://github.com/dart-lang/sdk/issues/52722 in the way that I phrased this issue, where I referred to monomorphization as a "dial" that can be selectively turned. Your example appears to be very helpful in that regard and I will try to see if I can find it to be helpful in practice.

@parlough if you feel like it wouldn't help people to discuss this in more the depth in the docs, then feel free to close this issue. I'm not sure if there even is a good way to cover this topic (in the docs) in a way that is helpful and wouldn't unnecessarily confuse people?

eernstg commented 1 year ago

Interesting!

Here's a tiny experiment:

const numberOfIterations = 100000000;

mixin NoOp {}

class A<X> {
  X? x;
  String foo() => x.toString();
}

final class Aint = A<int> with NoOp;

void main() {
  var timer = Stopwatch();

  void start() => timer.start();

  void stop(String s) {
    timer.stop();
    print('$s: ${timer.elapsedMilliseconds}');
    timer.reset();
  }

  var aPlain = A<int>();
  var aMono = Aint();

  start();
  for (var i = 0; i < numberOfIterations; ++i) {
    aPlain.x = 1;
  }
  stop('A<int>,      dynamic type check');

  start();
  for (var i = 0; i < numberOfIterations; ++i) {
    aMono.x = 1;
  }
  stop('Aint,        dynamic type check');

  start();
  for (var i = 0; i < numberOfIterations; ++i) {
    aPlain.foo();
  }
  stop('A<int>, type independent method');

  start();
  for (var i = 0; i < numberOfIterations; ++i) {
    aMono.foo();
  }
  stop('Aint,   type independent method');
}

Compiled with dart compile exe we get the following output:

A<int>,      dynamic type check: 128
Aint,        dynamic type check: 81
A<int>, type independent method: 1356
Aint,   type independent method: 1356

This shows that this particular kind of compilation makes the type check against int in Aint faster than the type check against X in A<int>. Interestingly, the method foo does not perform any operations where the static type int of x makes a difference (so we don't take advantage of the fact that myInt.toString() might be inlined, at least if we're able to determine exactly which kind of int this is at run time). This is probably because we don't recompile inherited members, even in cases where this might yield faster performance. Yet another time/space trade-off right there.

Other configurations have different timings. For example, a dart compile js compilation followed by an execution with d8 spends the same amount of time in both the A<int> and Aint case (without optimizations; -O4 skips the type checks, so that doesn't say much):

A<int>,      dynamic type check: 1762
Aint,        dynamic type check: 1757
A<int>, type independent method: 1470
Aint,   type independent method: 1475

A plain dart execution (JIT) looks like this:

A<int>,      dynamic type check: 95
Aint,        dynamic type check: 3811
A<int>, type independent method: 3668
Aint,   type independent method: 3715

The assignment in the first case might be known to have no effect, and it could then have been eliminated entirely.

I think we'd conclude (as usual when it comes to performance) that it is really tricky to keep track of the optimizations and understand in detail why we get one or the other outcome.

However, we can also note that we can perform about 50% more type checks per time unit with dart compile exe with a fixed type argument, compared to the case where the type is a type variable. So it does matter, at least if everything else lines up to allow us to test that particular thing (or perhaps I've just completely misinterpreted the situation ;-).

@mkustermann, @mraleph, just wondering, does this result surprise you?

atsansone commented 1 year ago

@eernstg : This may be a good topic for a blog post, but I don't think this deep of a dive would be helpful in the docs. Would you or someone want to consider this as a blog post?

eernstg commented 1 year ago

OK, I'll explore the idea that we create a blog post in this area.