dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.2k stars 1.57k forks source link

@Invariant proposal #38010

Open Hixie opened 5 years ago

Hixie commented 5 years ago

We use the following pattern in Flutter's localization code:

// library 1
class S { }
class A extends S { }
class B extends S { }
class C extends S { }

S createS(String name) {
  switch (name) {
    case 'a': return A();
    case 'b', 'bravo': return B();
    case 'c': return C();
  }
}

// library 2
S selectedS;

void selectS(List<String> options) {
  selectedS = createS(options.first);
}

// application
void main() {
  selectS(const <String>['b', 'delta']);
  assert(selectedS is B);
}

We would like it to be possible to tree-shake out A and C.

There's a lot of indirection in the real code from the const list to the switch.

Here's the example of the const list: https://github.com/flutter/flutter/blob/master/examples/stocks/lib/main.dart#L116

Here's the switch statement in the real code: https://github.com/flutter/flutter/blob/master/packages/flutter_localizations/lib/src/l10n/generated_material_localizations.dart#L18440

Here's one of many steps between the two: https://github.com/flutter/flutter/blob/master/packages/flutter/lib/src/widgets/app.dart#L1196

It's not clear to me how we can actually trigger this tree-shaking. We'd like to be able to do something like say to the compiler "You can assume that at the point where you reach that code over there, that variable/parameter is guaranteed to have a value equal to one of these".

So for example something like:

// library 1
class S { }
class A extends S { }
class B extends S { }
class C extends S { }

S createS(@Invariant.willBeOneOf('names') String name) {
  switch (name) {
    case 'a': return A();
    case 'b', 'bravo': return B();
    case 'c': return C();
    default: return null;
  }
}

// library 2
S selectedS;

void selectS(List<String> options) {
  selectedS = createS(options.first);
}

// application
void main() {
  selectS(@Invariant.declare('names') const <String>['b', 'delta']);
  assert(selectedS is B);
}

The idea is that the compiler would then compile the createS method down to one of the following (depending on how clever we want to be... I think I prefer the second one myself, as it allows for release builds to really benefit from this in terms of code performance as well as code size):

S createS(@Invariant.willBeOneOf('names') String name) {
  switch (name) {
    case 'a': throw OptimizerError('Invariant violated: "name" equals "a" but "a" was listed as an impossible value for "name".');
    case 'b', 'bravo': return B();
    case 'c': throw OptimizerError('Invariant violated: "name" equals "c" but "c" was listed as an impossible value for "name".');
    default: return null;
  }
}

...or:

S createS(@Invariant.willBeOneOf('names') String name) {
  assert(name == 'b' || name == 'delta', 'Invariant violated: "name" equals "$name", which is not one of the declared possible values according to @Invariant.willBeOneOf.');
  if (name == 'b')
    return B();
  return null;
}
mraleph commented 5 years ago

Note that the pattern in the actual location code is much more complex, e.g. array elements are not just strings but Locale objects and we destructure them in a switch:

Locale locale;
switch (locale.languageCode) {
  case 'af':
    // ...
}

Furthermore locale.languageCode is not a field but rather a getter with some additional logic.

I think there is an alternative way of looking at this particular problem by operating within confines of the language.

If you can change Flutter framework to ensure that Locale objects only appear as constants in the code (this is currently not true - _updateLocales constructs Locale objects dynamically), then we can add logic into AOT compiler to throw away unreachable branches from switches like:

Locale l;
switch (l.languageCode) {
  case 'xyz':
}

No additional language features required.

eernstg commented 5 years ago

If a compile-time analysis of possible values at various program points gets too complicated, we have actually discussed a way to control tree shaking which is under more direct developer control: The proposal about link time sets and maps.

You could use it in the following way:

// 'library0.dart'.
class S {}
Map<String, S Function()> factories of const;
S createS(String name) => factories[name]();

// 'library1.dart'.
import 'library2.dart';

class A extends S { }
class B extends S { }
class C extends S { }

const factories['a'] = () => A() if A;
const factories['b'] = () => B() if B;
const factories['bravo'] = () => B() if B;
const factories['c'] = () => C() if C;

// 'library2.dart'.
S selectedS;

void selectS(List<String> options) {
  selectedS = createS(options.first);
}

// application
void main() {
  selectS(const <String>['b', 'delta']);
  assert(selectedS is B);
}

factories is a link time map, that is, a map that binds constant keys to constant values according to the 'contributing declarations' like const map[...] = ... if T. It is essentially a modular specification of a constant map.

It works like this: The contributing declarations like const factories['b'] = ... if B are ignored during regular compilation and tree-shaking (which is actually an accumulation of information about which program elements are in use, so let's call it tree-accumulation). Then, if B has been included by the tree-accumulation, the operation implied by the contribution declaration is performed (so factories changes from empty to { 'b': () => B(), 'bravo': () => B() } because B is in use), and tree-accumulation is repeated, until stability has been reached. In this case that's immediately (because the added program elements do not use anything which wasn't already in use).

The classes A, B, and C could be declared in different libraries, and S, factories, and createS don't have to be in a location where A/B/C are in scope. So this decouples several parts of the system that would otherwise depend on each other.

Of course, discovering that certain cases in a switch can be eliminated based on a sophisticated analysis of the possible values of a given expression (as @mraleph mentioned) is not the same thing as deciding on whether or not to add a binding to a link time map based on the presence/absence of a given program element after tree-shaking, but we would of course have the optimizations available together with link time collections (if they are added to the language), and I think those two mechanisms might well enhance each other.

vsmenon commented 5 years ago

(General note - @mraleph @sigmundch - we should consider something like an area-optimization.)