Open eernstg opened 5 years ago
Using static
as prefix could conflict with some designs for #336.
This proposal does not preclude uses of the class which requires instantiation, like:
dynamic c = C();
print(identityHash(c));
c.foo();
For ahead-of-time compilation, this would mean that you need the instance method anyway, and if you also have a static function desugaring, you double the code size (or at least introduce code in the instance methods that forward to the static functions).
To avoid that, you might want to require that object creation expressions for the class only occurs in non-leaking positions too. That should ensure that nobody ever stores a reference to the class and accesses instance members through that reference later.
As for mutable variables, a desugaring at the VM level should be able to make a static function act directly on stack slots, and reuse the same stack slots for a number of consecutive member accesses (for cascades), and those slots could be mutable. I can see that dart2js would need to do allocation, though.
Nitpicks:
e || ...
or e && ...
to be non-leaking. The expressions must evaluate to boolean values, and a static class cannot implement bool
. The other operands of &&
and ||
(and !
) are equally non-leaking because the operators are defined to always return bool
.e <cascadeSection>+
can be a allowed as long as it itself occurs in a non-leaking position, allowing (a..foo()..bar()).baz()
, which is actually a common use-case.e
is non-leaking in (e)
if (e)
itself is non-leaking.An interesting question to me for this proposal is can it handle the following real-world need for enabling the compile-time elimination of objects?
Flutter uses package:vector_math
which has Matrix4
and Vector3
wrappers, each with a single final Float64List
field. Can most of these wrappers be reliably eliminated?
Take the call graph below MatrixTween.lerp
. There are about a dozen wrappers below that call. If I hand-edit about 20 methods in the vector_math library to introduce a layer of static methods on Float64List
s, I can force dart2js to eliminate all of the intermediate wrappers. (I didn't change MatrixTween.lerp
itself, that would be cheating). The refactored code is pretty weird looking and would be harder to maintain, so I think something like static classes would be really useful, but only if it works for examples like this.
If the scheme described above is not quite up to the job, what would make it work?
The proposal would need to handle simultaneously 'unwrapping' in the receiver, argument and result positions of a single call where there are multiple 'static class' types.
e.g. Vector3.+
has a Vector3
receiver and parameter and returns a fresh Vector3
. The unwrapped static method should take two Float64List
values and return aFloat64List
.
e.g. Matrix4.decompose
, the receiver is a Matrix4
and there are two Vector3
parameters and a Quaternion
parameter. decompose
itself creates temporary objects of type Vector3
, Matrix4
and Matrix3
(via m.getRotation
)
Using static as prefix could conflict with some designs for #336.
Right, and it conflicts with the C# notion of static classes. The core property here is actually that it is a class that does not internally use the identity of the receiver (in particular, this
is never leaked), and this means that we can establish based on a non-leaking instance creation expression that there is no need to have the actual object at all.
So maybe identityless
would be better, but also weirder. ;-)
This proposal does not preclude uses of the class which requires instantiation
That's right, but it would be trivially easy to make such usages an error, as you mention later. I just thought that it would be a useful trade-off to allow non-leaking instance creations to use the optimization (of eliminating the object), and letting all other usages have the standard semantics. This means that you will get an actual wrapper for free whenever that is required, and you'll get the optimization otherwise.
There is no need for
e || ...
ore && ...
to be non-leaking
What I'm saying is that these expressions are non-leaking, so in particular this
is allowed to occur in an expression of the form this || ...
or this && ...
in the method bodies of a static class. [Edit: deleting stuff about operators for these symbols, they don't exist.]
e <cascadeSection>+
can be a allowed as long as it itself occurs in a non-leaking position
Right, that one can be generalized. Thanks!
[Consider the]
Matrix4
andVector3
wrappers, each with a singlefinal Float64List
field. Can most of these wrappers be reliably eliminated?
To some extent, yes.
The code for those two classes seems to rely on passing instances of type Matrix4
and Vector3
around in various ways (e.g., as arguments to static methods), and this probably means that we would only be able to eliminate some instances: The notion of static classes does not in itself support elimination of objects that we are passing as arguments to (non-inlined) functions.
Matrix4
and its ilk be static classes?Considering the instance methods only, it would be possible to satisfy the static class constraints. We'd need to adjust clone
and a several other members as follows:
// Assuming that the clone must "own" its Float64List, we need to copy it.
Matrix4 clone() => Matrix4.fromFloat64List(Float64List.fromList(_m4storage));
double relativeError(Matrix4 correct) {
final Matrix4 diff = this - correct; // Changed, was `correct - this`.
final double correct_norm = correct.infinityNorm();
final double diff_norm = diff.infinityNorm();
return diff_norm / correct_norm;
}
// Write a `Matrix3._copyNormalFromFloat64List` corresponding to
// `Matrix3.copyNormalMatrix`.
Matrix3 getNormalMatrix() =>
new Matrix3.identity().._copyNormalFromFloat64List(_m4storage);
// New method.
double _copyInverseFromFloat64List(Float64List argStorage) {
final double a00 = argStorage[0]; ...
final double a33 = argStorage[15];
...
if (det == 0.0) {
_setFromFloat64List(argStorage);
return 0.0;
}
final double invDet = 1.0 / det;
_m4storage[0] = (a11 * b11 - a12 * b10 + a13 * b09) * invDet; ...
_m4storage[15] = (a20 * b03 - a21 * b01 + a22 * b00) * invDet;
return det;
}
double invert() => _copyInverseFromFloat64List(_m4storage);
void decompose() {
...
final Matrix4 m = new Matrix4.fromFloat64List(_m4storage);
...
}
List<double> applyToVector3Array(List<double> array, [int offset = 0]) {
for (int i = 0, j = offset; i < array.length; i += 3, j += 3) {
final Vector3 v = new Vector3.array(array, j).._applyFloat64List(_m4);
...
}
...
}
So it would certainly not be impossible to adjust Matrix4
such that it satisfies the static class constraint, and Matrix3
would allow for the same adjustment, but it does require several changes to the implementation.
This would allow us to eliminate objects that are created in non-leaking expressions, including the ones that occur in the classes Matrix4
etc. themselves.
However, that's just the first "row" of instances of those classes (think this), and it wouldn't help us to eliminate objects which are returned (e.g., Matrix4.absolute()
returns a Matrix4
which is created in the body of that method). So in that sense we still have another "row" of instances of Matrix4
and its ilk in the next line, when we have eliminated the first one.
It would probably be possible to walk over the code and get rid of more and more instances of Matrix4
etc, and the desugared code would then be able to take many steps working directly on the underlying Float64List
, and only creating an actual Matrix4
etc when that instance is definitely leaked.
lerp
?Considering Matrix4Tween.lerp
I can see that the 6 local variables {begin,end}{Translation,Rotation,Scale}
(that is beginTranslation
.. endScale
) are passed as actual arguments to decompose
, which means that the corresponding objects will need to be allocated, with my existing proposal.
If the scheme described above is not quite up to the job, what would make it work?
We could consider allowing formal parameters whose type is a static class to be decomposed and passed as fields. This would require a separate entry point which is only used for typed invocations, and it would require the actual parameters to have an exact type.
However, this takes us in the direction of a lot of special casing.
This kind of scenario would probably be better served if we consider the VM only, and then use an approach where the instances of static classes are allocated on the stack, but using the exact same layout for access to the state of the object as that of a heap allocated instance (omitting some heap specific parts, e.g., for the GC). It would then be possible to pass a reference to an instance which is stack-allocated, as long as the receiving function doesn't leak that object. (This is similar to borrowed objects in Rust.)
The proposal would need to handle simultaneously 'unwrapping' in the receiver, argument and result positions of a single call where there are multiple 'static class' types.
Right, and this would presumably require support for multiple return values. So that is going a step further than I would expect to be able to go in a context where we need to generate JavaScript as well as native code.
Note, however, that borrowed objects would need to exist in memory even though that's on the run-time stack, whereas objects which are eliminated as proposed in this issue can be represented entirely in registers. This means that field accesses for the borrowed objects would still need to go via a load from a memory address with an offset, so they are only cheaper insofar as they reduce the work done by the garbage collector.
Could abstract classes which could have constant constructors including int
and String
be static classes?
In response to #306, this issue proposes the introduction of static classes. This is a class modifier
static
and a compile-time constraint on each classC
which has that modifier, ensuring that certain instances ofC
can be eliminated altogether, without changing the observable behavior of the program.The computations associated with the creation of an instance
o
ofC
and running one or more instance methods witho
as the receiver will be desugared by the compiler into invocations of statically resolved functions (e.g., top-level functions), thus lowering the pressure on the memory management system and enabling various other optimizations.As seen from a developer's point of view, we can mark a class
C
asstatic
in order to indicate the intention that compilers should be able to apply that optimization, and the classC
will then have a compile-time error unless it satisfies the requirements for allowing the optimization.As an alternative, the optimization could be applied to each class which satisfies the constraints, silently. However, developers would then have more trouble detecting when a seemingly benign change to a class suddenly prevents that class from admitting this optimization. So we consider the explicit approach better: Developers will announce their intention to define a class whose instances can be eliminated in specific situations by including the modifier
static
, and future violations of the associated constraint will then be flagged explicitly as a compile-time error.A couple of examples of classes where this concept could be useful are given in #306, but the main motivation for having the concept are the following:
It should be possible to provide a guarantee that an invocation of a static extension method (cf. #41) corresponds to an invocation of a top-level function, and in particular that no objects are allocated as part of the method invocation itself.
It should be possible to provide a guarantee that no wrapper object is created when an object is accessed under a specific extension type (cf. #42), except in the case where that object is "leaked", e.g., by being passed as an argument to a function invocation whose type is a superinterface of that extension type.
In other words, the optimization which is the core point of having this proposal is crucial for extension methods and extension types, but it may be helpful in a lot of other situations in which case it is useful to be able to specify this property separately.
Syntax
The grammar is adjusted as follows in order to support this feature:
The only difference is that it is possible to add
static
as a modifier of the class or mixin.Static Analysis
The static classes are the following: The class
Object
is a static class. When the declaration of a classC
has the modifierstatic
,C
is a static class. A mixin applicationS with M
is a static class ifS
is a static class andM
is a static mixin or class.It is a compile-time error if a class
C
is static and its superclass is not static.The classes implemented by a static class or mixin may or may not be static, no special constraints are imposed for that.
It is a compile-time error if a static class has one or more mutable instance variables.
Consider an expression
e
. We say thate
occurs in a non-leaking position if it occurs in one of the following ways:e ('||' <logicalAndExpression>)+
e ('&&' <equalityExpression>)+
e <equalityOperator> <relationalExpression>
e <typeTest>
e <typeCast>
e <relationalOperator> <bitwiseOrExpression>
e ('|' <bitwiseXorExpression>)+
e ('^' <bitwiseAndExpression>)+
e ('&' <shiftExpression>)+
e (<shiftOperator> <additiveExpression>)+
e (<additiveOperator> <multiplicativeExpression>)+
e (<multiplicativeOperator> <unaryExpression>)+
<prefixOperator> e
e <selector>+
e <assignableSelectorPart>+ <assignmentOperator> <expression>
e <cascadeSection>+
, when it occurs as an<expressionStatement>
It is a compile-time error if the reserved word
this
occurs in an expression in the body of a static class, unless it occurs in a non-leaking position.This allows for
this
to occur in all the syntactic contexts where the semantics is a method invocation onthis
, plus possibly some additional steps that are language defined and known to not store a reference tothis
anywhere (including in variables of any kind, or as a parameter in a function invocation). With the cascade it is crucial that the value of the expression as a whole is discarded, which is ensured by the requirement that the cascade must be an expression statement.In short, the implementation of a static class can call methods on
this
, but it cannot leakthis
.Note that a constructor in a static class may have an initializing formal, e.g.,
C(this.x)
, but the occurrence ofthis
in such a formal parameter is not an expression, so there are no special constraints on that. Similarly, an element in the initializer list of a constructor in a static class may usethis
as inthis.x = 42
in static classes, just like other classes.Dynamic Semantics
No special rules apply for the dynamic semantics of instances of a static class.
However, consider an instance creation expression
e
which invokes a generative constructor to create an instanceo
of a classC
. In the case whereC
is static ande
occurs in a non-leaking position it is guaranteed that no reference too
is ever stored, except that the stack frame for each instance method invocation ofC
will store a binding ofthis
—but the code in the implementation of that instance method has the same constraints, so they will also not leak the value ofthis
.Because of this guarantee, it is permissible for a compiler to generate code whereby the values of the instance variables of
o
(which must be final becauseC
is static) can be copied freely and stored in activation records on the run-time stack. This means that there is no need to allocate an actual object in the heap, it can effectively be transformed into separate variables, one for each instance variable ofC
, which may be passed as actual arguments to static functions and copied freely to as many stack frames as needed, and stored as local variables, one for each instance variable ofo
.Discussion
Consider the following situation:
The class needs to exist as usual if it is used for any purpose which does not admit elimination of the instance. So the initialization of
c
causes allocation of an instance ofC
, and it has state and methods just like it would have had ifC
were not a static class.However, the cascade
C(3, 4)..foo()..bar(5)
can be optimized because it is a non-leaking occurrence of an instance creation that invokes a generative constructor. Similarly for the declaration and initialization ofi
. This could be achieved by a desugaring translation along the following lines:It should be noted that this optimization is only applicable in the situation where the exact type of the instance-which-will-be-eliminated is known, and in particular it can be determined statically which implementation each instance method has, and which getters correspond to a native storage read operation (there is no mutable state, so if there are any setters then they are just regular instance methods, with the usual constraints).
The desugared code may look like there is not much gained by this optimization, but it should be noted that invocations of instance methods from other instance methods would otherwise be hard to optimize (when
baz
callsfoo
it is not known that there is no overriding implementation offoo
, but in the desugared code it is known that the exact type of the "receiver which isn't there" isC
, so we can callfoo
statically, and might inline it). It should also be noted that the parameters of the desugared methods may be allocated in registers whereas the instance variables of an instance ofC
would have to be stored in the heap.