Open eernstg opened 5 years ago
I think this should apply to the builder pattern as used by built_collection
and built_value
. It would be great to reduce the overhead.
void foo = Foo((b) => b..x = 3..y = 4);
where the Foo
factory creates a FooBuilder
, applies the function to it, then converts it into a Foo
to return.
That would be great! Please check out #308 and speak up if you see something that you think wouldn't work. ;-)
Can you elaborate a little bit more on the motivations here? In particular, what is it that you want to enable that is not possible now? A quick check of dart2js output shows that it will eliminate all but one of the object allocations in your example above, and I would be fairly surprised if the VM doesn't do likewise.
Test program:
import 'dart:core';
void main() {
var now = DateTime.now();
var berlinWallFell = DateTime.utc(1989, 11, 9);
var moonLanding = DateTime.parse("1969-07-20 20:18:04Z");
print(now.isAfter(berlinWallFell));
print(berlinWallFell.isAfter(moonLanding));
var fullString = (StringBuffer()
..write("Use a StringBuffer")
..writeAll(["for ", "efficient ", "string ", "creation "])
..write("if you are ")
..write("building lots of strings"))
.toString();
print(fullString);
}
Output when compiled with dart2js:
A = {
main: function() {
var moonLanding,
t1 = Date.now(),
t2 = H.Primitives_valueFromDecomposedDate(1989, 11, 9, 0, 0, 0, 0, true);
if (typeof t2 !== "number" || Math.floor(t2) !== t2)
H.throwExpression(H.argumentErrorValue(t2));
moonLanding = P.DateTime_parse("1969-07-20 20:18:04Z");
P.print(t1 > t2);
P.print(t2 > moonLanding._value);
t1 = P.StringBuffer__writeAll("Use a StringBuffer", ["for ", "efficient ", "string ", "creation "], "") + "if you are building lots of strings";
P.print(t1.charCodeAt(0) == 0 ? t1 : t1);
}
};
As a language feature "the compiler should generate better code if able" seems really vague.
I'd love to allow users to introduce more confidence they are writing explicitly writing optimizable code (whether support constant expressions/functions, something like Kotlin's inline
classes, etc).
Can you elaborate a little bit more on the motivations
I can see how these optimization opportunities can sometimes be detected today, but the point is that the developer who marks a class as static
commits to a certain discipline in the implementation of that class, and any violations will cause a compile-time error.
So this marks the class as "intended to be able to evaporate", and hence certain constructor invocations can be eliminated: The object was never created, and its state is stored in local variables (including function parameters).
We're essentially storing the object on the stack, and we're free to create as many copies as we want, because the special discipline provides the guarantee that this does not create any difference in the observable behavior.
If that is really, seriously no problem to achieve without help then we can just eliminate the notion of static classes from the proposal about how to obtain a generalized version of static extension methods and static extension types in #309. That would just make it simpler. (Note that this amounts to the same thing as saying that we can create wrapper objects to handle extension method invocations at zero cost, compared to a compilation strategy which always just translates the extension methods to static functions.)
Otherwise, I believe that the notion of static classes does embody a robust and rather expressive static check on a class which ensures that "non-leaking" (cf. #308) expressions are guaranteed to be transformable into purely static function calls. The core idea is that this
never leaks, and there is no mutable state, so nobody will need a reference to the object and hence nobody can see whether there are multiple copies.
The only kind of access to the object is member lookup, and that's always a two-step traversal where we have a reference to the object in the middle (to find o.m
we evaluate o
and then m
relative to o
), but that's opaque so for a variable access we can replace it by a lookup of a local variable/parameter in a static function, and for an instance member invocation we can replace it by a call to a static function.
I looked a little bit more closely at the translated code, and several things came to mind.
My main motivating scenario for having static classes is that they allow developers to ensure that we can (1) specify certain mechanisms (extension classes and extension types, in particular as generalized in #309) in terms of normal objects created as instances of normal classes (including all the million features that they have), and then (2) obtain a zero cost implementation of those features, because those objects would be optimized away completely. (In terms of #309, developers can also make a different trade-off: If they don't specify that the receiver/wrapper class is static
then they get to use wrapper objects that are unconstrained, so they can have mutable state and they can leak this
as much as they want; but then we can't promise to optimize them away).
With the DateTime
examples, I was actually relying on the fact that the newly created DateTime
objects are never used, so an "unused variable" analysis would allow us to detect that the reference to the given instances of DateTime
are never "leaked", so they don't have to be allocated at all. We would need to get the side effects associated with the constructor execution, but the objects themselves would not have to be allocated (if we can faithfully get those side effects without having the object, of course). This situation matches this request, but it doesn't match the notion of static classes (#309) because static classes don't involve anything like an "unused variable" analysis. I should have explained that more explicitly—sorry about that.
Checking out the generated code, I can see that Date.now()
redirects to an external constructor, so we will get that instance. The invocation of Date.now()
still exists when I remove the usages (the two print
statements in lines 4-5 of main
). So there is no attempt to get the side effects without allocating the object, presumably because the static analysis can't see how to do that, or doesn't try.
Primitives_valueFromDecomposedDate
comes from the static function Primitives.valueFromDecomposedDate
in 'js_helper.dart'. There are at least 6 authors of the code in there, so this is different from having a compiler generated static function, and it is presumably used because the compiler has special logic for DateTime
with the given usage. However, that static function is actually just doing some validation and then allocating a DateTime
object, so we're not avoiding the allocation after all! This is of course different from the scenario I'm aiming at, where the optimization is applicable for all static classes, not just the ones that the compiler knows about, and it allows certain allocations to be avoided altogether.
StringBuffer__writeAll
is obviously special-casing (so the compiler knows how a StringBuffer
works; otherwise it would of course not be appropriate to rewrite ..write("if you are ")..write("building lots of strings")
on the StringBuffer
into + "if you are building lots of strings"
on a String
). So again we have a high degree of special casing (StringBuffer.writeAll
is actually external in 'core/string_buffer.dart'). In this particular case we do actually avoid allocating the string buffer, but that's because we have that kind of special casing.
So I basically remain unconvinced that we already handle the situations where I think object allocations can be avoided, at least not at anything like the same level of generality and predictability that we could have if we were to allow classes to have the modifier static
and treat them as in #309.
I adjusted the examples using DateTime
to match the main scenario for a static class directly:
main() {
var currentMonth = DateTime.now().month;
var moonLandingDay = DateTime.parse("1969-07-20 20:18:04Z").weekday;
print("Current month: $currentMonth, Moon landing day: $moonLandingDay");
}
This translates as follows:
...
main: function() {
var t1 = Date.now(),
t2 = P.DateTime_parse("1969-07-20 20:18:04Z");
H.printString(
"Current month: " +
H.Primitives_getMonth(new P.DateTime(t1, false)) +
", Moon landing day: " + H.Primitives_getWeekday(t2));
}
...
So we're still allocating two DateTime
instances, and this is one of those situations where we might be able to make DateTime
a static class, in which case we would be able to eliminate the two allocations.
PS: Interestingly, P.DateTime_parse
will create one DateTime
and new P.DateTime
will create another one, and the generated code has swapped the order. So the compiler must know that there are no side effects in the constructor. ;-)
@davidmorgan wrote:
void foo = Foo((b) => b..x = 3..y = 4);
.. theFoo
factory creates aFooBuilder
, applies the function to it, then converts it into aFoo
to return.
Wouldn't that be var foo = ...
or Foo foo = ...
? I'll assume that.
Interestingly, we could actually use an extension type (that is, a wrapper class
, in #309 lingo) for this:
wrapper class Foo ... {
implicit factory Foo(void Function(FooBuilder) f) {
// We'd do something like this:
FooBuilder fb = ...;
f(fb); // Let the user code set up `fb`.
return fb.build(); // Create and return the corresponding `Foo`.
}
}
main() {
Foo foo = (b) => b..x = 3..y = 4;
}
It might seem counter-intuitive to let Foo
be a wrapper class for FooBuilder
, but this just means that any situation where a void Function(FooBuilder)
is provided and a Foo
is expected, the constructor will be implicitly inserted such that we obtain the desired Foo
. Here's the desugared code:
// Desugared code (`class Foo` is unchanged).
main() {
Foo foo = Foo((b) => b..x = 3..y = 4);
}
By the way, if we use the abbreviated function syntax of #265 we can write it like this:
main() {
Foo foo = { x = 3; y = 4; };
}
This is the same thing, and it desugars the same way, it's just relying on the ability to have a parameter named this
and the ability to use the abbreviated form { <statements>+ }
to mean (this) { <statements>+ }
. Then the access to x
becomes this.x
(provided that there's no other x
in the lexical scope), etc.
We rely on type inference to tie the pieces together, because it is only a type correct constructor invocation when inference has decided that the parameter type of the function literal is FooBuilder
.
But given that we've said that we want a Foo
, there is only one implicit constructor, so there is not that much of an ambiguity. Of course, if anyone else will convert a void Function(T)
to a Foo
, for some T
!= FooBuilder
, it would be ambiguous; but it could then be written as (FooBuilder b) => b..x = 3..y = 4
, respectively (FooBuilder this) { x = 3; y = 4; }
.
However, all this doesn't help much along the way to eliminate any allocations. You wouldn't be in a good position to eliminate the FooBuilder
(if it's used in a way which is similar to the one that I hinted at above), because fb
is passed to the user-provided function object f
, and this means that we must pass an instance of FooBuilder
, so we must allocate it. Also, a user-provided function object is not likely to be inlined (you'd need global flow analysis and a lot of method splitting or something weird like that), so it isn't very likely that we can maneuver ourselves into a better position in that way.
So this is sad, but true—unless the whole thing works differently than I think, it won't be such an obvious use case for static classes after all.
But Foo foo = { x = 3; y = 4; };
would be kind of cool. ;-)
I think modifying the class places the decision about whether an instance should be stack-allocated in the wrong place. Wouldn't it be up to the call site to determine that? Example:
var foo = &Foo(); // dunno if & is Darty enough
foo.blah();
return bar(foo);
Knowing that the value of foo
lives on the stack allows the compiler to do more checking. For example, it must ensure that method blah
and function bar
do not create references from the heap to foo
. In Rust, they use borrowing for that. In Dart it could look something like this:
static Foo _globalFoo;
// The `&` declares that `bar` can take heap- and stack-allocated
// instances of `Foo`. It promises that it will not leak the instance to the
// heap. A static compiler check upholds the promise.
int bar(Foo& foo) {
// _globalFoo = foo; // STATIC COMPILER ERROR
return foo.someField; // OK
}
Similar syntax would need to be added for methods, but I can't think of anything Darty. You'd need some way to declare that this
is borrowed.
Actually, static class
pretty much is the syntax I'm looking for that declares that this
is borrowed. Let me read the proposal before I say more :)
I think we wouldn't want exactly the same amount of complexity in Dart as in Rust, in order to control memory management.
The idea behind a static class is that it never leaks this
, which means that creation of a fresh object in an expression that is non-leaking is also known to not leak that object. If moreover it is immutable then we can eliminate the object and just carry the values of its fields with us wherever we wish to access it. This is not as powerful as Rust, but it should be relatively easy to manage as a developer: Mark the class static, make the compiler stop complaining about the class, and then all "non-leaking creations" of (exactly!) that type of object can be eliminated (and those creations could be explicit or implicit).
I think we wouldn't want exactly the same amount of complexity in Dart as in Rust, in order to control memory management.
I think the two sources of complexity in Rust are:
Note that in my suggestion the Foo&
is fully compatible with heap-allocated objects. This means that only the author of the function bar
opts into the constraints of stack-allocation. The users of this function can safely continue using bar
in complete ignorance.
Where would something like this be useful to me? In UI frameworks you frequently create temporary data structures to compute the minimal delta you want to apply to the UI after the developer told you that something's changed. For example, you may receive two lists (e.g. of widgets) and you want to figure out the minimum amount of edits (add, remove, swap), or you receive a new tree of transform matrices and clips and you want to compute that effective change in the clip. Today we create temporary lists, maps, strings, matrices, rectangles, and other geometric objects on the heap in cases where stack would suffice. Another example is immediately-invoked callbacks, such as list.forEach
. There's no reason to allocate the callback on the heap because it is invoked and dropped right there and then. It would be great if Iterable.forEach declared this at the interface level, e.g.:
void forEach(borrow void f(E element));
// ^
// guarantees that all implementations do not leak `f`
@yjbanov wrote:
Note that in my suggestion the
Foo&
is fully compatible with heap-allocated objects
It's certainly an interesting idea, and more powerful than static classes in many ways! But if we allow any references to a stack allocated Foo
to exist at all (and we're talking Dart and not Rust ;-) then it is going to be a delicate matter to provide the necessary compile-time guarantee that there will not be any dangling pointers to that object after the return that removes the stack frame where it lives. For instance, think about passing such a pointer as an argument to a function of type Function
in a method of a mixin which happens to override the method implementation that declares a parameter to have type Foo&
. We'd certainly have to enforce that parameter type Foo&
cannot be overridden by parameter type T
for any T
(subtype of Foo
, with covariant, or supertype of Foo
) unless T
is also of the form T&
, because there's nothing other than the method signature and the override rules which will allow us to enforce additional constraints on that Foo&
argument. So it's certainly going to be a feature that developers will need to deal with in order to avoid compile-time errors, and it must be a reified part of the type of a function. Moreover, developers will need to understand the implications of having/not-having that &
when they declare a type, anywhere.
I don't think it's acceptable for Dart programs to have an even remote risk of seg-faults (that must be a bug in the runtime or in code generation, it can't be caused by a buggy program .. ok, ffi is an exception, but otherwise not), so we will need to maintain that "no dangling pointers" guarantee rigorously, and "the rest of Dart" may not be optimized for establishing such guarantees.
What I'm aiming at with static classes is a bit less ambitious. It relies on a stronger discipline whereby it is guaranteed that 100% of the references to a given object can be eliminated, and then we just need to preserve information about which values the would-be object would have in its (final) fields, and which method implementations it would run on a given instance method invocation. My idea is that the methods would be top-level functions, and the object fields would be passed around as parameters of those functions. With that, there's no need to maintain the same memory layout (or even ordering-in-memory) of the variables/parameters which are used to hold the object state, because all accesses to them will be compiled in a context where it is already statically known that they are parameters. Since parameters can be stored in registers, we could then end up having an object that "lives in a few registers" throughout its entire lifetime, so it really never has an address at all.
@eernstg
But if we allow any references to a stack allocated Foo to exist at all (and we're talking Dart and not Rust ;-) then it is going to be a delicate matter to provide the necessary compile-time guarantee that there will not be any dangling pointers to that object after the return that removes the stack frame where it lives.
That pretty much is the feature request :) The only references to stack allocated objects that would be allowed are ones marked as "borrowed". The language/compiler would have to learn how to make sure that those references are not leaked onto the heap.
The interesting question is: what's the minimum sufficient feature set? For example, would we need to support putting borrowed references into borrowed objects, such as class fields and list elements? It is a perfectly safe thing to do (segfault-wise) if the contents are guaranteed to outlive the containing object, but I'm not sure what the simplest way to express it is. Would we need something like Rust's lifetime parameters?
I don't think it's acceptable for Dart programs to have an even remote risk of seg-faults
+1
@yjbanov wrote:
The only references to stack allocated objects that would be allowed are ones marked as "borrowed". The language/compiler would have to learn how to make sure that those references are not leaked onto the heap.
We might be able to handle such a concept. However, I'm pretty sure that Rust has been optimized for that kind of static proof maintenance almost from day 1, whereas Dart embodies a thousand design decisions that are not optimized in that particular direction, and we might very well have to transform Dart into a kind of a half-Rust-ish language in order to do it. This could mean that every Dart developer would then have to spend a significant amount of brain cycles on getting those "borrowed" bits right.
One example is that Dart would presumably need to maintain the stack-allocation related invariants dynamically. We can't prevent dynamic invocations like d.x
from evaluating to a stack-allocated object (if an instance variable can have a 'borrowed' type), and we'd then need to ensure (1) that this evaluation always fails, or (2) that we only do things with that object which will maintain the required discipline. Borrowed types would presumably be supertypes of normal (heap-allocated) types (because we can do more things with entities of a normal type), and this could mean that we will need to downcast a lot of expressions from a borrowed to a normal type, or we would need to make the default bound of a type variable a normal (non-borrowed) type, and then require explicit bounds in the cases where a type variable should be allowed to have a value which is a borrowed type. Etc. ;-)
So I can certainly see the temptation to get the added performance, but I do think that the notion of borrowed objects is so deep that it will have very broad implications for a language to support it, and for developers using it.
What I've proposed in terms of 'static classes' (#308) is a much simpler feature: It allows for object allocations to be eliminated in a situation where we have an exact type of an immutable object, and a complete absence of leaking of said object.
This could mean that every Dart developer would then have to spend a significant amount of brain cycles on getting those "borrowed" bits right.
This is a very important point. We should make it a design constraint that this does not happen. I think this can be designed in such a way that does not leak the complexity of borrowing onto developers. For example, this is why I suggested that borrowing functions also allow passing normal heap-allocated objects. What you can do with a borrowed reference should be a strict subset of what you can do with a heap reference. This way the author (and only the author) of the borrowing function opts into this complexity. The author does not opt the users into it as well.
We can't prevent dynamic invocations like d.x from evaluating to a stack-allocated object
Why not? It seems pretty easy to forbid assigning a borrowed Foo
to dynamic
. However, I do not insist that arbitrary Dart objects should be allowed to participate in this. It is totally OK to only support a subset as long as the feature is still applicable enough to solve a lot of performance problems.
What I've proposed in terms of 'static classes' (#308) is a much simpler feature
I actually do not see a conflict here. If #308 is useful enough, then by all means, it should be added to the language too.
Gonna be honest, this sounds a lot like structs... You have "objects" that are just wrappers over plain ol data, and these objects are passed by "copy" by destructuring the inner fields.
We can't prevent dynamic invocations like d.x from evaluating to a stack-allocated object
@yjbanov wrote:
Why not? It seems pretty easy to forbid assigning a
borrowed Foo
todynamic
That's not where it happens: Consider the situation where we assign an expression of type T
to dynamic d
, and the dynamic type of the assigned value is some T1 <: T
, and T1
is a class that declares a getter x
whose return type is borrowed Foo
.
There may not be a getter T.x
, so there is no way we can make d = e
an error when e
has static type T
, because we can't even predict that there will be a d.x
at run time, and we have no idea whether it would have a return type which is borrowed
, and we don't actually know whether we'll ever use the x
of d
in the first place.
I do not insist that arbitrary Dart objects should be allowed
The problem is that T
above can be any type that can have a subtype T1
that declares any member where a borrowed
type occurs covariantly (e.g., as the return type of a getter or method, or the type of an instance variable). Basically, T
can be anything. And we can't make it an error to do dynamic d = e
whenever the static type of e
is "anything". So we're not just excluding a small subset of the Dart objects, we're excluding basically all expressions (except types like int
, double
, and String
, where we do know all subtypes).
@ds84182 wrote:
this sounds a lot like structs
True, but structs (meaning record data types with direct allocation, as opposed to heap allocated entities that are accessed via a pointer) require support from the runtime. So we can't have structs if we translate to JavaScript. But we can compile non-leaking usages of instances of static classes down to a form where that instance is eliminated, also when we translate to JavaScript (or anything whatsoever, because the elimination of that instance is a Dart-level desugaring operation).
@eernstg
Consider the situation where we assign an expression of type
T
todynamic d
, and the dynamic type of the assigned value is someT1 <: T
, andT1
is a class that declares a getterx
whose return type isborrowed Foo
.
I'm not sure we need borrowed return types. I think these could be disallowed. Alternatively, a borrowed return type may only be assigned to a borrowed variable iff the type of the wrapper object is itself borrowed (not sure if we'd also need lifetimes). Not sure how useful this feature is though. For Rust, this is critical, because there's no GC to fall back on. In Dart there's always GC as the last resort.
What would be useful, however, is an ability to construct and return an object along with ownership to the caller (a la C++ move semantics). Again, regular code can be completely oblivious to the fact that ownership is given to them. They can do whatever they want with it (probably leak it onto the heap, and the runtime would happily oblige). However, advanced users should be able to put it on stack and then pass it (lend it?) to functions that borrow the object, avoiding heap allocation.
@yjbanov wrote:
I'm not sure we need borrowed return types
That's an interesting restriction!
We could consider the following rule: A formal parameter p of a function declaration can be given a borrowed type, in which case the body of the function can only contain p in non-leaking positions. This would ensure that we don't need borrowed return types because no expression with a borrowed type can be returned.
But unless we restrict this discussion to a setting where we can manage memory ourselves (in particular, not when generating JavaScript), we would need to generate a specialized version of every such function which "accepts the state" of p rather than a regular object (that is, it accepts several formal parameters rather than p), and this would have implications for tear-offs and dynamic invocations etc. And we'd still need to have function types whose parameter types can be borrowed (as well as all other contravariantly positioned types, e.g., the parameter types of a returned function type). So it would still incur a non-trivial language complexity cost.
Considering borrowed types, here is a description of the kind of red tape that Rust programmers get into if they want to assign one reference to another one: Medium: 'Mutable Reference in Rust'. I think we'd want to be very cautious about forcing Dart programmers into a situation where they need to solve that kind of puzzles.
@eernstg
forcing Dart programmers
I'm not sure how this can be described this way, given that you can continue using the existing language as you do today. The idea is to offer this as an advanced language feature that never "leaks" across API boundaries. That is, non-advanced users should never be exposed to this. The issue with Rust (and other non-GC non-RC languages, like C, C++) is that the language allows this complexity leak out to public API.
Consider, for example, the amount of bit-twiddling complexity that goes on in the Dart VM to make strings and ints fast. Does that complexity leak onto the Dart programmer? No. The API looks very natural and "Darty" to use. That should also be a constraint on the feature I'm proposing.
Is it possible in dart to have non-paused GC like in golang?
The GC in Dart is concurrent and mostly non-pausing as far as I know.
The GC in Dart is concurrent and mostly non-pausing as far as I know.
Thanks @munificent. Will there be a time in coming years when Dart will be modified to have same performance level(in terms of memory footprint, GC optimization, Lighter isolates etc) as Golang or more performant server side languages. I think that will be a crowd puller to develop the open source server side ecosystem libraries like load balancer, microservice registry, rate limiter etc. Please ignore if my question seems silly.
Will there be a time in coming years when Dart will be modified to have same performance level(in terms of memory footprint, GC optimization, Lighter isolates etc) as Golang or more performant server side languages.
This question assumes a premise that it's not already at the same performance as those languages, but that premise requires some sort of agreed-on benchmarks or performance data, and I don't think that consensus exists. There's no context-free way to say language X is faster than Y. Instead it's language X is faster than Y for assumed-equivalent programs Zx and Zy.
I think for many programs, Dart is plenty fast enough to be used as a back-end language. (For example, all of our own tools are written in Dart, including the compilers, package manager (both client and server), etc.)
Will there be a time in coming years when Dart will be modified to have same performance level(in terms of memory footprint, GC optimization, Lighter isolates etc) as Golang or more performant server side languages.
This question assumes a premise that it's not already at the same performance as those languages, but that premise requires some sort of agreed-on benchmarks or performance data, and I don't think that consensus exists. There's no context-free way to say language X is faster than Y. Instead it's language X is faster than Y for assumed-equivalent programs Zx and Zy.
I think for many programs, Dart is plenty fast enough to be used as a back-end language. (For example, all of our own tools are written in Dart, including the compilers, package manager (both client and server), etc.)
Are the benchmarks data in following observations qualify for the consensus - https://programming-language-benchmarks.vercel.app/dart-vs-go and https://www.techempower.com/benchmarks/ (I remember seeing Shelf/aqueduct in this but not able to find it anymore. They mention in the git - https://github.com/TechEmpower/FrameworkBenchmarks that they have benchmarked Dart as well - https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Dart/dart2/server.dart). Also once Serverpod will release their throughput benchmarks(https://github.com/serverpod/serverpod/discussions/598) probably that will give a better insight and a yardstick to get behind. Thanks for taking out time to reply.
[Edit: Adjusted
DateTime
examples to create a situation which is directly relevant for a 'static class'.]There are some situations where an object is created in order to hold some data, and some instance methods are invoked on that object in order to compute results or perform actions based on that data. But the object itself is not needed for any other purpose than holding the given data during those method invocations. For instance:
(Examples derived from the DateTime DartDoc and an example in Seth's blog entry on strings in Dart).
If such objects are indeed never needed at all (except as storage locations for the data included in the construction) then we could allow an optimization whereby the allocation of a new object is eliminated, the data is stored in the run-time stack, and the computations are performed in a top-level function.