Value fields - Githubissues

dart-lang / language

Design of the Dart language

Other

2.65k stars 202 forks source link

Value fields #3332

Open leafpetersen opened 1 year ago

leafpetersen commented 1 year ago

Field reads cannot in general be promoted based on null checks or instance checks because stability of the result cannot be guaranteed. Older overview summary here. Various proposals have been floated to resolve this based on either providing a way for APIs to opt into promotion (e.g. stable getters) or by providing better ways to bind a local variable (search for the label field-promotion for some discussion issues).

This issue sketches out an approach based on adding a new modifier on fields which enforces that the field has value semantics (in the sense that a read of the field is guaranteed to be idempotent). This is enforced by disallowing overriding a value field with a getter.

To enforce this, we add a new member modifier val which can be used to modify a field in place of final.

class A {
   val int x;
   A(this.x);
}

A val field has the same semantics as a final field, with the following modification:

It is a static error to override or implement a val field with anything other than another val field.
It is a static error to override or implement a val field with a noSuchMethod forwarder.

As with final fields, a val field may be declared late.

A val field may implement/override a final field or a getter.

A read of a val field is subject to promotion.

cc @dart-lang/language-team

jakemac53 commented 1 year ago

Possibly obvious, but I do want to point out that these value fields would also effectively not be mockable. In general they shouldn't be mocked anyways, so that is a bonus in my book. But it could make migrating to a value field harder for certain use cases. Or some people might not use them because of it.

jakemac53 commented 1 year ago

Are we sure that val should imply final? I understand that we would still only get promotion for final val fields, but there might be some value in the ability to enforce no overridden setter/getter, for non-final fields.

leafpetersen commented 1 year ago

Possibly obvious, but I do want to point out that these value fields would also effectively not be mockable.

This is also true of stable getters, by the way. But yes, this is a concern. They are actually mockable if you can generate a field in the mock with an initializer or a constructor, but you cannot programmatically set the value. But I think this is an essential component of any proposal in this space. If you can programmatically mock a property you by definition can't soundly promote it.

Are we sure that val should imply final? I understand that we would still only get promotion for final val fields, but there might be some value in the ability to enforce no overridden setter/getter, for non-final fields.

Interesting. My initial take is that I'd hate for people to be writing final val SomeType. final is already kinda verbose. Do we have any examples of good use cases? It feels a bit stretched to me.

jakemac53 commented 1 year ago

Do we have any examples of good use cases? It feels a bit stretched to me.

I think for any data class like object it would be a nice property to enforce, as it enforces a simpler mental model. You always know exactly what a call to the getter/setter of the fields are doing, just a simple assignment or read. And not all of these objects are immutable (although that is a common practice).

I do also wonder if it could make the implementations better, although today I assume it's fairly easy to discover fields that never have a custom getter/setter and optimize them. If that does result in any performance or code size benefit, it would give users a way of enforcing that they always get those benefits.

Interesting. My initial take is that I'd hate for people to be writing final val SomeType. final is already kinda verbose. Do we have any examples of good use cases? It feels a bit stretched to me.

Definitely agreed, just val is much nicer than final val.

leafpetersen commented 1 year ago

It's worth mentioning that I don't actually think that overriding is really what users mostly want here. I suspect that a better (but possibly much more complex to design, use, specify and implement) variant of this would be one in which the field is non-virtual (but implementable?), but the initializer is virtual. That is, I suspect that most uses of overriding fields are not because users want to have two different slots in the object, but rather because users want to change the initialization of the field in some way. So perhaps a different take on this would be to forbid overriding of the field, but to provide a way to override the initialization of the field. It's not immediately clear what the right syntax for overriding the initializer is though. What do you do about this parameters? And what about initializer lists? And what do accesses through super do?

leafpetersen commented 1 year ago

think for any data class like object it would be a nice property to enforce, as it enforces a simpler mental model.

This is fair. In terms of syntax, I guess var int x is available. You'd always have to specify the type though.

Maybe go the other way: mutable val to make a mutable semi-virtual field?

jakemac53 commented 1 year ago

It's worth mentioning that I don't actually think that overriding is really what users mostly want here.

100%, it is honestly surprising to me that we allow overriding fields at all, it is super weird imo, and most likely it's just a footgun. I can't imagine any circumstance in which I would approve any code that does this. Put it on my list of Dart 4 breaking changes, to disallow it entirely (I assume it is allowed for field vs getter/setter symmetry reasons, but I think that is a mistake, we constantly make compromises to maintain that, and derive little benefit imo).

Honestly, we should probably have a lint for this.

I suspect that a better (but possibly much more complex to design, use, specify and implement) variant of this would be one in which the field is non-virtual (but implementable?), but the initializer is virtual

I don't think it is necessary to support the virtual initializer. If a class wants its subtypes to be able to customize the initial value of a field, it should expose a constructor parameter to do so. If anybody really wants to support this pattern of changing the initial value by overriding the old field with a totally new field, well they can use regular fields and not get promotion. I don't see an issue with that. I highly doubt it is common, and would be pretty horrified to learn otherwise :).

lrhn commented 1 year ago

100%, it is honestly surprising to me that we allow overriding fields at all, it is super weird imo, and most likely it's just a footgun.

It's true that the getter/field symmetry is mainly about different implementations of the same interface using different strategies, or the same class changing strategy over time, not so much that the same class hierarchy overrides a field with a getter. (Although, overriding a getter with a field seems fine.)

It does happen, to add some side effect and then forward to the super setter (usually) or getter. It's not shadowing as much as it's elaborating on top of the existing field. Or adding a throwing setter to hide the mutable field (hopefully rare, it's better to have no setter at all then.)

This comes back to my arguments for wanting getters over fields. It's about modelling state. A getter has no state, a field does. Having two states for the same name is what's weird, even from a state modelling perspective.

So it's not about overriding fields being weird, as much as it's overriding fields with fields. That is weird. Overriding with a getter or setter that does something more is perfectly reasonable (IMO).

Maybe because I don't assign any external semantic meaning to something being a field. It's an implementation choice for how to implement a getter, and a design choice about the object's internal state. Both are internal choices that do not affect the API.

But that's also what makes it harder to prohibit someone in another library from overriding a field with a field. They shouldn't know that they are overriding a field. And if they aren't today, it shouldn't become a breaking change for the superclass to change their get-getter to an implicit field getter. Instead they should worry that they're hiding something that the superclass exposes, making some of its potential internal state unnecessary, however it's implemented. It's a design and modelling issue, not an implementation issue.

(I assume it is allowed for field vs getter/setter symmetry reasons, but I think that is a mistake, we constantly make compromises to maintain that, and derive little benefit imo).

I disagree that we derive little benefit. If we didn't have that, I promise you that every class I wrote would use the private-field/public-getter pattern for every exposed field. And I'd not be alone in that. The Java lesson is real, if we make "fields" special, and limited, then the prudent approach is to never expose one in API you might, ever, hypothetically, want to change to be a getter. That, or make the class final, which is also prudent.

jakemac53 commented 1 year ago

Sorry I wasn't super clear, I specifically meant that overriding fields with fields is the likely footgun. I think that overriding a field with a getter/setter is perfectly fine (and normal), as long as that eventually calls the super getter/setter.

The Java lesson is real, if we make "fields" special, and limited, then the prudent approach is to never expose one in API you might, ever, hypothetically, want to change to be a getter.

If it's only overriding fields with fields that we can agree is "bad" - then you can always change a field to a getter still. But the other way around would potentially be breaking. The "simple" field pattern we have today is not lost though, there is no need to do the private field + public getter.

eernstg commented 1 year ago

I'd love to start using val to express the API level property that a given getter returns the same value if it's invoked multiple times. It's just like starting afresh with the decision to call it final (and then regretting for a decade that it's 5 characters rather than 3), and getting it right this time (with respect to the guarantees actually delivered to clients).

So that's great! ;-)

@jakemac53 wrote:

value fields would also effectively not be mockable

I think it's possible to deal with them in a mocking context: Lots of mock classes contain generated code today, so let's consider a situation where some amount of code generation can be part of the solution.

If you want to mock val T x; such that it has the value v then you can generate code in the mock class such that it has a declaration of the form late val T x = e;, and then you make sure that e evaluates to v at the time where you invoke the corresponding getter x for the first time.

For example, e could be a library variable of type dynamic, and then you could use when and then invocations to assign the desired value to e, and then evaluate that getter.

This means that we can declare a member in the mock which is stable (if we hadn't been able to do that then the mock couldn't even have the desired type). Still, we can determine per-object and rather late which value it should return.

@leafpetersen wrote:

also true of stable getters

The same technique would be applicable with stable getters.

@lrhn wrote:

The Java lesson is real

I'd also assume that the Java community would have stopped saying "don't use public fields, declare a getter and a setter!" many years ago, if it had been such an unnecessary and useless rule.

So I'd very much support the idea that it should be possible to override a val T v; by a getter that returns a stable expression, which could for instance be super.v, but it could also be a constant or any of the other stable expressions.

lrhn commented 1 year ago

About mocking value fields, we can define that the noSuchMethod-forwarder for such a member has the form:

late val type name = this.noSuchMethod(Invocation.getter(#name));

A late val variable should be possible, with and without initializer, the latter meaning it has a setter which can be called once, just like a late final variable. It really is exactly like a final variable with the extra restriction that it cannot be overridden with something non-stable. A val variable only ever has one value, and if it returns a value, it will return the same value every time in the future, making caching safe. A late val variable may throw until it gets that value.

About Java, Dart has one difference from Java's field/getter-method in that it's possible to change a stable field to a non-stable getter without all uses needing to change. It's only necessary to change the calling code if it relied on stability for promotion, and if it didn't, the syntax keeps working. Java requires you to change a .foo to .getFoo() to make the migation, which makes the migration much harder. (C# has managed to put itself in the middle, but because of capitalization tradition, you'd still have to change .field to .Field.)

munificent commented 1 year ago

100%, it is honestly surprising to me that we allow overriding fields at all, it is super weird imo, and most likely it's just a footgun. ... Honestly, we should probably have a lint for this.

https://dart.dev/tools/linter-rules/overridden_fields

But that's also what makes it harder to prohibit someone in another library from overriding a field with a field. They shouldn't know that they are overriding a field. And if they aren't today, it shouldn't become a breaking change for the superclass to change their get-getter to an implicit field getter.

I think this is a really good insight.

It's handy today that superclass authors can change getters to fields without that necessitating a breaking API change. At the same time, I suspect that when a base class changes a getter to a field, it is likely also changing the constructor signature too, so there probably is a breaking change anyway.

Maybe we're looking at this the wrong way: The concern here is that changing a getter to a field in a subclass shouldn't be a breaking API change. And if it's an error to override a field with a field, then that would be a breaking API change. But only if the base class getter/field can be overridden in the first place. How often does the superclass author even want to allow it to be overridden in the first place?

We pay such a large price for all instance members always being polymorphic. There is a lot of value in the flexibility it enables, but it feels very much like dynamic typing to me: We pay the cost of it everywhere even though it's actually used in relatively few places.

I wonder if that's the real part of the language we should be looking at. If a class author could declare a field/getter/method as non-overridable, then at that point it doesn't matter whether they change it from one kind to another.

(C# has managed to put itself in the middle, but because of capitalization tradition, you'd still have to change .field to .Field.)

Not just capitalization. The difference between a property and a field is visible in the CLR bytecode, so while changing a field to a property or vice versa isn't a syntax breaking change (assuming you keep the same capitalization), it's a binary breaking change if users are using your pre-compiled .NET assembly. They will have to recompile their code against your new API.

jakemac53 commented 1 year ago

https://dart.dev/tools/linter-rules/overridden_fields

Isn't the very existence of this lint, in the recommended set, a pretty good indication that in the real world people aren't actually overriding fields?

I checked internally and all the ignores of this lint are imo just unambiguously bad. They are overriding things when they should instead be passing a value to the super constructor, or possibly just doing an assignment in the constructor if the super constructor really doesn't want to support it (probably the case for this flutter test here).

munificent commented 1 year ago

https://dart.dev/tools/linter-rules/overridden_fields

Isn't the very existence of this lint, in the recommended set, a pretty good indication that in the real world people aren't actually overriding fields?

Yes, good point!

lrhn commented 1 year ago

The problems with using restrictions, like final classes or non-overridable members, to enable other features, like promotion or just being able to make changes without breaking other people's code, is that it's a trade-off.

At least for package authors. Application authors are only sharing code with themselves, and can be more lenient.

When a restriction enables you to make changes, the trade-off favors maximal restriction, to the point where you're shooting yourself in the foot of you don't restrict as much as possible. You can always loosen up later. If the default is open, were making that hard to do, and easy to fail.

But if a restriction enables other people to do something, like promotion, it's working against you being able to make changes, which would disable those features again. So, to avoid foot-gunning, you'd want ways to prevent even giving those features to other people to begin with, unless you've actively decided that is a deliberate feature of your API. The incentives for an author is against enabling such features, unless it's, after thorough consideration, considered worth it.

Authors want to optimize for maximal flexibility for themselves, and not grant any ability to others by accident, of that can turn into a problem later. I know I do.

Guess I'm saying that we should have a way to make closed the default. Not for everybody, but perhaps per library. final library;! That would make the defaults match the authors incentives, at least authors like me.

eernstg commented 1 year ago

First I should mention that I think value fields will provide support for API level stability. That's great! Also, that's the important part, and then we can always discuss the details (like whether or not to include support for stable get getters).

But I disagree about the absolute nature of some statements in this thread. For example:

@lrhn wrote:

Authors want to optimize for maximal flexibility for themselves ... I know I do.

But then every return type must be Object? because that reserves maximal flexibility for authors to return whatever they want. And every parameter type must be Never, because expressions of that type can be used in every way you'd like in the body of the method. And every class/mixin must prevent subtypes (using final, sealed, whatever it takes) because subtypes are highly breakable. And every member must be private because private members can be added, removed, or modified, and no code in other libraries will break. Not to mention that every top-level declaration must be private for the same reasons. And so on.

Obviously, it doesn't make sense to say in such absolute terms that software authors/maintainers wish to take away every possible affordance from clients in order to maximize their own flexibility.

It is always a trade-off between providing entities offering actual and useful affordances, and declaring constraints on those entities, constraining the ways in which a client can use it, or constraining the properties of the declared entity (and hence constraining the changes that the authors/maintainers can make without violating those constraints).

The constraints generally provide affordances in client code and/or in the declaration itself: A sealed type provides exhaustiveness, and a base type ensures inheritance (if maintained by other declarations in the same library). A return type guarantees that a certain interface is available with the returned object, and similarly for parameter types and actual arguments. Non-nullable types ensure that some null related dynamic errors cannot occur.

The real question is whether or not any given trade-off is useful.

My take on stability is that it is useful because it simplifies reasoning about code, and allows developers to focus much more on those (few) properties that actually vary over time. Promotion is just one example where the analyzer/compiler can make good use of the provided guarantees (again: the code allows for improved reasoning).

You could say that "static analysis is about types, and stability is not a type". However, we have actually changed the scope of static analysis several times over the years:

Nulllability introduced a distinction between null and non-null in the static analysis.
Local variable promotion was enhanced, introducing a large amount of flow analysis into the static analysis.
Flow analysis was used to ensure soundness as well ("cannot complete normally").
Type inference was introduced (and enhanced several times), introducing the notion of computed types in a program.
Sealedness introduced a library-level completeness property ("we know all of them").
Interface/base/final classes introduced a kind of meta-type on classes (extends C is only allowed if C has a meta-type that includes the capability "can be a superclass").

In other words, the kinds of information that the Dart static analysis is generating and processing has been expanded several times over the years, and the outcome has been an enhanced set of constraints and affordances, in declarations and in clients of declarations.

Adding stability as an API property would indeed be yet another extension of the scope of static analysis in Dart, but it's not the first one, and there will probably be others as well.

cedvdb commented 1 year ago

Probably dumb question: What is the reason for allowing non final overrides of final instance fields ? Could it be automatically promoted if that wasn't the case ?

class Other {
  final int? a = 4;
}

class Another extends Other {
  @override
  int? a = 5;
}

lrhn commented 1 year ago

What is the reason for allowing non final overrides of final instance fields?

The Dart object model's encapsulation is based on interfaces.

All a class knows about a class from another library is the interface it exposes (including which super-interfaces it implements), and maybe which public constructors are generative (because if you extend the class, your constructor needs to call their constructor).

In an interface, there is no such thing as a final field. A final instance variable introduces a getter into the interface, and no setter. But so does a getter declaration. There is nothing in the interface which distinguishes the two. That's deliberate. The entire point of having getters which can be accessed the same way as fields, is that you can change between being a field and being a getter, without anyone else needing to know.

Because of that encapsulation, it makes no sense to even ask whether a class is overriding a final instance field of a superclass, because it doesn't know whether the superclass has a final instance field.

We could make that information part of the interface, but at that point it would become a breaking change to change int get x => 4; to final int x = 4;, and maybe even vice versa. It would break getter/field symmetry. Which would (IMO) encourage people to write final int _foo; int get foo => _foo; to avoid revealing that their property is based on a field. That pattern is exactly what the Dart getters were designed to avoid, so it would be entirely counterproductive to break that symmetry.

(I'm starting to think that we're looking in the wrong direction with field-promotion and the generalization of stable-getter-chain promotion. The entire idea that you can do if (notLocalVariable != null) .... notLocalVariable.someMethod(); is misguided in a language with sound null-safety. It comes from languages which are not null-safe, where you are guarding yourself logically against a null error, even if it might not be guaranteed. I rarely see someone trying to do if (notLocalVariable is int) ... notLocalVariable.toRadixString(16), because non-null-safe languages wouldn't allow that either. We should instead be guiding people towards patterns that actually work in a null-safe language, and make sure we support such patterns efficiently and concisely. Currently if (notLocalVariable case var v?) v.someMethod() is one approach. It requires giving the variable a name. An if-variable like if (var notLocalVariable != null) notLocalVariable/*it is, though*/.someMethod() could allow you to easily reuse the existing name. A guard-let could reduce the clutter of an if when the else branch would bail out anyway. But promoting arbitrary expressions is trying to solve a problem in a way that would have worked in a non-null-safe language, and which already has a solution in null-safe Dart. We should just make that solution more streamlined, if necessary, and then push people towards it.

Also, any time I see if (foo.bar.baz != null) { foo.bar.baz.someMethod(); }, I can't help ask "Why are you doing foo.bar twice? Why are you doing .baz twice? Does time have no meaning to you?!?" We'd be helping those people by pushing them towards if (foo.bar.baz case var baz?) { baz.someMethod(); }. And helping me when I read that code.)

eernstg commented 1 year ago

@cedvdb wrote:

What is the reason for allowing non final overrides of final instance fields ? Could it be automatically promoted if that wasn't the case ?

I'll add a comment here because I think a part of the story is missing. Dart currently allows the getter of a final instance variable to be overridden by the getter of a non-final instance variable, and this does indeed prevent promotions. If promotions did take place then we could have this scenario:

class A { final num n; A(this.n); }
class B implements A { num n; B(this.n); }

void main() {
  B b = B();
  A a = b;
  if (a.n is int) { // Assume that this promotes `a.n` to `int`.
    b.n = 1.5; // Static analysis cannot know that `a` and `b` is the same object.
    a.n.isEven; // Run-time error: `double` does not have an `isEven`!
  }
}

Aliasing (such as whether a and b refers to the same object) is not a decidable problem in general, so we can't hope to detect that a statement like b.n = 1.5 may invalidate the promoted type of a.n. Another undecidable problem is what code we'll execute when an instance member (indeed any instance member on any receiver) is invoked, so we could run a statement like e.n = 1.5 somewhere else, and e might evaluate to the same object as a once again. In general, we just can't promise anything about the stability of the value of a mutable instance variable.

So if we wish to ensure that a final instance variable is promotable then we cannot allow its getter to be overridden by a getter that may return different values on different invocations.

This is exactly what val does:

Arguably, the modifier final on an instance variable is a design flaw in the language because it's an implementation detail that clients must ignore. They can see that this is a final variable, but that doesn't buy them anything at all because it could be overridden by any kind of getter. It doesn't even say whether or not there's a corresponding setter, because that setter could be declared separately.
In contrast, val is the new final that actually means something: The getter of a val variable can be evaluated multiple times, and it will always return the same result when called on the same receiver. We achieve this guarantee by putting constraints on the allowable implementations/overrides of the getter of that variable.

When stability is an API property it becomes a design decision: "I'm writing this new class, and it has a property foo. By design, that property is immutable (that is: stable). Hence, I'm willing to use a val modifier on the declaration of foo." This implies that clients can rely on this property being immutable. For instance, they can safely cache it, and it can be promoted.

Of course, it's possible that nobody will ever use val because everybody insists that "Some subtype will want to return different values from the foo getter now and then". However, this is also a trade-off because this means that clients will have to call foo again and again if it's important for correctness that they do something in the case where the value changes. In short, stable/immutable properties are simpler to use correctly in client code than mutable ones.

It's similar to return types: If you declare that an instance method bar has return type String then you have promised (on behalf of every subtype as well as yourself) that bar will return a String. If you want to insist on the ability of subtypes to return null (or 42, or whatever) then you should not commit to that return type. For optimal flexibility you should use Object? or dynamic as the return type of every method and function in your program. Now you have the full freedom to change what you're returning, and so do all overriding/implementing declarations (where the return type would of course also be rather general).

But we don't do that, we do commit to many, many promises all the time when we're declaring APIs. Hence, it's not always better to reserve more freedom for implementers/overriders, and give fewer promises to clients. It's a deeply non-trivial trade-off, and your clients won't be happy if you insist on never promising anything!

cedvdb commented 1 year ago

Edit: after thinking about it I do not stand behind this comment, val with stable getters would tick most boxes for me.

Original:

The idea of having final and val in the same language introduces the following disadvantages for me:

it seems weird that there would not be a val keyword for variable declaration. If there was it's unclear what the difference would be with final
it introduce a choice, I can spend an unhealthy amount of time pondering stuff like this. (Edit: If val is interchangeable with stable getters however that choice is pretty clear cut to me.)
it makes if promotion code behave differently as you may or may not have to redeclare a local variable.
I'm worried about all the different "styles" of declaring a field. Some people will be advocating to stay on the safe side you-might-need-a-getter everywhere and others no-you-dont-you-should-go-for-automatic-promotion everywhere.

All in all, I prefer the language team to make that choice for me and final is already established. I wouldn't mind having val instead.

it simplifies reasoning about code

Beside the compiler, why is that the case for an end user ? How does the end user even have this information without opening the API source code ?

eernstg commented 1 year ago

it simplifies reasoning about code

... How does the end user even have this information without opening the API source code ?

Do you need to check all implementations of operator == in order to know that it returns a bool? Nope, we just declare that return type and then we leave it to the compiler to make sure that it's true at run time.

Similarly, why would you need to check implementations to see whether it's true that a property is immutable, if the compiler performs a strict and sound static check to verify that the explicitly declared immutability actually holds?

I believe the real question is whether or not immutability is useful enough to make it an API property (explicitly declared and backed up by a sound static check). My gut feeling is that it is sufficiently useful, because it systematically and pervasively reduces the complexity of coding. If many properties are explicitly declared to be immutable then we can put a much stronger focus on those much fewer remaining ones which may change at any time.

cedvdb commented 1 year ago

immutability is useful enough to make it an API property (explicitly declared and backed up by a sound static check).

Since I was negative in my last comment I feel the need to say that I'm with you there, val interchangeable with stable getters (didn't know this was a thing) brings the best of both worlds to me. I'd remove final though.