dart-lang / language

Design of the Dart language

A post constructor call-back (finalizer) would be a nice addition #3745

Open escamoteur opened 7 months ago

escamoteur commented 7 months ago

While adding some mixins to our project, I realized that you quite often want an initialization action to run on a mixin or a parent class that expects some of its members to be overridden by inheriting classes. Currently, the only way to do this is to make the class whose constructor is executed last call an init function on the parent classes/mixins.

So it would be nice if classes could have a finalizer function, with the compiler ensuring that all finalizers are called once the object is fully created (all constructors have executed). That would make mixins safer to write, as you wouldn't have to rely on anyone calling their init function.

lrhn commented 7 months ago

I'd rather give mixins constructors. Something like https://github.com/dart-lang/language/issues/1605#issuecomment-1543650950

That's how you initialize an object. Doing it after the object has been created risks a worse design (fields that should really be final are not) and is error-prone (the object exists in an inconsistent, not fully initialized state).

escamoteur commented 7 months ago

OK, let me elaborate on why I think a constructor isn't able to do the same:

mixin mixinA {
  late int i;
  void init() {
    i = getValue();
  }

  int getValue();
}

class A with mixinA {
  A() {
    init();
  }
  @override
  int getValue() {
    return 10;
  }
}

class B extends A {
  B() {
    init();
  }
  @override
  int getValue() {
    return 220;
  }
}

Since we have no way to pass values from a class constructor to a mixin, the only way to get values into a mixin is to call an init() function inside the mixin, either passing in values as parameters or letting init() call getters or functions that are defined in a parent type of the mixin or, which is actually pretty neat, declared as abstract functions inside the mixin that have to be implemented by any class that uses the mixin, as in the example above. This even makes it possible to access values in classes further down the inheritance hierarchy.

The problem, however, is that this init function has to be called somewhere, and depending on which inheriting class's constructor calls it, the mixin might receive different values. Since the overridden operations are always the ones that get called (unless they include a call to super), it would make sense in my opinion to guarantee that the most-derived override is also the one used to initialize the mixin.
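A minimal sketch of the situation described above, assuming the mixin is renamed MixinA and given a hypothetical initCalls counter (not part of the original example) purely to make the repeated call visible:

```dart
mixin MixinA {
  late int i;
  int initCalls = 0; // hypothetical counter, added only for this demonstration

  void init() {
    initCalls++;
    i = getValue();
  }

  int getValue();
}

class A with MixinA {
  A() {
    init();
  }

  @override
  int getValue() => 10;
}

class B extends A {
  B() {
    init(); // runs in addition to the call made by the inherited A() body
  }

  @override
  int getValue() => 220;
}

void main() {
  final a = A();
  final b = B();
  print('${a.i} after ${a.initCalls} init call(s)'); // 10 after 1 init call(s)
  print('${b.i} after ${b.initCalls} init call(s)'); // 220 after 2 init call(s)
}
```

Creating a B runs init() twice, once from each constructor body, and both calls dispatch to B.getValue. Nothing in the language stops a further subclass from forgetting the call entirely.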

If the init function were called by a constructor of the mixin, the question would be: when is that constructor called while creating an instance of class A or of class B? Furthermore, if there is more than one mixin, in which order would their constructors be called? If we treat mixins like normal classes, the mixin constructor would be called before the constructor of A, which would mean that A isn't yet created, so getValue could not be called from the constructor. If the mixin constructor were called after the constructor of A, it could call getValue, but which getValue would it call, the one of A or the one of B? And if it called the one of B, would the constructor of B already have executed by then?

With a finalizer() method defined in the mixin this would be clear: it would always be executed only after all constructors have run, and it would always call getValue of B if the instance is of type B.

This would also allow base classes to be initialized by calling overridden functions/properties of derived class instances, with the guarantee that the constructor of the derived class has already run, which is currently not possible.

As finalizers should only ever update the class where they are defined, the order in which they are called doesn't matter.

The argument about final applies equally today if you try to assign final variables inside a constructor body: it's not possible, which I find is a flaw, as it requires them to be made late. We could imagine allowing write access to final properties exactly once, inside either the constructor or a finalizer. As for objects that are not fully initialized, we already have that problem right now with late properties, and currently we can't even ensure that the init function of the mixin is called at all.
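To illustrate the point about final fields, here is a small sketch using a hypothetical Point class (not from the discussion). A plain final field has to be set in the initializer list, whereas a late final field can be assigned once from the constructor body, with the single-assignment rule checked at run time rather than compile time:

```dart
class Point {
  final int x; // must be set before the constructor body runs
  late final int y; // may be assigned exactly once, later

  Point(int value) : x = value {
    y = value * 2; // allowed in the body only because y is late
  }
}

void main() {
  final p = Point(3);
  print('${p.x} ${p.y}'); // 3 6

  try {
    p.y = 0; // second assignment to a late final field fails at run time
  } catch (e) {
    print('second assignment rejected');
  }
}
```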

Overall, a finalizer() would make it possible to implement fully self-contained mixins that can be initialized reliably.

lrhn commented 7 months ago

If we add constructors to mixins, you would also get a way to call them (and probably be required to call them) in the subclass constructor. Obviously if getting the value to initialize with is an instance method call on the same object, you can't do that until after initializers have run.

First of all, today I'd write the example above as:

mixin mixinA {
  late int i = getValue();
  int getValue();
}
class A with mixinA {
  @override
  int getValue() => 10;
}
class B extends A {
  @override
  int getValue() => 220;
}

It won't initialize the field as eagerly, but it's a late field, so that shouldn't matter.
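A runnable sketch of that laziness, assuming a hypothetical top-level log list and a constructor body on B added only to record the order of events:

```dart
final List<String> log = []; // hypothetical event log for the demonstration

mixin MixinA {
  late int i = getValue(); // evaluated on the first read of i, not at construction
  int getValue();
}

class A with MixinA {
  @override
  int getValue() {
    log.add('A.getValue');
    return 10;
  }
}

class B extends A {
  B() {
    log.add('B body');
  }

  @override
  int getValue() {
    log.add('B.getValue');
    return 220;
  }
}

void main() {
  final b = B();
  log.add('constructed');
  print(b.i); // 220
  print(b.i); // 220 (the initializer does not run a second time)
  print(log); // [B body, constructed, B.getValue]
}
```

As long as no constructor body reads i, getValue runs only after every constructor body has finished, and it is always the most-derived override.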

If I had mixin constructors (the way I envision them, which I think is in the link I gave), I'd probably write it as:

mixin MixinA {
  int i;
  MixinA(this.i);
}
class A with MixinA {
  A() : this._(10);
  A._(int i) : MixinA.super(i), super();
}
class B extends A {
  B() : super._(220);
}

or something like that. If you are designing the classes together, you can use private constructors to hide the value passing. If not, you do rely on class A allowing you to parameterize the value somehow.

If you want to keep the current behavior as close as possible, then maybe I'd write it as:

mixin MixinA {
  late int i;
  MixinA() {
    i = getValue();
  }
  int getValue();
}
class A with MixinA {
  // Implicit default constructor in the presence of mixins becomes:
  // A() : MixinA.super(), super();
  @override
  int getValue() => 10;
}
class B extends A {
  // Implicit default constructor is the usual one:
  // B() : super();
  @override
  int getValue() => 220;
}

The body of a mixin constructor is a function body that gets called after the object was created, before the object construction expression returns.

If we treat mixins like normal classes the mixin constructor would be called before the constructor of A which would mean that A isn't yet created, so getValue could not be called from the constructor.

That's not how constructor bodies work in Dart. The initializer lists do work that way, you cannot access the this object, but constructor bodies are run after the object has been created and validly initialized. They can do anything to the this object that you can do from the outside. Here the A constructor body will be run after the mixin constructor body, but both after the object has been created and getValue is available.

If the mixin constructor would be called after the constructor of A it could call getValue but which getValue would it call? The one of A or B, and if it would call the one of B, has the constructor of B then already executed?

This is Dart, not C++ or Java. If you have access to the object, through this or any other reference, then the object is complete. There is no situation where calling getValue() on an object created using new B() would not call the getValue method of B (well, except class B doing super.getValue() deliberately, which only it can do).

Generally when the constructor body of class X runs, it can assume that the object has been created and initialized, and that the bodies of all the superclass constructors have already run. The object is fully formed, and post-initialization operations of all superclasses are done.

escamoteur commented 7 months ago

This is Dart, not C++ or Java. If you have access to the object, through this or any other reference, then the object is complete. There is no situation where calling getValue() on an object created using new B() would not call the getValue method of B (well, except class B doing super.getValue() deliberately, which only it can do).

But when an instance of B is created and the ctor of A calls getValue(), it will call B.getValue, but at that point the ctor of B hasn't run yet, so B might not be fully initialized.

Actually, your suggestion of using a late property inside the mixin to trigger an init function, so that the init function is executed after the instance has been completely created (given that the property isn't accessed by any ctor), solves what the finalizer was meant for. However, using late to execute an init function that could have side effects is pretty hidden, which is why I still think a finalizer, or maybe a "late constructor", would make this more explicit.

lrhn commented 7 months ago

But when an instance of B is created, if the ctor of A is calling getValue() it will call B.getValue but at this point, the ctor of B hasn't run yet, so B might not have been fully initialized.

True. If the B constructor body needs to run to ensure that the object is fit to be used, then prior constructors cannot use the object. That's a bad state to be in. The object exists, but isn't usable. That's why it's so much better to initialize things in the initializer list if at all possible. And if a superclass needs/wants to call a method introduced by a subclass, and one that only works after the subclass constructor body has run, then it's not the subclass's job to do the calling.
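A small sketch of that hazard, using hypothetical fields (name, tag, log) and a describe method that are not part of the thread's example. The value set in the initializer list is already visible when the superclass body runs; the value set in the subclass body is not:

```dart
class A {
  final List<String> log = [];

  A() {
    // Dispatches to B.describe when the object being created is a B.
    log.add('A body sees ${describe()}');
  }

  String describe() => 'A';
}

class B extends A {
  String? name; // assigned in the body, so still null while A's body runs
  final String tag; // assigned in the initializer list, so already set

  B() : tag = 'ready' {
    name = 'b';
    log.add('B body sees ${describe()}');
  }

  @override
  String describe() => 'B(name: $name, tag: $tag)';
}

void main() {
  for (final line in B().log) {
    print(line);
  }
  // A body sees B(name: null, tag: ready)
  // B body sees B(name: b, tag: ready)
}
```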

Dart constructors have two phases: Subclass-to-superclass initialization in the initializer list, superclass-to-subclass post-initialization operations in the body.
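The two phases can be observed with a short trace. This is only a sketch: the trace list and the note helper are hypothetical, introduced so that initializer-list expressions leave a visible record.

```dart
final List<String> trace = []; // hypothetical record of execution order

// Helper so that an initializer-list expression can log when it is evaluated.
int note(String label) {
  trace.add(label);
  return 0;
}

class A {
  final int a;

  A() : a = note('A initializer') {
    trace.add('A body');
  }
}

class B extends A {
  final int b;

  B() : b = note('B initializer'), super() {
    trace.add('B body');
  }
}

void main() {
  B();
  print(trace); // [B initializer, A initializer, A body, B body]
}
```

Initializer lists run from the subclass up to the superclass, then the bodies run from the superclass back down to the subclass.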

You're asking for a third phase, run after all bodies have executed. I don't see that happening. If your object initialization is so complicated that it needs three phases, you need to either reconsider some choices, or introduce an abstraction (say, only allow construction through a factory, which calls the constructor and the post-construct-setup before returning the object.)

It's a program design issue, suggesting that the modularization boundaries are not optimal, when information has to pass both ways between a superclass and a subclass.

That is not something that I'd want to complicate the language with, adding a third phase to all constructor invocations, because the vast majority of classes do not have the problem, or have found ways around it.

Consider something like:

mixin MixinA {
  late int value;
  void _init() {
    value = getValue();
  }
  int getValue();
}
class A with MixinA {
  final String something;
  A._(this.something);
  A(this.something) { _init(); }
  int getValue() => 10;
}
class B extends A {
  late int _theValue;
  int getValue() => _theValue;
  B._(super.something) : super._();
  B(super.something) : super._() { _theValue = 220; _init(); }
}

That is: Public constructors that call the inherited _init once when they are done, chaining to private constructors that just initialize and don't call _init.

This ensures all object constructions call _init as the last thing, once.

(Or alternatively, rather than putting the _init in the generative constructor body, have private generative constructors and public factory constructors, where the factory calls the generative constructor to create the object, then calls _init on it before returning the object.)
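One way the factory alternative might look, as a sketch under the assumption that all classes live in the same library (so the private names are visible to each other) and that _theValue gets a default of 0 instead of being late:

```dart
mixin MixinA {
  late int value;

  void _init() {
    value = getValue();
  }

  int getValue();
}

class A with MixinA {
  A._(); // private generative constructor: creates, never calls _init

  // Public factory: create the object, run _init once, return the object.
  factory A() => A._().._init();

  @override
  int getValue() => 10;
}

class B extends A {
  int _theValue = 0;

  B._() : super._() {
    _theValue = 220;
  }

  factory B() => B._().._init();

  @override
  int getValue() => _theValue;
}

void main() {
  print(A().value); // 10
  print(B().value); // 220
}
```

Because only the factories are public, code outside the library cannot obtain an A or B on which _init has not run, and _init runs after every constructor body has completed.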

There are designs that give you what you want, but it is a complicated thing that you want, and I'm OK with it requiring some design work. I don't see it as reason enough to add a language feature.

escamoteur commented 7 months ago

Looking at how complex the solution above is, compared to an additional third step that calls a clearly defined finalizer and expresses much more clearly what's going on, the advantages should be obvious. If most classes don't need it, they wouldn't have to use it.

lrhn commented 7 months ago

Sure, a third phase will solve this problem. Until someone comes up with a problem that requires a fourth phase. Eventually we'll want our generative constructors to be able to create cycles of objects, and have initialization that runs around the cycle until some fixed point of state is achieved. Or we can say that that's not the job of constructors.

We're past creating objects, a point we passed the moment we started executing bodies, because at that point the object was initialized as far as the language is concerned. We're well into implementing protocols for classes to communicate with each other up and down the subclass chain. There is no end to the possible complexity of that, which is why it's something to handle in code, not by adding specialized language features for each possible complication.

The example here is so simplified that I can come up with many different designs that solve the same issue. In a real situation, the logic to initialize the field is probably more complicated and local to the mixin. It just needs some values. And it's too soon to call _init until those values are available.

In many cases, you don't call init until much later (ngInit), because the values it requires are asynchronously produced, so even a third phase running right after the constructor won't solve that problem.

And we're back to it being so complicated that it needs to be solved in code, not language features, because there is not one size that fits all.

And another design, while I'm here:

mixin MixinA {
  int get value;
}
class A with MixinA {
  int get value => 10;
}
class B extends A {
  late int value;
  B() { value = 220; }
}

That is: Why does the mixin have to initialize a field, why can't it just define the field and let subclasses initialize it? (Again, because the example is simplified. But if you give me four real examples from different contexts, I'll bet they need at least 2.5 new language features to solve all of them.)

escamoteur commented 7 months ago

Why does the mixin have to initialize a field, why can't it just define the field and let subclasses initialize it?

Because that way the mixin can't enforce that subclasses will initialize it. By defining abstract getters, it can force subclasses to implement them. I can solve that by checking inside the properties of the mixin whether it has already been initialized and doing the initialization if needed. But again, that is not an expressive way to do it. (The same holds for any superclass.)
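The check-on-access workaround mentioned here could be sketched as follows, assuming a hypothetical private cache field _value in the mixin and keeping getValue as the abstract member that subclasses are forced to implement:

```dart
mixin MixinA {
  int? _value; // null until the first read of value

  /// Abstract, so every class applying the mixin has to provide it.
  int getValue();

  /// Initializes on first access, then returns the cached result.
  int get value => _value ??= getValue();
}

class A with MixinA {
  @override
  int getValue() => 10;
}

class B extends A {
  @override
  int getValue() => 220;
}

void main() {
  print(A().value); // 10
  print(B().value); // 220
}
```

This behaves much like a late field with an initializer, but spells the "initialize if needed" check out by hand, which is the lack of expressiveness being objected to.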

Honestly, I feel you are getting a bit polemical with your argument that the next one will require a 4th phase. I laid out why a third phase makes sense. A 4th doesn't make any sense.

lrhn commented 7 months ago

I can see that a third phase here will solve your problem.

I don't necessarily believe that your problem is common enough that it warrants a language-based solution. I'm also not sure that if it does, adding a third phase to constructor invocations would be the solution I'd go for.

To be concrete:

Let's design this around constructors, as a genuine third phase of constructors:

Here an object creation expression of B(21, 10) will execute as:

This works, but it's highly tailored to the specific problem, and it's not obvious that it generalizes to many other cases. It definitely doesn't work with asynchronous initialization. Nothing using constructors can be asynchronous. (We could introduce asynchronous constructors, but we won't. Just use a static function.)
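For the asynchronous case, the "static function" suggestion might look like the sketch below. The Config class, its load function, and the 'demo' value are all hypothetical stand-ins for whatever asynchronous work a real class would need:

```dart
class Config {
  final String name;

  Config._(this.name); // private: callers have to go through load()

  /// Does the asynchronous work first, then constructs synchronously.
  static Future<Config> load() async {
    // Stand-in for real asynchronous work such as I/O.
    final name = await Future<String>.value('demo');
    return Config._(name);
  }
}

Future<void> main() async {
  final config = await Config.load();
  print(config.name); // demo
}
```

The constructor itself stays synchronous and only ever sees values that are already available, so the field can remain final.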

All this because of a strong dependency between superclass and a subclass. Something that could perhaps be avoided by restructuring the code, or just by requiring the user to call init() after creating the object. It won't be automatic, but it can work.

You can't do everything in the world inside a constructor. Its job is to initialize the object, synchronously, to a point where the object can safely be provided to others. If the constructor cannot get to that point, because it needs to do more things after the object has been fully created, then exposing a generative constructor might just not be the correct model.

/// Classes mixing in this mixin must have `init` called after object creation.
mixin MixinA {
  late int i;
  void init() {
    i = getValue();
  }
  int getValue();
}

abstract base class ABase with MixinA {
  ABase();
  @override
  int getValue() {
    return 10;
  }
}
final class A extends ABase {
  A() {
    init();
  }
}

abstract base class BBase extends ABase {
  BBase();
  @override
  int getValue() {
    return 22 * super.getValue();
  }
}
final class B extends BBase {
  B() {
    init();
  }
}

This design ensures that nobody can create an instance of A or B without having init called on it. They can extend ABase or BBase, and then it's up to the subclass to remember to have init called. This is API design, not class design. It includes class design, but doesn't try to put all initialization into the generative constructor, instead it provides a factory constructor as part of the API.

(In general, I discourage extending concrete classes, precisely because it makes initialization ambiguous. A concrete class needs its initialization to create a fully initialized object, but a base class should be ready for subclass initialization to do something afterwards. It's not always reasonable for a class to be prepared for both, and a concrete class may want to only expose factory constructors precisely because it wants to do something after object creation, something that an abstract class doesn't need. There are exceptions, mainly data classes that are so primitive that the concrete class will never do anything extra. But for complicated classes, I'd say including widgets, combining a base class and a concrete class into one is setting yourself up for trouble.)

You may not like this (I don't particularly like it), but given the constraints, it's well within the acceptable design space. I don't see us adding a language feature just to avoid this kind of complication, if the problem to solve is strong temporal dependencies between superclasses and subclasses that go counter to the execution order of constructor bodies. That just suggests not using constructor bodies for the job.

escamoteur commented 7 months ago

OK, I'm not sure this really is such a rare situation. My goal was mainly to guarantee that the init function gets called at the right moment and can't be forgotten.

rubenferreira97 commented 7 months ago

Would a more general concept like https://github.com/dart-lang/language/issues/2831 help you in this example?

escamoteur commented 7 months ago

Not sure, but I'm thinking: what about a new annotation like @mustBeCalledByDerivedClasses or something like that? Then the analyzer could warn users not to forget such a call.

escamoteur commented 6 months ago

Just saw that there is indeed a @mustBeOverridden annotation.