Primary constructor on classes

dart-lang / language

Design of the Dart language

Primary constructor on classes #2364

Open leafpetersen opened 2 years ago

leafpetersen commented 2 years ago

[Update: There is considerable interest on the team in adding primary constructors as a general feature. This original discussion issue has been repurposed as the tracking issue for the general feature request.]

Introduction

Primary constructors are a feature that lets a class specify one constructor and a set of instance variables in a single, concise declaration. Consider this Point class defined using the current class syntax for the constructor and fields:

class Point {
  int x;
  int y;
  Point(this.x, this.y);
}

With the primary constructor feature, this class can be defined with this much shorter syntax:

class Point(int x, int y);

Discussion

[original issue content below]

In the proposal for structs and extension structs, I propose to add primary constructors to structs. Briefly, the class name (or the type parameter list if any) may/must be followed by a parenthesized list of variable declarations as such:

  struct MyStruct(int x, int y) {
    // members here
  }

In the struct proposal, these are always final by default, and are restricted in various ways (i.e. they may not be late, they may not be const). They are allowed to declare initializers, which are used to generate default initialization values.

This issue is to discuss the possibility of splitting this out, and making it a general feature for classes as well.

Initial points in favor of this include:

It would be consistent, and nice to have this available for classes
It makes structs less different from classes

Initial points against include:

Resolving the tension between the desire to have final by default for structs, and the existing mutable by default behavior in classes.
Classes support richer superclass structure that may make specifying how the generated constructor works more complicated
Dealing with const constructors here seems more complicated than in the data class case.

leafpetersen commented 2 years ago

cc @mit-mit @lrhn @eernstg @chloestefantsova @johnniwinther @munificent @stereotype441 @natebosch @jakemac53 @rakudrama @srujzs @sigmundch @rileyporter @mraleph

lrhn commented 2 years ago

I'd say "yes".

If it works, it feels odd to not allow it. I can't see a reason it shouldn't work. It's a little weird that it only works with final instance variables, but I think that's acceptable.

But, that depends a lot on what model/capabilities we end up with for the fields.

The current model uses the primary constructor to declare fields, and only secondarily as a template for a default constructor. That is, it's not necessarily a "constructor". You can also write other constructors, and initialize fields directly in those.

We could consider a different model where the "primary constructor" directly defines the unnamed constructor. That means changing the (...) to a parameter list instead of a list of field declarations, and then deriving the field declarations from the parameters, instead of the other way around. (And then you can perhaps even extend another non-abstract struct, and forward parameters using super.foo syntax.) Otherwise people will need to write a constructor anyway, when they want it to have named parameters. I think that'd be a lost opportunity for a shorthand syntax for classes, where you don't need to write a constructor directly.

With that, I'd also say that any other generative constructor must be redirecting (eventually) to the primary constructor. Since the primary constructor initializes all fields. (Or, we can allow adding a name, class Foo._(args) {}, to make the primary constructor be a named, possibly private, constructor.)

If we do allow the syntax for classes too, I'd also have those classes get the default == and hashCode implementations (if they don't inherit or declare an implementation of either other than the one from Object). That makes sense because the primary fields are known. (Not sure that works if you extend another kind of class, though, because you can't just delegate to super.==).

A class declaration with a primary constructor can add further instance variables, but they must be self-initializing (nullable or having an initializer). Those other variables would also not be part of the == or hashCode implementation. (If you need that, you must declare the ==/hashCode yourself.)
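
As a rough illustration in today's Dart, the equality rule described above might expand as follows. This is only a sketch: the `Point` class, the `label` field, the `is Point` check, and the use of `Object.hash` are assumptions made for the example, not part of any proposal text.

```dart
// Hypothetical input (proposed syntax, not valid Dart today):
//   class Point(final int x, final int y) { String? label; }
//
// One possible expansion, written in current Dart.
class Point {
  final int x;
  final int y;

  // An additional, self-initializing (nullable) field. Under the suggestion
  // above it would not take part in == or hashCode.
  String? label;

  Point(this.x, this.y);

  // Equality is derived from the primary fields only.
  @override
  bool operator ==(Object other) =>
      other is Point && other.x == x && other.y == y;

  @override
  int get hashCode => Object.hash(x, y);
}

void main() {
  final a = Point(1, 2)..label = 'a';
  final b = Point(1, 2)..label = 'b';
  print(a == b); // true: label is ignored
  print(a.hashCode == b.hashCode); // true
}
```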

eernstg commented 2 years ago

Yes, please! ;-)

We could use the following rule: The primary constructor of a struct declares non-late instance variables that are final by default. The primary constructor of a class declares non-late instance variables that are non-final by default, but may have the keyword final.

Classes can still declare additional instance variables, of any kind, as before. Structs might be able to do this as well—what's the harm in allowing a struct to have final int now = DateTime.now().millisecondsSinceEpoch;?

The primary constructor syntax gives rise to a constructor declaration and a set of instance variable declarations. Every constructor is checked statically relative to the result of this desugaring step.

In other words, there is no reason to require that all constructors are redirected to the generated primary constructor, they just need to satisfy the normal constraints that we have today (e.g., that a non-late final instance variable must be initialized before any code with access to this runs).
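
A minimal sketch of that model, assuming a hypothetical `Interval` class: the desugared class gets one constructor from the primary constructor, and a second generative constructor simply initializes the final fields itself without redirecting.

```dart
// Hypothetical input (proposed syntax, not valid Dart today):
//   class Interval(final int start, final int end) {
//     Interval.empty() : start = 0, end = 0;
//   }
//
// Possible result of the desugaring step, in current Dart.
class Interval {
  final int start;
  final int end;

  // Constructor derived from the primary constructor.
  Interval(this.start, this.end);

  // An ordinary generative constructor. It does not redirect; it only has to
  // satisfy the usual rule that every final field is initialized.
  Interval.empty()
      : start = 0,
        end = 0;

  int get length => end - start;
}

void main() {
  print(Interval(2, 7).length); // 5
  print(Interval.empty().length); // 0
}
```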

About this:

changing the (...) to a parameter list instead of a list of field declarations, and then deriving the field declarations from the parameters, instead of the other way around

I think that's a very interesting idea to explore.

leafpetersen commented 2 years ago

With that, I'd also say that any other generative constructor must be redirecting (eventually) to the primary constructor. Since the primary constructor initializes all fields.

Not sure I follow this. I was proposing to allow other generative constructors, they just must also initialize all of the fields as usual.

(Or, we can allow adding a name, class Foo._(args) {}, to make the primary constructor be a named, possibly private, constructor.)

On reflection, I was thinking of modifying the proposal to say that if there is a "primary constructor", then there is always a Foo._ constructor generated, and if there is no explicit default constructor, then one is generated that redirects to ._.
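
Spelled out in current Dart, that variant could look roughly like the sketch below. The class `Foo`, its field, and the `Foo.doubled` constructor are hypothetical names chosen for illustration.

```dart
// Hypothetical input (proposed syntax, not valid Dart today):
//   class Foo(final int x) {
//     Foo.doubled(int x) : this._(x * 2);
//   }
class Foo {
  final int x;

  // Always generated from the primary constructor in this variant.
  Foo._(this.x);

  // Generated only because the class declares no unnamed constructor itself;
  // it redirects to the private one.
  Foo(int x) : this._(x);

  // A user-written constructor can redirect to the generated Foo._ as well.
  Foo.doubled(int x) : this._(x * 2);
}

void main() {
  print(Foo(3).x); // 3
  print(Foo.doubled(3).x); // 6
}
```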

In other words, there is no reason to require that all constructors are redirected to the generated primary constructor, they just need to satisfy the normal constraints that we have today (e.g., that a non-late final instance variable must be initialized before any code with access to this runs).

This was the model I had in mind.

mraleph commented 2 years ago

I totally agree that this syntax should be applicable to classes as well. You can make an argument that it is good enough if it only supports simple cases, e.g.

class X (var x, int y, final String z, {super.key}) extends Y {
  final int w = x + y;
}

// equivalent to 

class X extends Y {
  var x;
  int y;
  final String z;
  final int w;
  X(this.x, this.y, this.z, {super.key}) : w = x + y;
}

and for anything else people can resort to traditional constructor syntax.

You can probably make const work as well if you say that a class with a primary constructor of the form (final f, ..., final z) (all fields final) automatically gets a const constructor.

We can maybe even make something like:

class X (var x, int y, final String z, {key}) extends Y(x, key: key) {
  final int w = x + y;
}

work.

The weakest point of this is going from the shorthand syntax to the long syntax once you realise that you need a constructor body, but maybe this rarely happens.

munificent commented 2 years ago

I 1000% want any primary constructor syntax to be generalized to classes. In fact, I personally care more about that than I care about the entire views proposal. :) Most user-types do not have value semantics (==, hashCode) but do have simple enough constructors that they could use this syntax.

To validate that, I scraped a big corpus of code from itsallwidgets.com, open source Flutter apps, and pub packages (18+MLOC) to determine which classes with at least one generative constructor could not use a proposed primary constructor syntax. The reasons a class might not be able to use a primary constructor sugar that I considered are:

"Multiple generative ctors": There are multiple generative constructors. In practice, many of these classes could still probably use a primary constructor for one of them, but these were rare enough that I didn't bother trying to distinguish them. So consider the results below a slight undercount.
"Non-empty body": The constructor body isn't empty.
"Non-forwarded superclass ctor param": The constructor has a super() constructor initializer that passes an argument that isn't simply a forward from a constructor argument. (In other words, there's a superclass constructor argument that couldn't use super. instead.)
"Non-forwarded field initializer": The constructor has a field initializer that isn't simply a forward of a constructor parameter. (In other words, there's a constructor initializer that couldn't use this. instead.)

The results are:

-- Could use primary (109448 total) --
  82826 ( 75.676%): Yes
   9254 (  8.455%): No: Non-forwarded superclass ctor param
   7172 (  6.553%): No: Non-empty body
   4568 (  4.174%): No: Multiple generative ctors
   3859 (  3.526%): No: Non-forwarded field initializer
    935 (  0.854%): No: Non-empty body, Non-forwarded field initializer
    442 (  0.404%): No: Non-empty body, Non-forwarded superclass ctor param
    338 (  0.309%): No: Non-forwarded field initializer, Non-forwarded superclass ctor param
     54 (  0.049%): No: Non-empty body, Non-forwarded field initializer, Non-forwarded superclass ctor param

So a little more than 3/4 of all existing class declarations could use something close to the proposed primary constructor syntax. Note that I'm assuming here that a primary constructor syntax would support users controlling which parameters are positional, named, and/or optional and would allow private names. (In other words, I did not treat those as failures.)
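
The arithmetic behind the "a little more than 3/4" figure can be re-derived from the table. The snippet below copies the nine counts verbatim (in table order, with "Yes" first) and recomputes the total and the "Yes" share; the names `counts`, `total`, and `yesShare` are made up for this check.

```dart
// Counts copied from the table above, in the same order ("Yes" first).
const counts = <int>[82826, 9254, 7172, 4568, 3859, 935, 442, 338, 54];

// Sum of all categories; should match the "109448 total" header.
int total() => counts.fold<int>(0, (sum, n) => sum + n);

// Percentage of classes that could use a primary constructor.
double yesShare() => 100 * counts.first / total();

void main() {
  print(total()); // 109448
  print(yesShare().toStringAsFixed(3)); // 75.676
}
```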

I think the biggest design challenges are:

Whether fields should default to final or not. It's pretty obviously the right default for struct, but I think would be surprising for class. (Ideally, we would have always defaulted to immutable for fields and parameters, but that ship has sailed.)
How to make the syntax readable when there are many fields, doc comments, extends clauses, implements, with, etc. It can get pretty hairy to pack all of that into the header of a class.

A while back, I worked on a primary constructor strawman syntax that looked like:

class Rect new (
  final int x,
  final int y,
  final int width,
  final int height,
);

So instead of a parameter list right after the class name, there is a new keyword first. Having a keyword there allows a few things:

You can put it after the other header clauses. Since the field list is likely longer than the extends clause, type parameters, etc. I think it looks best last right before the class body, as in:

class ArgumentSublist extends Rule<Expression> implements FormatSpan new (
  /// The full argument list from the AST.
  final List<Expression> _allArguments,

  /// The positional arguments, in order.
  final List<Expression> _positional,

  /// The named arguments, in order.
  final List<Expression> _named,
) {
  /// The number of leading block arguments, excluding functions.
  ///
  /// If all arguments are blocks, this counts them.
  final int _leadingBlocks;

  /// The number of trailing block arguments.
  ///
  /// If all arguments are blocks, this is zero.
  final int _trailingBlocks;

  void visit(SourceVisitor visitor) { ... }
}

Compare that to what you'd get using the current proposal:

class ArgumentSublist(
  /// The full argument list from the AST.
  final List<Expression> _allArguments,

  /// The positional arguments, in order.
  final List<Expression> _positional,

  /// The named arguments, in order.
  final List<Expression> _named,
) extends Rule<Expression> implements FormatSpan {
  /// The number of leading block arguments, excluding functions.
  ///
  /// If all arguments are blocks, this counts them.
  final int _leadingBlocks;

  /// The number of trailing block arguments.
  ///
  /// If all arguments are blocks, this is zero.
  final int _trailingBlocks;

  void visit(SourceVisitor visitor) { ... }
}

Note how the extends and implements clauses are buried in the middle.

You can use different keywords. In my strawman, you could use const instead of new to make the primary constructor a const constructor. We could also allow you to use final to default to making all fields final. So the first example becomes:

class Rect final (
  int x,
  int y,
  int width,
  int height,
);

We could then do the same thing for struct which would allow you to define value types with mutable fields. (Which are, admittedly, dubious, but a thing users do in practice.)

In other words, this means the only thing writing struct instead of class does is give you default implementations of ==, hashCode, etc.

You can provide a constructor name. Having a keyword before the parameter list instead of the class name also provides a natural place to insert a constructor name if you want the primary constructor to be named:

class NestingLevel extends FastHash new.empty(
  /// The nesting level surrounding this one, or `null` if this represents
  /// top level code in a block.
  final NestingLevel? parent,

  /// The number of characters that this nesting level is indented relative to
  /// the containing level.
  ///
  /// Normally, this is [Indent.expression], but cascades use [Indent.cascade].
  final int indent,
) {
  /// The total number of characters of indentation from this level and all of
  /// its parents, after determining which nesting levels are actually used.
  ///
  /// This is only valid during line splitting.
  int get totalUsedIndent => _totalUsedIndent!;
  int? _totalUsedIndent;
}

The downside, of course, is that this is a bit more verbose and a little different coming from other languages whose primary constructor is right after the class name. In cases where there isn't much else in the type header, there are few fields, and they aren't documented, I think the classic primary constructor syntax looks better. But once the type scales up (and in particular, once you document your fields, which I think is generally a good idea), it gets kind of hard to read.

leafpetersen commented 2 years ago

Compare that to what you'd get using the current proposal:

This is assuming no changes to documentation conventions, which I think is not realistic. From a brief look at some kotlin code, the equivalent might look more like:

/// @param _allArguments The full argument list from the AST.
/// @param _positional The positional arguments, in order.
/// @param _named The named arguments, in order.
class ArgumentSublist(
  final List<Expression> _allArguments,
  final List<Expression> _positional,
  final List<Expression> _named,
) extends Rule<Expression> implements FormatSpan {
  /// The number of leading block arguments, excluding functions.
  ///
  /// If all arguments are blocks, this counts them.
  final int _leadingBlocks;

  /// The number of trailing block arguments.
  ///
  /// If all arguments are blocks, this is zero.
  final int _trailingBlocks;

  void visit(SourceVisitor visitor) { ... }
}

Which looks fine to me (nit, I don't understand how the extra fields work in this class, since they're not initialized in the constructor?)

One way of looking at this is that intuitively, we write Foo<X, Y> for generic classes, and the intuition is basically that the type parameters are "parameters" to the class. And at the invocation site, you write them in the same place: Foo<int, int>(...arguments). The same intuition seems to me to carry over naturally: the constructor parameters are parameters to the class, and in an invocation, you put the arguments immediately after the generic parameters (or the class name if none). So using the "parameter" syntax immediately after the classname/generics seems very intuitive to me.

You can use different keywords. In my strawman, you could use const instead of new to make the primary constructor a const constructor.

Is there any reason not to say that every primary constructor is a const constructor (at least if the superclass has a const constructor)?

We could also allow you to use final to default to making all fields final. So the first example becomes:
class Rect final (
  int x,
  int y,
  int width,
  int height,
);
We could then do the same thing for struct which would allow you to define value types with mutable fields. (Which are, admittedly, dubious, but a thing users do in practice.)

We could. It looks pretty weird to me though.

You can provide a constructor name.

This really feels a bit over-generalized to me. If you want a named constructor, just write the constructor.

The downside, of course, is that this is a bit more verbose

This is really the rub. My sense is that the more we generalize this, the more we lose the actual benefits. Your data scraping suggests that a huge majority of classes don't need the generality. So every bit of generality that we add that makes that majority more verbose has a massive incremental cost in aggregate, and only benefits a few niche cases.

lrhn commented 2 years ago

Is there any reason not to say that every primary constructor is a const constructor (at least if the superclass has a const constructor)?

We'd need to give you a way to opt out of being a const constructor if you don't want it. You may not want it if you plan to add further, non-final, fields to the class in the future. That will be a breaking change if the constructor is implicitly made const without you asking for it. In general, locking people into a constraint by default is dangerous. Even more to people who don't know about it. Those who do can usually come up with a workaround.

... every bit of generality that we add that makes that majority more verbose has a massive incremental cost in aggregate, and only benefits a few niche cases.

That's a very good point. The only counter-point is that every feature we make default and automatic causes an extra step if you ever need to migrate away from the shorthand. If we make a primary constructor implicitly const, you need to remember to write const when you migrate off using primary constructors. (That's probably the smallest such issue, so not really an argument for not making it default to const. Not having an opt-out other than migrating away from the primary constructor is a bigger issue to me).

mraleph commented 2 years ago

We'd need to give you a way to opt out of being a const constructor if you don't want it.

FWIW I think that the majority of Dart developers don't concern themselves with such matters because they are not writing reusable code.

So I think we should not optimise defaults towards the minority that does.

leafpetersen commented 2 years ago

@lrhn

We'd need to give you a way to opt out of being a const constructor if you don't want it. You may not want it if you plan to add further, non-final, fields to the class in the future. That will be a breaking change if the constructor is implicitly made const without you asking for it. In general, locking people into a constraint by default is dangerous. Even more to people who don't know about it. Those who do can usually come up with a workaround.

To be slightly provocative, maybe the answer is to say "if you don't want it to be const, don't use a primary constructor". As @mraleph says, I think there is a lot of value for a feature like this that you don't have to use in optimizing strongly for the common case.

To be slightly less provocative, we could at least say that getting an implicit const constructor is part of the deal with structs/data classes. That is, if you say data class, you are opting in to implicitly final fields and implicit const constructor.

Not having an opt-out other than migrating away from the primary constructor is a bigger issue to me

I hear this, but I also think that there is an inherent cliff here. If you want a constructor body, you have to migrate away. If you want to delegate, you have to migrate away. If you want to initialize some fields in the initializer list, you have to migrate away. So saying that if you want non-const you have to migrate away doesn't feel that bad to me.

munificent commented 2 years ago

This is assuming no changes to documentation conventions, which I think is not realistic.

That's a good point. Hoisting all the field docs alleviates many of my readability concerns.

One way of looking at this is that intuitively, we write Foo<X, Y> for generic classes, and the intuition is basically that the type parameters are "parameters" to the class. And at the invocation site, you write them in the same place: Foo<int, int>(...arguments). The same intuition seems to me to carry over naturally: the constructor parameters are parameters to the class, and in an invocation, you put the arguments immediately after the generic parameters (or the class name if none). So using the "parameter" syntax immediately after the classname/generics seems very intuitive to me.

Yeah, I agree it is 100% intuitive to have the parameters right there. I just think it looks funny when you end up having the extends/implements/with clauses jammed between the primary constructor and the class body. But... I'm convinced that it's the least bad approach.

Is there any reason not to say that every primary constructor is a const constructor (at least if the superclass has a const constructor)?

No, good point.

This is really the rub. My sense is that the more we generalize this, the more we lose the actual benefits. Your data scraping suggests that a huge majority of classes don't need the generality. So every bit of generality that we add that makes that majority more verbose has a massive incremental cost in aggregate, and only benefits a few niche cases.

Yes, I think I'm sold. I've poked around a bunch of Kotlin code and it does look weird to me to have the superclasses and superinterfaces wedged between the primary constructor and class body. But in practice, it seems like most classes with complex inheritance hierarchies don't use primary constructors. For those that do... it looks a little weird (and people seem to format them in a variety of creative ways), but not intolerable.

OK, so what I'd suggest then is:

A primary constructor is a parameter list that appears directly after the struct or class name. It can have positional, optional, named, and required parameters as the user wants.
Each parameter in the list (that isn't a super. parameter) becomes a field on the type initialized by that parameter. The field is implicitly final in a struct and final if the parameter is marked final in a class.
It can contain super. parameters which implicitly get forwarded to the superclass constructor the way they do in a normal constructor declaration.
The primary constructor is implicitly const.
The class may define other constructors (generative, redirecting, or factory) as long as those constructors meet all of the normal obligations of initializing final fields, etc.
The class may also define other fields as long as it doesn't cause problems that the primary constructor doesn't initialize them: they are either initialized at their declaration, late, or nullable.
A class can omit its {} body and use ; instead if empty.
It's probably reasonable to do what @lrhn suggests and allow a constructor name before the parameter list too:
```
class Foo.name(int x);
```
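
To make the list above concrete, here is one way a small class might expand under those rules, written in current Dart. `Widget`, `Label`, and their fields are hypothetical, and the `const` on the generated constructor follows the "implicitly const" item as stated at this point in the thread.

```dart
class Widget {
  final String? key;
  const Widget({this.key});
}

// Hypothetical input (proposed syntax, not valid Dart today):
//   class Label(final String text, final int size, {super.key})
//       extends Widget {
//     final bool visible = true;
//   }
class Label extends Widget {
  // One field per non-super parameter, final because the parameter is final.
  final String text;
  final int size;

  // An extra field is fine because it is initialized at its declaration.
  final bool visible = true;

  // super.key is forwarded to the superclass constructor as usual.
  const Label(this.text, this.size, {super.key});
}

void main() {
  const label = Label('hi', 12, key: 'k');
  print('${label.text} ${label.size} ${label.key} ${label.visible}');
  // hi 12 k true
}
```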

There's the weird wrinkle around private named fields as named parameters in the primary constructor. I think I'd be OK with saying that you just can't do that.

lrhn commented 2 years ago

What @munificent says.

Parameter list occurs after class name — and after type parameters if any.

I'm actually, uncharacteristically, fine with allowing the field names in the parameter list to be private, and automatically make them public in the implicitly added constructor. It's reasonable to want private fields, and unreasonable to have private parameter names. Something needs to be tweaked. (I'd even be willing to contemplate making the name of the parameter of the common this._foo be foo, but that's potentially breaking if it's referenced as _foo later in the parameter list.)
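
For comparison, this is roughly what has to be written by hand when a private field is initialized from a named parameter, assuming a Dart version in which named parameters cannot have private names. The `Account` class is a hypothetical example, and the mapping from `_balance` to `balance` is the tweak being discussed, not settled behavior.

```dart
// Hypothetical input (proposed syntax, not valid Dart today):
//   class Account({required final int _balance});
class Account {
  final int _balance;

  // The parameter gets the public name; the initializer list copies it
  // into the private field.
  Account({required int balance}) : _balance = balance;

  int get balance => _balance;
}

void main() {
  print(Account(balance: 10).balance); // 10
}
```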

I'm now OK with making the constructor const if possible (superclass constructor is const, any further fields in a class declaration are non-late and final - and therefore necessarily initialized with a constant.)

Biggest issue: Do we need a way to specify a super-constructor other than the unnamed one?

If we allow Foo.name(int x) for a primary constructor, we'd also want to be able to call that from a subclass primary constructor. Maybe we can heuristically say that the primary constructor calls the superclass constructor:

which is a primary constructor, if the superclass has one
otherwise the one with the same name (empty/new name if unnamed), if it exists,
otherwise use the unnamed constructor, if it exists.

Most people will just use the unnamed primary constructor for everything, and that'll just work.

On second thought, there is one problem with implicitly inferring const for the constructor. (Other than getting people locked into it without them knowing it.) If the primary constructor is const when:

The superclass constructor is const, and
Default values of primary constructor parameters are const, and
Any fields added to the class support being const (not late, final, and nullable or with an initializer which is constant, even though its context isn't constant).

then whether the constructor actually is const will depend on very fragile and accidental choices.

Adding a field like final int x = 42; will preserve const-ness. Changing it to static final int _defaultX = 42; final int x = _defaultX; will make the constructor non-constant. (If _defaultX had been constant, it would work.) There is no warning, unless you check that the constructor can be used as const, which you might not care about (since you just broke it without noticing).
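
The fragility is visible with ordinary classes today. In this sketch (class names `Foo` and `Bar` are illustrative), the only difference is how `x` is initialized, yet only `Foo` can keep its const constructor.

```dart
class Foo {
  // Constant initializer: the class can still have a const constructor.
  final int x = 42;
  final int y;
  const Foo(this.y);
}

class Bar {
  // A static final is not a constant expression.
  static final int _defaultX = 42;
  final int x = _defaultX;
  final int y;

  // Marking this constructor const would be a compile-time error, so under
  // inference it would silently become non-const instead.
  Bar(this.y);
}

void main() {
  // Const instances are canonicalized; non-const ones are distinct objects.
  print(identical(const Foo(1), const Foo(1))); // true
  print(identical(Bar(1), Bar(1))); // false
}
```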

I think that's generally going to be too fragile. I'd recommend you having to write const to get a const constructor, say:

const class Foo(int x, int y);

That's an explicit opt-in to the primary constructor being constant. It makes it easy to give errors if some other part of the class doesn't support being const, rather than just silently not being constant. If you don't think about const-ness, you won't accidentally promise that the class can be used for constants.

Yes, it's one more word, and it'll likely be used a lot, but as long as we don't have const-by-default everywhere else, and an opt-out word for non-const-ness, I think we have to stick to the rule that const is not implicit, because it's a big promise you make in your API.

leafpetersen commented 2 years ago

I think that's generally going to be too fragile. I'd recommend you having to write const to get a const constructor, say:
const class Foo(int x, int y);
That's an explicit opt-in to the primary constructor being constant. It makes it easy to give errors if some other part of the class doesn't support being const, rather than just silently not being constant. If you don't think about const-ness, you won't accidentally promise that the class can be used for constants.

Yeah, I think I agree that it's too fragile, and I'd be fine with this choice (to put const before the class). For structs/data classes (if we do them) perhaps it would be reasonable to say that data class means both immutable and const though?

lrhn commented 2 years ago

Data classes/structs can be constant if their superclass is constant (and the superclass must be an abstract struct or the Object (or Struct, if we have it) class, so that should hold inductively), and if their primary constructor initializer/default-value expressions are constant (or at least potentially constant).

The current proposal allows non-potentially-constant initializer expressions. That means that

struct Foo({List<int> indices = [0]});

cannot have a constant constructor.

Again it becomes fragile to infer const for the constructor, because a slip of the hand, like writing = [] instead of = const[], will turn the struct from constant to non-constant without any real warning.

If initializer expressions have to be constant for structs, then we make all structs const, but I think that'll be too restrictive. There will be uses for structs that have mutable default values.

I mentioned earlier that we could have separate syntaxes for constant default values and non-constant initializers, say:

struct Foo({int x = 0, List<int> l ??= <int>[0]});

That would allow a struct with only = default values to be implicitly const, and using ??= being the way to opt out. You still don't have a way to opt out of providing a const constructor other than introducing a ??= initializer.
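
In current Dart, the effect of a non-constant `??=` initializer is usually written with a nullable parameter and `??` in the initializer list. The sketch below assumes that reading of `??=` (including that an explicit `null` argument also triggers the default), which the thread does not spell out.

```dart
// Hypothetical input (proposed syntax, not valid Dart today):
//   struct Foo({int x = 0, List<int> l ??= <int>[0]});
class Foo {
  final int x;
  final List<int> l;

  // `= 0` is a constant default value; the list default is evaluated per
  // call, so every instance gets its own fresh, mutable list.
  Foo({this.x = 0, List<int>? l}) : l = l ?? <int>[0];
}

void main() {
  final a = Foo();
  final b = Foo();
  a.l.add(1);
  print(a.l); // [0, 1]
  print(b.l); // [0]
}
```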

I'd still prefer to go with const struct Foo(int x, int y) to make the primary constructor const, rather than making it implicit.

eernstg commented 1 year ago

Cf. https://github.com/dart-lang/language/pull/3023, a concrete proposal about this feature.

Hixie commented 1 year ago

I'm very skeptical about the value of this feature. It adds a redundant syntax and no expressiveness. My intuition suggests that is not going to help with readability. I could be wrong though.

I think if we want to go down this path we should start with usability testing to ensure that it is a clear win for readability and maintainability, even in the kind of cases where the syntax is mixed with other syntaxes. Can people accurately determine that some syntaxes are identical in meaning? Can they accurately determine how to extend a particular case to add a new getter or method? Do they understand how this syntax works with superclasses, interfaces, or mixins?

eernstg commented 1 year ago

@Hixie wrote:

.. skeptical about the value of this feature. It adds a redundant syntax and no expressiveness.

That's true, it is simply an abbreviated way to write something which is already expressible. Note that super-parameters have exactly the same nature, and so does => function syntax, and this.x as a constructor parameter, and more.

In any case, input from usability testing would be very useful!

Can they accurately determine how to extend a particular case to add a new getter or method? Do they understand how this syntax works with superclasses, interfaces, or mixins?

Great questions! They should be part of the usability testing.

I'll comment on them here as well. There is not much of a connection to getter and method declarations, they are not affected by the primary constructor proposal. But the interaction with all kinds of superinterfaces, and with any other Dart feature, follows from the meaning of the primary constructor.

One way to understand what a primary constructor means is by desugaring: Replace the primary constructor syntax (something like const D.name(...)) by the class name (just D), and use it to create a new constructor in the body (const D.name(...);). For each parameter which isn't this. or super., declare an instance variable with the same name x and the given type, and change the parameter to this.x. Modifiers final and covariant are used when declaring that instance variable and removed from the constructor parameter.

Everything else follows. For instance, if the desugaring introduced an instance variable, say, final int x;, then this declaration can override a getter int get x => 5; in the superclass, or implement int get x from a class that it implements. Nothing new, nothing surprising, we just need to know that the given primary constructor introduces that instance variable.

// This:
class D._(int x, {final int y = 0}) extends A with M implements B, C;

// .. means this:
class D extends A with M implements B, C {
  int x;
  final int y;
  D._(this.x, {this.y = 0});
}
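To make the "everything else follows" point concrete, here is a minimal sketch in today's Dart (no primary constructors involved). The names `A`, `B` and `D` are hypothetical and chosen only for illustration; the class `D` is written in the desugared form, so its instance variables override a superclass getter and implement an interface getter in the usual way.

```dart
class A {
  int get x => 5; // Getter declared in the superclass.
}

abstract class B {
  int get y; // Getter required by an implemented interface.
}

// Desugared form of a hypothetical `class D(final int x, final int y) extends A implements B;`.
class D extends A implements B {
  @override
  final int x; // The implicit getter of this field overrides `A.x`.
  @override
  final int y; // The implicit getter of this field implements `B.y`.
  D(this.x, this.y);
}

void main() {
  final d = D(1, 2);
  final A a = d;
  print(a.x); // 1 (the field in D wins over the getter in A)
  print(d.y); // 2
}
```

Nothing in this sketch depends on how the field came to be declared, which is the sense in which the interaction with superclasses and interfaces would be unchanged.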

I tend to think that it's a significant abbreviation, and not that hard to read. But that assessment would of course be tested in a crucial manner by some usability studies.

mit-mit commented 1 year ago

I'm very skeptical about the value of this feature. It adds a redundant syntax and no expressiveness.

@Hixie I can understand that, but the stance seems inconsistent with the immense demand for https://github.com/dart-lang/language/issues/314

Hixie commented 1 year ago

I think we should address problems like #314 using a more general mechanism like macros.

That's true, it is simply an abbreviated way to write something which is already expressible. Note that super-parameters have exactly the same nature, and so does => function syntax, and this.x as a constructor parameter, and more.

I think the main difference between those and this proposal is that the delta between the two forms is much less disruptive and more intuitive.

For example, you can change one argument from this.x to Foo x without affecting the others. You can change one super.foo named parameter to a non-super one without affecting the others. When you replace the body of a method, there's no other feature you might accidentally run into that changes the result. The code remains roughly in the same place in all these cases.

On the other hand, with this primary constructor syntax any number of things make a radical difference to the code:

adding a body to the constructor
adding an assert to the constructor
changing how one of the constructor arguments is handled (e.g. taking a string and parsing it into an int to set a field)
changing how the constructor arguments map to superclass constructor arguments
having the type of the argument not match the type of the field (e.g. argument is int, field is num)

While I believe we should use a feature like macros to address this use case, in the event that we don't, I think it might be possible to find a design that doesn't suffer from the problems I describe above. For example, we could add a syntax to existing constructors that declares the field for you:

class Foo {
  const Foo(int field.x, int field.y);
}

This would have the properties of this.foo and super.foo, like allowing mixing and matching:

class Foo {
  Foo(int field.x, int field.y, double r) : assert(r > 0), d = r * 2 {
    // ...
  }

  final double d;
}

eernstg commented 1 year ago

@Hixie wrote:

with this primary constructor syntax any number of things make a radical difference to the code:

adding a body to the constructor

So the issue here is that seemingly small changes to a given constructor would be syntactically small changes as long as that constructor is a body constructor, but they are suddenly much more disruptive when the constructor is a primary constructor.

I like the idea that @leafpetersen came up with, which is mentioned in the discussion part of the spec proposal.

In short, you can use all the parameter list features of a primary constructor in a body constructor by adding var as the very first token. It is an error to have two or more var constructors, and it is an error to have a var constructor if there is a primary constructor.

// Using a primary constructor.
class Foo(int x, int y);

// If we want to change the constructor beyond what we can do with a primary
// constructor, prepare by making it a `var` constructor:
class Foo {
  var Foo(int x, int y);
}

// Then we can add new elements like any other constructor.
class Foo {
  final int z;
  var Foo(int x, int y) : z = x + y, assert(x != y), super() {
    ... // We can have a body.
  }
}

So the idea is that the var constructor makes it less disruptive to perform all those changes. You could say that it is confusing that some instance variables are declared by that var constructor, but it would probably help a lot if we recommend a style where a var constructor (if any) occurs textually together with the instance variable declarations.

Another approach could be to have a quick fix in IDEs to change a class from using a primary constructor to a class which is equivalent, but uses a regular constructor and declares all variables in the normal way.

adding an assert to the constructor

This would be covered by var constructors and by the quick fix.

changing how one of the constructor arguments is handled (e.g. taking a string and parsing it into an int to set a field)

Sounds like this would be expressed using an element in the initializer list, which would be covered in the same way.
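For illustration, a minimal sketch in today's Dart of the case under discussion. The class name `Port` and its field are hypothetical. The parameter is a `String`, the field is an `int`, and the conversion lives in the initializer list, so the parameter no longer maps directly to a field.

```dart
class Port {
  final int value;

  // The parameter type (String) differs from the field type (int);
  // the initializer list does the parsing.
  Port(String text) : value = int.parse(text);
}

void main() {
  print(Port('8080').value); // 8080
}
```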

changing how the constructor arguments map to superclass constructor arguments
having the type of the argument not match the type of the field (e.g. argument is int, field is num)

OK, so the primary constructor argument has type int, and it is a super-parameter, and it is passed to an initializing formal of type num. That's covered by primary constructors already:

// Today:

class A {
  num x;
  A(this.x);
}

class B extends A {
  B(int super.x);
}

// With primary constructors:

class A(num x);
class B(int super.x) extends A;

While I believe we should use a feature like macros to address this use case

That might be possible, too. It would probably be somewhat more verbose.

@jakemac53, what do you think about doing primary constructors as a macro? How would it look?

jakemac53 commented 1 year ago

@jakemac53, what do you think about doing primary constructors as a macro? How would it look?

There is an example data class macro here, the usage looks like this. Basically I had you just define the fields normally and then generated a constructor to initialize them (it would suffer the same issues as described for primary constructors).

However you could also write a macro that worked based on a manually written constructor, where the parameters describe the fields that should be added, and we do allow you to fill in initializers and/or a constructor body, or even augment existing initializers and/or constructor bodies.

eernstg commented 1 year ago

Thanks, @jakemac53!

Wdestroier commented 1 year ago

Could it be possible to allow the IDE to convert from a primary constructor to a normal constructor? Such as CTRL + . >> Move constructor to class body.

Could it be possible to allow users to replace macro annotations with their desugared form? Like the "delombok" tool (https://projectlombok.org/features/Data & https://projectlombok.org/features/delombok).

Which macro annotations will exist for sure in the Dart SDK? @allArgsConstructor or @data? @json? @toString, @hashCode, @toStringAndHashCode? @with?, @namedConstructor or @positional?, @record, @immutable.

leafpetersen commented 1 year ago

I think if we want to go down this path we should start with usability testing to ensure that it is a clear win for readability and maintainability, even in the kind of cases where the syntax is mixed with other syntaxes. Can people accurately determine that some syntaxes are identical in meaning? Can they accurately determine how to extend a particular case to add a new getter or method? Do they understand how this syntax works with superclasses, interfaces, or mixins?

@inmatrix Thoughts on this?

InMatrix commented 1 year ago

It's a long thread and I'm afraid I'm not able to get up to speed right now. If you'd like to discuss the potential UX questions and which of them usability testing can and cannot answer, maybe we can have a meeting? It would be helpful if each side of the argument could include a few examples.

eernstg commented 1 year ago

A meeting about the usability related questions would be great! I've attached a bunch of examples of concrete syntax using this feature specification proposal, see examples-primary-constructor.txt.

The examples can be used to browse around and see how various combinations of features would look. The examples may serve as a raw material for arguments in any direction (e.g., that it is delightfully concise, or that it is impossible to read ;-).

To find examples involving specific features: Look at the very first examples in order to see some simple cases; search for TypeVariable in order to see examples where the class has type variables; search for implements in order to see examples where the class implements some other classes; search for [ to see examples where an optional positional parameter is used; search for Point._ in order to see examples where the primary constructor is named; etc.

munificent commented 1 year ago

I did some scraping to see how well @eernstg's primary constructor proposal (#3023) would work in practice. I looked at the 2,000 most recent pub packages (~12MLOC).

Of the 81,968 classes that declare at least one generative constructor:

-- Could have primary (81968 total) --
  63581 ( 77.568%): Yes  ===========================================
  18387 ( 22.432%): No   =============

So more than 3/4 of classes with generative constructors could benefit from this feature.

Looking at all classes:

-- Class (117300 total) --
  61317 ( 52.274%): Only generative constructor can be primary      =======
  35332 ( 30.121%): No generative constructors                      ====
  18387 ( 15.675%): None of generative constructors can be primary  ==
   2264 (  1.930%): One of generative constructors can be primary   =

So a little over half of classes declare a single generative constructor which could be a primary constructor if we had this feature. A very small number of classes have multiple constructors, one of which could be primary. So it's only marginally useful to support having both a primary constructor and other body constructors (but I still think we should support that).
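As a cross-check of how the two tables above relate, the following sketch recomputes the totals and percentages from the raw counts. The counts are copied from the tables; the constant and function names are hypothetical and exist only for this illustration.

```dart
// Counts copied from the two tables above.
const classesWithGenerative = 81968;
const onlyGenerativeCanBePrimary = 61317;
const oneOfGenerativeCanBePrimary = 2264;
const noneCanBePrimary = 18387;
const noGenerative = 35332;

/// Percentage of [part] in [total], with three decimals like the tables.
String percent(int part, int total) =>
    '${(100 * part / total).toStringAsFixed(3)}%';

void main() {
  // The "Yes" row of the first table is the sum of two rows of the second.
  final couldHavePrimary =
      onlyGenerativeCanBePrimary + oneOfGenerativeCanBePrimary;
  final allClasses = classesWithGenerative + noGenerative;

  print(couldHavePrimary); // 63581
  print(couldHavePrimary + noneCanBePrimary); // 81968
  print(allClasses); // 117300
  print(percent(couldHavePrimary, classesWithGenerative));
  print(percent(onlyGenerativeCanBePrimary, allClasses));
}
```

The last two lines print the "more than 3/4" and "a little over half" figures quoted in the text, each relative to its own total.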

The fact that we have super. parameters and would allow them in primary constructors significantly increases the usefulness of the feature:

-- Has super init (63883 total) --
  44592 ( 69.803%): No   =======================================
  19291 ( 30.197%): Yes  =================

Of the constructors that could be primary constructors in this analysis, about a third of them wouldn't be if we didn't have super..

The reasons why a particular constructor can't be primary:

-- Reason (23298 total) --
   3717 ( 15.954%): initializer, parameter
   3296 ( 14.147%): superclass argument
   3283 ( 14.091%): body
   2896 ( 12.430%): body, parameter
   1947 (  8.357%): named superclass constructor
   1412 (  6.061%): initializer
   1314 (  5.640%): parameter, superclass argument
    718 (  3.082%): assert initializer
    658 (  2.824%): body, named superclass constructor
    640 (  2.747%): body, initializer, parameter
    636 (  2.730%): parameter, redirecting constructor
    296 (  1.270%): initializer, parameter, superclass argument
    286 (  1.228%): parameter
    278 (  1.193%): body, initializer, named superclass constructor, parameter
    264 (  1.133%): named superclass constructor, superclass argument
more...

Basically all over the place. But most often it seems to be because there's a constructor parameter that doesn't directly map to a field or super initializer, or because there's a constructor initializer that isn't just assigned from a field.

Overall, this proposal looks like a slam dunk to me. Note that the numbers here are without any special handling for initializing private fields with named parameters (#2509). This is just following the proposal in its current state.

Hixie commented 1 year ago

Just because a feature could be applied to some code, or even a lot of code, doesn't mean it's a valuable feature. I don't think anyone is arguing this feature couldn't apply broadly. To summarize my concerns as described in earlier comments:

it is a syntax sugar that introduces a cliff where any slight deviation from the supported set requires a disproportionate amount of effort.
we have not demonstrated that this syntax would improve readability. It reduces the simplicity of the language by having multiple ways to do something.

I think my concern stems primarily from the way that we're stuffing the constructor into an unrelated part of the class declaration. The only win seems to be avoiding the {} braces and the duplication of the class name.

(FWIW, Kotlin has a very similar syntax which, in my somewhat limited experience playing with Kotlin, has exactly the same problems I'm worried about here: as soon as you try to do anything outside their limited supported set of features, you have to switch to completely different syntax to get it done. I find this concerning because it means developers have to learn two sets of syntaxes, plus understand the limitations of each, plus understand the magical implications of each (e.g. const implying the fields are final)...)

Instead of:

class Point(int x, int y);

...why not:

class Point {
  Point(int field.x, int field.y);
}

...where field, in a manner similar to super and this, means "declare a field for this parameter"?

class Point<TypeVariable extends Bound> extends A with M implements B, C {
  const Point(final int field.x, {required final int field.y});
}

class Point {
  Point(final int field.x, final int field.y) : assert(math.max(x, y) > math.min(x, y) * 2);
}

This syntax solves all the same problems except one duplication of the constructor name (which seems like a rather minimal issue -- that said, if we had virtual constructors then we'd want to have a way to declare constructors without using the class name anyway, probably), and doesn't have any of the problems of the proposed syntax, because it's strictly a superset of the current features.

jodinathan commented 1 year ago

@Hixie the problem with that syntax is that it is very hard to read.

I've had a lot of trouble with TS classes when I needed to find a property declaration and it was in the constructor.
A constructor is basically a function with arguments. My eyes just skip over constructors when looking for property declarations, all the time.

The proposal way is much better because you can easily differentiate full properties from sugar properties.

Hixie commented 1 year ago

I don't understand. The proposal here makes no distinction at all. What are the properties in this example?

class const Point<TypeVariable extends Bound>(this.x, {required int y})
    extends A with M implements B, C {
  final int x;
}

lrhn commented 1 year ago

The only win seems to be avoiding the {} braces and the duplication of the class name.

And the triplication of the field name.

I agree that this syntax is probably not a good match for complicated classes which need a mix of primary constructor and class body declared fields, and multiple constructors.

However, it may really shine for simple data classes, and the ability to quickly recognize a class as such could increase readability, by removing boilerplate.

class Point(final int x, final int y);
class ColorPoint(super.x, super.y, final Color color) extends Point;

That's probably also the optimal use-case, it won't get better than that.

Can the feature be used in a way that makes things harder to read? Sure. But it can also be used to make things easier to read.

So it requires judgement, but you can write unreadable code in any language.

Some people will want an automated lint that tells them whether to use a primary constructor or not for any given class, to enforce complete uniformity and remove users' need to make style judgements. That's their choice, and not one I'll personally want to be subject to.

eernstg commented 1 year ago

@Hixie, we do have an effect along these lines:

it is a syntax sugar that introduces a cliff where any slight deviation from the supported set requires a disproportionate amount of effort.

However, we could have an analyzer refactoring to transform a primary constructor declaration into the corresponding non-primary constructor declaration, and another one for the opposite direction. This would at least help IDE users when they need to perform this kind of transformation.

So how complex would these refactorings be? We could have a declaration like C(int i) : this.i = i; which needs to be rewritten as C(this.i); first, but that could be a different refactoring or a quick fix. The refactorings to and from primary constructors would presumably only need to recognize a very tightly determined set of syntactic forms, and that should make them rather straightforward to implement ('to-primary' would need to recognize a constructor with no body and no initializer list whose parameters start with this. or super.; 'from-primary' would just handle the primary constructor declaration itself, no exceptions). We could have a few corner cases, but we should be able to sort that out:

class C {
  int i = 0;
  C([this.i = 1]); // 'to-primary' refactoring would fail.
  C.named(): i = 2;
  C.otherName();
}

we have not demonstrated that this syntax would improve readability. It reduces the simplicity of the language by having multiple ways to do something.

True again. However, you could say that "use a primary constructor" and "use a normal (non-primary) constructor" is a small and fixed set of variants. As soon as you have both possibilities in mind, you can work on thousands of class declarations of any complexity, and the number of variants in each case will still be this fixed set of size 2. So it's "a problem with complexity O(1) for a program whose complexity is N." ;-)

In other words, I expect that each of us would get used to this fact very quickly, and then it won't be a problem any more, ever.

Compare this with the function declaration syntax: Block bodies and expression bodies (() {...} vs. () => e) provide two different ways to say the same thing as well, but we can live with that.

munificent commented 1 year ago

it is a syntax sugar that introduces a cliff where any slight deviation from the supported set requires a disproportionate amount of effort.

This has long been my biggest concern with the feature too. Automated tooling can definitely help, but it's still a worry.

However, the other way to look at this is that the reason the effort is so disproportionate is because the sugar is so effective when it does work. The code you have to write to turn a primary constructor into an explicit one and a set of field declarations is exactly the amount of code it's saving you when the feature does apply. That saving is significant.

we have not demonstrated that this syntax would improve readability.

To be fair, we never formally demonstrated that many of Dart's syntax changes improve readability. I can't point to any formal UX studies that show that cascades, control flow elements, this. parameters, ?. null-aware operators, etc. improve readability. We believe they do, and we observe after the fact that users use them, and the sentiment of the language is going up, so presumably we're on the right track. But I can't prove it.

We rely on the taste and judgement of the team, and the feedback we get from users to inform these choices. We also get a lot of information by observing what other languages do and seeing how users respond to it.

I do love getting actual usability research data when we can, too, but doing that is challenging, costly, and can be hard to slot into our schedule.

In this case, many other languages have primary constructors and I haven't heard many user reports that they regret or dislike that feature. And we know that primary constructors or some other way to make simple stateful classes easier to write has been a strong feature request.

It reduces the simplicity of the language by having multiple ways to do something.

Yes. :-/

In a perfect world, the language would have only a single, terse, orthogonal, composable way to solve every problem.

In practice, it's really hard to design syntax that scales both up and down equally well, and we are continually learning about how our users want to read and write code.

If Dart users never wrote functions with more than two or three arguments, we'd probably have never added named parameters. Named parameters are another more verbose, more complex, redundant way to pass values to functions. But they are a better way when the parameter list gets long or when you have many optional parameters. Positional parameters don't scale up well.

We could only do named parameters and require users to always write argument names. That would be simpler while still scaling up well. But it would be punishingly verbose to have to write list.add(element: thing) and someString.startsWith(needle: 'haystack'). Named arguments don't scale down well.

We compromise by sometimes having a few ways of doing things that are tailored to different use cases. We try to have as few as possible and make them cover as much space as we can but... language design is hard.

Flutter has a Container class for similar reasons. It's strictly redundant, but the brevity it provides justifies the extra complexity.

...why not:
class Point {
  Point(int field.x, int field.y);
}
...where field, in a manner similar to super and this, means "declare a field for this parameter"?

This syntax solves all the same problems except one duplication of the constructor name (which seems like a rather minimal issue

I'd have to dig around to find it, but I've proposed the exact same thing a couple of times. You're right that it avoids the cliff because you are writing a constructor. I do personally kind of like it. But I've never gotten anyone else enthusiastic about it.

But the duplication of the constructor name ends up not being minimal in the use cases where this feature really shines like the ones @lrhn points out:

class Point(final int x, final int y);
class ColorPoint(super.x, super.y, final Color color) extends Point;

// Versus:
class Point { Point(final int x, final int y); }
class ColorPoint extends Point { ColorPoint(super.x, super.y, final Color color); }

One way to think about it is that if we're deliberately evaluating a feature that helps when code scales down, it should do the best job when the code scales all the way down. Primary constructors do that: The simplest classes that use them are about as few tokens as you could possibly get while specifying the same information.

Something like what you propose here scales kinda down but then still ends up feeling a little redundant and verbose. But if it's going to be redundant and verbose anyway, you may as well just do the whole thing and declare the fields. As syntactic sugar, it never looks great, just "OK" for a potentially wider range of cases. If we're going to add some complexity and redundancy to the language, we may as well make the result as beautiful as we can get it.

It's annoying when you hit the cliff with a primary constructor and have to stop using it. But when you don't hit that cliff and can use it, it's really minimal.

And, pragmatically, we have an existence proof from other languages that it's a workable feature.

munificent commented 1 year ago

@lrhn asked an interesting follow-up question about my analysis for how often a potential primary constructor initializes all of the instance fields the class declares.

It might be more confusing to use a primary constructor with a class that also explicitly declares some instance fields.

The answer is:

-- Fields (63883 total) --
  38769 ( 60.688%): Initialized all fields   ======================
  12016 ( 18.809%): No fields to initialize  =======
   4312 (  6.750%): 2 / 3                    ===
   1638 (  2.564%): 0 / 1                    =
    764 (  1.196%): 0 / 2                    =
    679 (  1.063%): 1 / 2                    =
more...

So, yes, most of the time the primary constructor will initialize all of the fields.

(You may wonder why there are so many primary constructors that don't initialize anything. Usually those are calling a superclass constructor (usually super(key: key)), or wanting to give a name to the class's constructor.)

When a potential primary constructor does initialize all fields, it's most often just a single field, but may be more:

-- All initialized field count (38769 total) --
  16478 ( 42.503%): 1    ========================
   7332 ( 18.912%): 2    ===========
   5238 ( 13.511%): 3    ========
   2348 (  6.056%): 4    ====
   1633 (  4.212%): 5    ===
   1210 (  3.121%): 6    ==
    831 (  2.143%): 7    ==
    693 (  1.788%): 8    =
    514 (  1.326%): 9    =
more...

Hixie commented 1 year ago

In other words, I expect that each of us would get used to this fact very quickly, and then it won't be a problem any more, ever.

For any individual feature, this is true. But these features aren't used in isolation, and all the other features in the language add together. We have a finite amount of complexity we can introduce to Dart. Do we want to spend it on this?

Compare this with the function declaration syntax: Block bodies and expression bodies (() {...} vs. () => e) provide two different ways to say the same thing as well, but we can live with that

I wasn't around when Dart added =>, but I would have been equally skeptical. We are paying an ongoing cost for having that syntactic sugar. It's a mistake we can't undo. (I was around when we made switch have two different syntaxes based on whether it's a statement or an expression, and I think that's a mistake also, for the same reason, as I pointed out at the time.)

I don't think our bar should be "we can live with that". I think our bar should be "this will contribute towards why Dart is still the easiest language for people to use in 30 years". Every time we add redundant syntaxes, we walk further away from that.

To be fair, we never formally demonstrated that many of Dart's syntax changes improve readability.

I am painfully aware. :-) I think this is a huge mistake.

Flutter has a Container class for similar reasons. It's strictly redundant, but the brevity it provides justifies the extra complexity.

Container was a mistake, and if I were to start over, I would not include it. The complexity it has introduced has not in fact been justified, IMHO. Even people with years of Flutter experience get tripped up by Container (see e.g. the recent discussion in Google's internal Flutter hackers chat).

And, pragmatically, we have an existence proof from other languages that it's a workable feature

My reaction when seeing it in other languages (Kotlin, specifically) was "wow that's a mistake, it's a good thing Dart isn't making this mistake". (I even called this out explicitly in my 90-page document from 2020, search for "quickly worsens".)

Personally I think syntactic cliffs and redundancy are an order of magnitude worse a cost for a language to bear than verbosity.

FMorschel commented 1 year ago

I don't think our bar should be "we can live with that". I think our bar should be "this will contribute towards why Dart is still the easiest language for people to use in 30 years". Every time we add redundant syntaxes, we walk further away from that.

I disagree with that. Dart is not like Python, where the main focus is on being simple to write and read. Dart is actually focused on community feedback and suggestions, so that is not true in all cases.

leafpetersen commented 1 year ago

@Hixie

I wasn't around when Dart added =>, but I would have been equally skeptical. We are paying an ongoing cost for having that syntactic sugar. It's a mistake we can't undo.

As written, I can't tell you how strongly I disagree with this, both as a user and as a designer. This was not a mistake. This is possibly one of the most popular features in the language. Do an internal search for \)\ \=\> and count the hits (be sure and "find everything"). Scrape some external code. Arrow bodied functions are hugely popular with our actual users. Supporting a simple compact syntax for lambdas was not a mistake. If I proposed to remove these and replace every => e with {return e;} our users would be, shall we say, less than happy.

Now, there's a related, but more supportable position you could take, which is that there is (perhaps) a better design which doesn't require a different syntax. That is, perhaps we could have a syntax which scales perfectly up from => e (or the equivalent). For example, there's some other discussion in this issue tracker about adding expression blocks, and perhaps you could imagine a world in which all functions are => functions, and block bodied functions become uses of an expression block? Maybe - there's a lot to work out in there, but stepping back, that general design direction is definitely one that I support: if you can come up with a single design that scales perfectly then that is almost certainly superior to having two different ways to do the same thing. But... that's really hard to do, and it's pretty much impossible to do when you already have a design in place that doesn't scale down and you have to work with what you have.

Personally I think syntactic cliffs and redundancy are an order of magnitude worse a cost for a language to bear than verbosity.

I don't agree with this, and in general, Dart is not the language for you if this is really your design philosophy. :) There is no need to have getters and setters in Dart: they are redundant with getX and setX methods (just ask Java programmers). There is no need to have binary operators: we already have methods. There is no need to have index operators, again, methods work fine. There is no need to have free functions - methods work fine. There's no need to have lambdas: just declare a class, works for Java. Switch statements are redundant with if statements, and there's a huge cliff to switching over. Loops (for, while, do) are all redundant with (and strictly less general than) simply providing proper tail recursion. For-in loops are completely and utterly redundant (and a huge syntactic cliff). Async functions: nothing but sugar.
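As one concrete instance of the redundancy listed above, here is a minimal sketch in today's Dart showing a for-in loop next to an equivalent loop written directly against the iterator. The function names are hypothetical; `Iterable.iterator`, `moveNext()` and `current` are the standard `dart:core` members.

```dart
/// Sums the elements using a for-in loop.
int sumForIn(Iterable<int> xs) {
  var total = 0;
  for (final x in xs) {
    total += x;
  }
  return total;
}

/// Sums the elements with the iterator protocol spelled out by hand.
int sumIterator(Iterable<int> xs) {
  var total = 0;
  final it = xs.iterator;
  while (it.moveNext()) {
    total += it.current;
  }
  return total;
}

void main() {
  print(sumForIn([1, 2, 3])); // 6
  print(sumIterator([1, 2, 3])); // 6
}
```

Both functions compute the same result; the for-in version is the shorter, less general spelling of the same thing.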

There's a perfectly coherent (and beautiful!) design philosophy based around providing a single fully general mechanism for each thing, and nothing more. Lisp/Scheme are arguably one good example of this (just provide macros, so your users can write all of the things that you left out... :) ). Perhaps Smalltalk might be another? But for better or for worse, that is not the design philosophy of Dart, or of any currently widely used language that I'm aware of. Instead, we try to provide abstractions that scale perfectly, and when we can't, we see if we can provide another (less general) way of doing things that covers a large enough subset of the use cases beautifully to be worth doing. So we have getter/setters, which are great (but redundant). We have free functions, which are redundant with methods (but so much less boilerplate!). Etc, etc.

Now, to be clear! None of the above is an argument for primary constructors, or in general for any form of redundancy and syntactic cliffs. Redundancy, and syntactic cliffs, are each a serious cost, and we should most definitely weigh them (and they definitely weigh heavily on the team in our discussions). But it is an argument that the weighting that you give above ("an order of magnitude worse cost ... than verbosity") is not well aligned with the actual revealed preferences of our users. Our users do, in fact, quite value having brief syntax for common operations. So our job, here, is to weigh the benefits (how much brevity? how readable? how often does the sugar apply?) against the costs (how hard to refactor? how much more to learn? implementation complexity, etc).

Hixie commented 1 year ago

In general it's been my experience that usage of a feature is not a signal that the feature is a net benefit. Sometimes, the absence of a feature forces developers down a path that, in the long run, provides them with a better experience. This often manifests as people asking questions that seem absurd, but when you ask them how they got to that question, it's because they went down a path that was only available because a feature made it available, and had the feature not been there, they would have found their original problem easy to solve. (Nothing hammers this home more strongly than watching UX studies live in the lab.) The aforementioned Container is a great example of this.

I should elaborate on the last statement I made in my previous comment. There's more factors to consider than just syntactic cliffs and verbosity. Notably, I think removing verbosity by providing dedicated syntax that more directly reflects the developer's intent is a net win, even when it introduces a syntactic cliff. I think this is why having switch and if (and ?:) instead of just one of them is valuable. I think this is why getters and setters are valuable even though functions technically provide the same ability. I think it's why binary operators are a win over only exposing methods. I don't think it applies to => vs { return ; } and I don't think it applies to primary constructors as proposed here. You don't use a primary constructor if you mean one thing and a normal constructor otherwise. You don't use => if you mean one thing and { return ; } if you mean otherwise. You use them just to save characters.

To put it another way, => vs { return ; }, and single-quote-string vs double-quote-string, and primary constructors as proposed here, are all things that I would have dartfmt automatically use or not use based on whether it makes the code shorter or not, and IMHO we shouldn't have any features like that. They are redundant syntax that harm rather than improve readability, because they are two ways of saying the same thing (increasing cognitive load), not ways of expressing two different semantic things that happen to boil down to the same assembler (decreasing cognitive load).
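To make the comparison concrete, here is a minimal sketch of the two spellings being discussed. The function names areaArrow and areaBlock are made up for illustration; the two bodies are meant to be interchangeable.

```dart
// Two spellings of the same function; the names are hypothetical.
int areaArrow(int w, int h) => w * h;

int areaBlock(int w, int h) {
  return w * h;
}

void main() {
  // Both forms compute the same value.
  print(areaArrow(3, 4) == areaBlock(3, 4)); // true
}
```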

leafpetersen commented 1 year ago

@Hixie

There's more factors to consider than just syntactic cliffs and verbosity.

Yes, agreed.

They are redundant syntax that harm rather than improve readability, because they are two ways of saying the same thing (increasing cognitive load), not ways of expressing two different semantic things that happen to boil down to the same assembler (decreasing cognitive load).

Note that this applies equally to your counter-proposal. It introduces two different ways of introducing fields. And my opinion is that the approach you propose is much more damaging to readability.

With primary constructors, there is a single place in the class which always, if present, introduces new fields. And all of the declarations present there introduce new fields. And in many (most?) cases, all of the fields will be declared there. So in the common case, it makes the class more readable, and in other cases, it has minimal negative impact.
Contrast with declaring fields in arbitrary constructors. Now I have to look for all of the constructors in the class (which can appear anywhere in the class body) to figure out what fields the class has. Any given constructor has a bunch of declarations in it, some of which introduce fields, others of which don't.

So both proposals introduce redundancy, but the approach you propose is, in my judgement, worse for readability in all cases: slightly more verbose and less readable in the common case, and dramatically less readable in the less common case of a large class with multiple constructors. In other words, the primary constructor syntax seems to me to both scale down better (it's better in the minimal case) and to scale up better (it doesn't interfere with readability in large classes with multiple constructors).
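A sketch of the common case, assuming the class Point(int x, int y); form from the issue description. The proposed syntax appears only in a comment, since it is not part of the language as of this discussion; the runnable class is its current-syntax equivalent, in which each field name is written twice.

```dart
// Proposed form from the issue description (not valid Dart as of this thread):
//   class Point(int x, int y);

// Current syntax: each field is declared once and named again in the
// constructor's parameter list.
class Point {
  int x;
  int y;
  Point(this.x, this.y);
}

void main() {
  final p = Point(1, 2);
  p.x = 10; // fields are mutable, matching the issue's example
  print('${p.x}, ${p.y}');
}
```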

There is an additional factor, which is familiarity. Users coming from a number of other languages are already familiar with primary constructors. Novel syntax, on the other hand, always has a learning burden. This doesn't say we should never be novel - just that all other things being equal, familiar syntax is better syntax.

Hixie commented 1 year ago

this applies equally to your counter-proposal.

Yeah, I think I would prefer to not add either. The counter-proposal was just a proof-of-concept for a syntax that could more or less remove the same amount of verbosity without the cliff.

I agree that having multiple constructors use field. syntax would be a disaster. I retract that proposal even as a proof of concept. :-)

So in the common case, it makes the class more readable

I don't think this claim is substantiated. Intuitively, I would expect the opposite. I also expect people will use this a lot more than in just the common case, and that's the thing that really worries me about this syntax.

FWIW, studying the cases in examples-primary-constructor.txt, I suspect I would probably put this feature in the same bucket as extension and part of in the Flutter style guide, which is to say, avoid using. I could maybe see making an exception for cases where the primary constructor syntax results in the class having no body at all, where there's no implements or with, no generics, and all fields are final -- basically the simplest possible case, but those are pretty rare.

I still think solving this with macros is superior to solving it with dedicated syntax.

munificent commented 1 year ago

I really appreciate the discussion in this thread. It's always hard to evaluate syntactic sugar because, by definition, the feature exposes no new actual capabilities. Thus you're looking at learnability, usability and subjective beauty. Those are hard to weigh because, even ignoring the subjectivity, they are very different for different cohorts of users.

I need to write an essay about this somewhere, but I think we tend to have an overly simplistic model of user learnability. It's not that users have a fixed, finite, one-time budget of "learning units" that they spend a whole batch to learn a whole language in one go.

What I observe is that, instead, users have a recurring, incremental learning budget. Many users like learning new stuff, but there is a rate at which they can comfortably learn. When they approach a new language, the initial learning cost of the language before they can be productive determines how long before they get up to speed. A language with lower total complexity, but with an intertwined feature-set that requires users to learn all of it to be productive can be more alienating than an overall larger language that allows them to be productive while only knowing a subset.

Once a user has learned the whole language, many essentially have "free" learning budget available. Obviously, there are plenty of other things they can spend that on in life, but for many they'd be happy to have a steady pipeline of new language features that make them more productive. This means that, for existing users, there is essentially no ceiling for language complexity: as long as it grows as fast as they can absorb it, they're fine.

But for new users, that isn't the case, obviously. So we try to balance the complexity of the language by thinking about both of those cohorts, and also about how the language works when you only know some of it. I don't think it works well to lump those cohorts together or to think of language complexity as a fixed, one-time cost that is fully paid.

Also, new users are often coming from other languages and carry what they already know from those into a new language. In the limit, adding a feature to a language can lower the learning cost if most other languages have the same feature and users don't have to learn how to work around its absence. (For example, adding enums to Dart meant that users coming from C, C++, Java, and C# didn't have to laboriously learn the "enum class pattern".)
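For readers who have not met the "enum class pattern", a rough sketch of it next to the built-in feature. ColorClass and Color are made-up names, and this is only one common shape of the pattern.

```dart
// The hand-written "enum class pattern": a class with a private
// constructor and a fixed set of constant instances.
class ColorClass {
  final String name;
  const ColorClass._(this.name);

  static const red = ColorClass._('red');
  static const green = ColorClass._('green');
  static const values = [red, green];
}

// The built-in equivalent.
enum Color { red, green }

void main() {
  print(ColorClass.values.length == Color.values.length);
  print(Color.red.name); // the `name` getter on enums needs Dart 2.15 or later
}
```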

Treating learnability as progressive, incremental, holistic, and experienced differently by different cohorts of users is the best way I know to explain what we observe about the evolution of most languages in the wild (primarily that they keep getting bigger while everyone says they like them small).

I agree that simplicity is important and we strive for it as much as we can. But beauty and fitness for task are important too, and often at odds with that. A singular one-size-fits-all feature often does a mediocre job across all tasks. With syntactic sugar in particular, my experience has been that the learnability cost is much lower than learning new semantics (like library privacy or mixins) and the value in terms of expressiveness and subjective appeal can be high.

If we took away int-to-double, =>, this. parameters, async/await, implicit new and const, and extension methods, the language would be as expressive as it is today, but I don't think we'd have many users.

scheglov commented 1 year ago

+1 to using macros for this. With a macro that generates constructors, we can write fields as is, with no new syntax, and we can add documentation for each field, if necessary. I also like it when code is formatted nicely: with each field on its own line the formatting is better, and there is no doubt about whether, or how, the list of formal parameters-as-fields will wrap. We can make fields final, or not.

Cat-sushi commented 1 year ago

I think the essential purpose of primary constructors is to make simple things simpler. I'm afraid a macro is not as simple as a primary constructor, as @eernstg said.

While I believe we should use a feature like macros to address this use case

That might be possible, too. It would probably be somewhat more verbose.

jakemac53 commented 1 year ago

A macro would be more verbose but it would still eliminate the duplication of field names (in their declaration and the constructor), which is the primary brevity win here.

One additional advantage of primary constructors though is the ability to easily describe the shape of the constructor (which fields are positional, which are named, and which are required). Doing this in a macro is possible but likely requires additional metadata on each field. Or, you end up with a macro that annotates a constructor and generates the fields.

Macros have the advantage of both of these versions being allowed to co-exist, and being an opt-in only feature. It is also less stuff we have to own and implement ourselves. But then of course there is the possible disadvantage of having a more fragmented ecosystem, where you have to learn each macro. Although, if we do it right, macros are likely easier to learn than new language features, because you can go to the definition and it should have good documentation of exactly what it does (plus you can just see exactly the generated code). It is harder to search the language docs for a given piece of syntax that you don't understand.

xvrh commented 1 year ago

To make the previous comment of @jakemac53 a bit more visual, here is what it could look like:

Macro:

import 'package:constructor_macro/constructor_macro.dart';

@GenerateConstructor(positionalFields: true)
class Point {
  final int x;
  final int y;
  @named final int? z;
}

// and

class Point {
  @GenerateFields()
  Point(final int x, final int y, {final int? z});
}

(this is similar to Lombok https://projectlombok.org/features/constructor in Java)

Hixie commented 1 year ago

@munificent While I agree with what you said, I think I come at this from a different perspective. If we assume Dart will be a very successful language over the long term, then to a first approximation, all of our users will be new users, and for all of our new users it will be the first language they learn.

I also tend to think that users are exposed to features a lot quicker than we would like, preventing us from giving people a nice ramp-up to more complicated features, because they will often onboard with existing code (e.g. joining a team that has a codebase already, or using third-party dependencies, etc).

If we took away int-to-double, =>, this. parameters, async/await, implicit new and const, and extension methods, the language would be as expressive as it is today, but I don't think we'd have many users.

int-to-double costs very little because it's what people assume will happen anyway (that's why we added it, based on UX research showing people were getting confused when we didn't have it).

=> we discussed above, I am not convinced it's a net positive to the language.

this. parameters (and super. parameters) are an interesting case. I'd be curious to study them in a controlled setting to see whether they were a net benefit or not. I'm not sure. My intuition (which doesn't count for much vs UXR) is that they are relatively intuitive and don't have a high cognitive burden.

async/await are certainly a source of confusion but I think the confusion really comes from asynchronous programming in general and I expect the syntactic sugar here is a net win over using explicit Future APIs. I do think there is room for improvement here. For example, we could offer a mode where we show what they are equivalent to, similar to how we might show macro expansions. (FWIW, I expect FutureOr is a bigger source of confusion.)
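As a rough illustration of the sugar versus the explicit Future API, here is a minimal sketch. fetchValue, doubledAwait and doubledThen are hypothetical names, and the then-based version is only approximately what async/await corresponds to, not the actual lowering.

```dart
// Hypothetical stand-in for an asynchronous operation.
Future<int> fetchValue() => Future.value(20);

// With async/await.
Future<int> doubledAwait() async {
  final v = await fetchValue();
  return v * 2;
}

// Roughly equivalent code using the Future API directly.
Future<int> doubledThen() {
  return fetchValue().then((v) => v * 2);
}

Future<void> main() async {
  final a = await doubledAwait();
  final b = await doubledThen();
  print(a == b); // true
}
```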

implicit new doesn't have a cognitive load, it removes complexity. We should just remove explicit new.

implicit const has a minimal cognitive load once you have only implicit new, because it just becomes this annotation you can add that means the subtree of the AST from that point must be const, and then it just becomes a lint that you can remove redundant annotations, like with type inference.

extension methods I would remove. I think they are a mistake in any language. They are banned in Flutter's style guide.

I don't think we would lose users if we removed =>, explicit new, and extension methods; I think long term we would gain more. I think the same applies to the proposal in this issue.

Macros are, of course, a much bigger source of cognitive load than any of the above. They are, if we execute them well, an improvement over codegen which is what people end up using anyway if we don't have them, and they can be a big win if they enable transparent syntactic sugar (allowing developers to see exactly what the macro does). They are a very risky play though, as I've said before.

jodinathan commented 1 year ago

@Hixie Flutter's style guide doesn't banish == but it doesn't mean it is a good thing (it is not).

Extension methods are very good in the HTML world as we don't have access to the original class.
This may change with the upcoming inline classes, though.

=> is useful in simple cases.
I don't like having to type return all the time; I even wish I could replace return with =>, i.e. func() { foo(); bar(); => daz(); }

IMO I don't like your suggestions as it seems Dart would become old Java

sigmundch commented 1 year ago

implicit new doesn't have a cognitive load, it removes complexity...

Depends on who you ask :) - My reservation with implicit new is that, just like with type inference, we are now using transitive knowledge to understand the code we read. That hurts the user experience outside the IDE environment (think plain text editors, PRs, gerrit). In practice, for new, it didn't matter because of our strong style conventions - a capitalized call is not guaranteed to be a constructor call, but in practice, it is.

extension methods I would remove. I think they are a mistake in any language. They are banned in Flutter's style guide.

I'm surprised they are banned! Like @jodinathan said, these are essential in JS interop. They may look less like extension methods in the future with inline-classes, but the semantics for inline classes closely aligns with that of extension methods.
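For context, a minimal sketch of what an extension method does: it adds a member to a type whose declaration cannot be edited. The extension name WordCount and its getter are made up for illustration and have nothing to do with JS interop specifically.

```dart
// Hypothetical extension adding a getter to String, a class we cannot edit.
extension WordCount on String {
  int get wordCount {
    final trimmed = trim();
    return trimmed.isEmpty ? 0 : trimmed.split(RegExp(r'\s+')).length;
  }
}

void main() {
  print('primary constructors in Dart'.wordCount); // 4
}
```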

Cat-sushi commented 1 year ago

This verbosity below would kill the benefit of primary constructors.

import 'package:constructor_macro/constructor_macro.dart';

@GenerateConstructor(positionalFields: true)
class Point {
  final int x;
  final int y;
  @named final int? z;
}

// and

class Point {
  @GenerateFields()
  Point(final int x, final int y, {final int? z});
}

I would like to define

inline class implicit Weight(double _) implements double;

instead of

typedef Weight = double;

which doesn't introduce type checks.
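The point about typedef can be sketched as follows. Height and bmi are made-up names added for illustration. Because a type alias is just another name for double, a Height is accepted wherever a Weight is expected.

```dart
// Non-function type aliases require Dart 2.13 or later.
typedef Weight = double;
typedef Height = double;

// Hypothetical function; both parameter types are aliases for double.
double bmi(Weight kg, Height m) => kg / (m * m);

void main() {
  Weight w = 80.0;
  Height h = 2.0;
  // The arguments are swapped here, yet this compiles: both aliases are
  // simply double, so no distinct type is checked.
  print(bmi(h, w));
  print(bmi(w, h)); // 20.0
}
```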

Cat-sushi commented 1 year ago

I don't have a strong opinion that all kinds of classes should have primary constructors. But I think that verbose, primary-constructor-like syntaxes make little sense. I also think that, once data classes introduce primary constructors, there is no additional cognitive load from having primary constructors on other classes.

@Hixie, do you want to reject primary constructors on data classes as well?