Create a variant of omit_local_variable_types that permits non-trivial declared types

eernstg commented 2 years ago

[Edit Aug 2024: Changed the title: New lints are introduced, omit_local_variable_types is not modified.]

This issue is a proposal that omit_local_variable_types is mildened such that it only flags local variable declarations with a declared type in the case where the variable has an initializing expression with a type which is not 'obvious' (as defined below).

This is a non-breaking change, because it makes omit_local_variable_types flag a smaller set of situations.

It should still flag cases like int i = 1; (which should be var i = 1;), List<int> xs = [1, 2, 3]; (which should be var xs = [1, 2, 3];), C<num> c = C(1.5); where C is a class (it should be var c = C<num>(1.5);), C<double> c = C(1.5) (which should be var c = C(1.5), assuming that it infers as C<double>(1.5)),etc, because the initializing expression has an 'obvious' type.

But it should allow Foo<int> foo = someFunction(a, b()); even though that might also be expressible as var foo = someFunction<Whatever, It, Takes, To, Make, It, A, Foo_int>(a, b<Perhaps, More, Stuff>());. It should actually also allow it even in the case where the same thing is expressible as var foo = someFunction(a, b());, because the type of the initializing expression isn't obvious.

The motivation for doing this is the following:

The declared type of a local variable may be considered to be a simple, self-documenting, declarative approach to determine which actual type arguments to provide at various locations in the initializing expression. If, alternatively, we pass some actual type arguments in the initializing expression, we might infer the same declared type, but (1) it is not documenting the type of the variable (which means that it may be hard to understand how to use it correctly), (2) it is not simple (because we may need to look up various declarations in order to determine the type of the initializing expression), and (3) it is not declarative (because we provide feed-forward information in the initializing expression, rather than specifying the desired end result and relying on type inference to sort out the details). For example:

// Some other libraries contain various declarations that we're importing and using.
// The following declarations play that role.

Map<X, List<X>> f<X>(A<X> a) {
  var ax = a.x;
  return ax != null ? {ax: []} : {};
}

X g<X>(Map<X, List<X>> map) => map.keys.first;

class A<X> {
  X? x;
  A(this.x);
}

// We are using the above declarations.

void main() {
  // We can omit the variable type, because it is inferred.
  var v1 = g<int>(f(A(null)));

  // We can also declare the variable type.
  int v2 = g<int>(f(A(null)));
  ...
}

The lint omit_local_variable_types will flag v2, because the declared type is seen as unnecessary. The perspective that motivates this issue is that it is indeed possible to infer exactly the same declared type, but it may still be helpful for all of us to have that declared type when we're reading the code. In particular, we'll at least need to look up the declaration of g before we can even start guesstimating the type of v1. So the proposal here is to make omit_local_variable_types stop linting these non-obvious cases, thus allowing developers to document the type where it's actually needed.

(Interestingly, omit_local_variable_types does not flag int v2 = g(f(A(null)));, which seems to contradict the documentation of the lint, so maybe the lint is already doing some of the things which are proposed here. It clearly does not flag every local variable declaration that has a declared type which is also the inferred type of the initializing expression).

Note that it is assumed that omit_local_variable_types will never flag a declaration T v = e; where the declared type T differs from the type of e which is inferred with an empty context type, or where the type inference on e with an empty context type produces a different result (different actual type arguments, anywhere in e) than type inference on e with the context type T. This proposal is not intended to change anything about that, and it should not affect any of the issues where it is reported that omit_local_variable_types has a false positive, because it lints a case where it does make a difference. This proposal is only about not shouting at people when they have a declared type that may arguably be helpful.

Obvious Expression Types

In order to have a mechanically decidable notion of what it takes for a type to be 'obvious' we need a precise definition. The point is that it should recognize a number of types that are indeed obvious, and then all other types are considered non-obvious.

This means that omit_local_variable_types will allow (not force) developers to declare the type of a local variable with an initializing expression e when e has a non-obvious type, but it will continue to flag T v = e; when the type of e is obvious.

An expression e has an obvious type in the following cases:

e is a non-collection literal. For instance, 1, true, 'Hello, $name!'.
e is a collection literal with actual type arguments. For instance, <int, bool>{}.
e is a list literal or a set literal where each element has an obvious type, and all elements have the same type. For instance, [1, 2], { [true, false], <bool>[] }, but not [1, 1.5].
e is a map literal where all key-value pair have a key with an obvious type and a value with an obvious type, and all keys have the same type, and all values have the same type. For instance, { #a: <int>[] }, but not {1: 1, 2: true}.
e is a record literal where every component has an obvious type. For instance, (1, enabled: true), but not (1, enabled: foo()).
e is a local variable that has not been promoted. *For instance, int i = 1; then var x = i;.
e is an instance creation expression whose class part is not raw. For instance C(14) if C is a non-generic class, or C<int>(14) if C accepts one type argument, but not C(14) if C accepts one or more type arguments.
e is a cascade whose target has an obvious type. For instance, 1..isEven..isEven has an obvious type because 1 has an obvious type.
e is a type cast. For instance, myComplexpression as int.
e is a conditional expression where both branches have an obvious type, and they have the same type. For instance, condition ? 'yes' : 'no'.
e is a type test. *For instance, myVariable is String.
e is a throw expression. For instance, throw myError.
e is a parenthesized expression where the contained expression has an obvious type.
e is this.
e is a type literal. For instance, String.

Discussion

We may need to add some more cases to the definition of an obvious type, because some other types may actually be obvious to a human observer. However, it is not a big problem that some types are obvious but are not recognized as such, because this means that developers have the freedom to declare a type when they want to do that. If the set of obvious types is enlarged later on, it will be a breaking change, but if the given type is actually obvious then there will be few locations where the new, more strict, version of omit_local_variable_types will flag a declaration, because few developers have actually declared the type in that case. So the changes that are actually breaking are also the ones that may be questionable, in which case the type might not be so obvious after all.

One outcome that we could hope for if omit_local_variable_types is mildened as proposed here is that this lint could become a broader standard (for example, it could be a member of the recommended set of lints), because a larger percentage of the community would be happy about the way it works.

Versions

1.0, Jun 29, 2022.
(Small adjustments over time, no detailed description.)
2.0, Aug 6, 2024: Adjust the rules about obviously typed expressions. Collections and maps must satisfy that all elements have an obvious type, not just some elements. A type cast (e as T) is now considered obviously typed.
2.1, Aug 7, 2024: Add a "converse" lint specify_nonobvious_local_variable_types, and broaden the notion of what it takes to be an expression that has an obvious type.

bwilkerson commented 2 years ago

I agree that the results of type inference can sometimes be difficult for a human reader to predict, and that this can significantly impact the user experience. And I agree that adding explicit type annotations is one way to solve that issue by removing the need for the reader to predict the results. My concern with this proposal is that this lint currently enforces the style guide, and I wouldn't want to change the lint without changing the style guide to match.

eernstg commented 2 years ago

@bwilkerson wrote:

this lint currently enforces the style guide, and I wouldn't want to change the lint without changing the style guide to match.

That would indeed be nice! The whole point with this issue is that it can be counterproductive to ban all declared types on local variables, because some of them are actually helpful (good for code readability and maintainability). I'm just asking that developers are allowed to use good judgment in this matter.

By the way, the current version of omit_local_variable_types does not lint every declaration of the form T v = e; where var v = e; yields T as the inferred type and doesn't change the semantics of e, it only lints some of them:

// Define `f`, `g`, and `A` as before.

int v2 = g(f(A(null))); // No lint.

I don't know exactly where the threshold is, but my proposal is simply to omit the lint in a larger set of cases, such that we only push developers towards var v = e; when the type of e is obvious. The proposed change will never push them in the opposite direction, so anyone who wants to use var v = e; where the type is completely beyond comprehension is free to do so.

I'm afraid we can't say that this is a bug, because the lint is implemented in a way that corresponds pretty well to the documentation and the style guide. So I'll remove the 'bug' label for now. But I'll be very happy if we decide to change the recommended approach slightly as proposed here, and in that case it will become an implementation request.

bwilkerson commented 2 years ago

Sounds good.

@munificent Just FYI.

munificent commented 2 years ago

I actually quite strongly prefer the current guidance in Effective Dart. I wouldn't support the lint deviating from it. My arguments are:

Billions of lines of correct code are written and maintained in dynamically typed languages where no variables are annotated. It can't be that necessary to allow redundant type annotations on local variables.
Most statically typed languages have or are moving to inferring locals. It has always been idiomatic in Scala, Kotlin, Go, Swift, and many other languages I'm not thinking of to omit the types on all locals. C#, Java, and even C++ (where type annotations actually change behavior) have moved decidedly towards it.
This has been the well established convention for many years in Dart with very few complaints aside from a small number of people (who seem to disproportionately be on the Dart team). All Google internal Dart code has been expected to omit type annotations for all correctly inferred local variables for years. It's a requirement to earn readability. This has never showed up as a major style complaint.
I have worked on several teams across multiple languages as they have gone from fully type annotated locals to type inferred locals. Every time, I have seen a number of users feel extremely uncomfortable with omitted types. They would eventually grudgingly accept omitting types on "obvious" initializers. Over time their definition of "obvious" would grow and grow until eventually, they get tired of remembering which initializers are obvious and just use it on all of them. I have never seen a team go backwards away from inferred locals towards more types.
My experience is that type annotations on locals tend to encourage worse names for local variables.
Scope of local variables almost always is and certainly should be small. If you can't understand a short function without redundantly type annotating a local variable, you likely have bigger readability problems and should focus on those.
Users will argue about the heuristics for which kind of names are "obvious" until the end of time. (This is why the annotation guideline for non-local variables is deliberately vague and leaves it up to individual taste.)
Trivial refactorings will often change an initializer from "obvious" to "non-obvious". For example, turning a named constructor into a static method with the same name should according to these rules require every local variable that calls that method in its initializer to be fixed to have a type annotation. Conversely, when refactoring a static method to a named constructor, this would require users to go back and remove all of the type annotations.
Allowing redundant type annotations on locals makes it harder to notice non-redundant type annotations. I think it's more valuable to have upcasts stand out visually.

I think we should just leave the lint and rule as they are.

eernstg commented 2 years ago

@munificent, this is of course a topic where we disagree substantially, so I'll have to comment on it. ;-)

Billions of lines of correct code are written and maintained in dynamically typed languages

But I wouldn't expect you to argue that we should drop all static typing? I thought an argument against this proposal should be something which is specific to local variables.

Most statically typed languages have or are moving to inferring locals

Sure, and I would certainly expect that to be the typical approach in Dart. I even think we should nudge people in that direction in as many cases as possible when the type doesn't provide useful information, that is, when the type is obvious in the initializing expression.

If you can't understand a short function without redundantly type annotating a local variable ..

So what's the type of var y = foo(x);?

No, I'm not going to tell you which type arguments foo accepts, and even the tools aren't going to tell you how type inference proceeded to infer types for that expression. The point is that you need to look it up, and foo is of course in some other package. (OK, you can hover in an IDE, but we do encounter code in other contexts as well.)

So the type is not redundant for a reader of the code, and I don't understand why we need to shout at people who prefer to inform readers about such non-obvious types, in some carefully selected particularly complex situations.

Users will argue about the heuristics for which kind of names are "obvious" until the end of time

Any user who wishes to omit the type of a local variable can freely do so because the lint will never ask for the type to be added, it will only ask to get the type removed (when it's obvious).

So the only case where anyone would want to change the heuristic is when they want to have a type annotation in a case where the lint says that the type is obvious.

I don't think that's very likely to happen, presumably the obvious types are obvious, and will remain so.

turning a named constructor into a static method with the same name should according to these rules require every local variable that calls that method in its initializer to be fixed to have a type annotation

No, we never require a type annotation, we just stop complaining about a type annotation which isn't obvious.

But it is true that the opposite change (static method turns into constructor) would cause the lint to ask for the type annotations to be removed, if any.

harder to notice non-redundant type annotations

That's an interesting point!

It could be useful to be able to detect the cases where the specification of a declared type changes the run-time semantics of the initializing expression (because type inference, coercion, generic function instantiation, etc. have a different outcome).

However, I don't see how a mere upcast would be more valuable to document than a non-obvious type. After all, if we seriously don't care about the type then why is there an upcast in the first place? And if we do care about the type then why must it be impossible to document that type just because a potentially very complex type inference, coercion etc. step produced exactly the same type as the one that we wish to document?

munificent commented 2 years ago

If you can't understand a short function without redundantly type annotating a local variable ..

So what's the type of var y = foo(x);?

In:

bar(foo(x))`;

What's the type of argument passed to bar()?

No, I'm not going to tell you which type arguments foo accepts, and even the tools aren't going to tell you how type inference proceeded to infer types for that expression. The point is that you need to look it up, and foo is of course in some other package. (OK, you can hover in an IDE, but we do encounter code in other contexts as well.)

All of this applies to my example too. We seem to be perfectly happy with not seeing type annotations on subexpressions, so I don't see a very compelling argument for them on local variables.

I don't understand why we need to shout at people who prefer to inform readers about such non-obvious types, in some carefully selected particularly complex situations.

The linter shouts at people for lots of reasons that absolutely don't matter at all when it comes to understanding their code. The answer for that is almost always that consistency has its own value. If we let people redundantly annotate local variables, then readers will be presented with some code that chooses to do that and some code that chooses to not. That difference will get their attention and distract them from other aspects of the code. They may wonder, "Is this an upcast?" (Or worse, it will be an upcast and they won't realize it because they think it's just a redundant annotation.) They may wonder why the author chose to annotate this one and not others nearby. Was there a reason for that, or was it just that one author had one preference and other authors of nearby code had others?

Overall, I haven't seen any compelling evidence that allowing variation here carries its weight.

Any user who wishes to omit the type of a local variable can freely do so because the lint will never ask for the type to be added, it will only ask to get the type removed (when it's obvious).

They can opt out of annotating in code they write, but they can't opt out of being presented with code written by others who made a different choice.

I don't think that's very likely to happen, presumably the obvious types are obvious, and will remain so.

I have literally watched this happen on large teams multiple times over my career, including in Dart.

I don't see how a mere upcast would be more valuable to document than a non-obvious type.

Because it tells the reader that the variable's type is not what they would intuit from the initializer expression.

After all, if we seriously don't care about the type then why is there an upcast in the first place?

So that the variable can be assigned a value later whose type isn't a subtype of the initializer's type. So seeing the type annotation on the local sends a useful signal of "hey this is going to be mutated later to a different type".

And if we do care about the type then why must it be impossible to document that type just because a potentially very complex type inference, coercion etc. step produced exactly the same type as the one that we wish to document?

It's not impossible. This is just a lint. But the lint does reflect what we think leads to the best, most consistently readable code across the ecosystem.

eernstg commented 2 years ago

In:
bar(foo(x));
What's the type of argument passed to bar()?

[it is not trivial to compute such types]

All of this applies to my example too.

Certainly, but the point is that software engineering quality includes readability and comprehensibility as core elements, and one of the foundational tools used to improve on the readability of a snippet of code is to split up large expressions by evaluating subexpressions and storing their value in local variables.

So let's say that bar(foo(x)) is sufficiently complex to justify some refactorings. The first one would be this:

var name = foo(x); // Where `name` is chosen such that it helps the reader.
... bar(name) ...

One difficulty with this kind of refactoring is that the type inference process works differently when foo(x) has no context type, so we may have to add a declared type to name in order to preserve the semantics:

SomeType name = foo(x);
... bar(name) ...

However, the declared type can be helpful for a reader of the code, because it helps them understand the meaning and purpose of name, and that could make it easier to read and understand expressions like bar(name).

I'm simply saying "let's stop shouting at the developer" in the case where SomeType has been declared for readability reasons.

If we let people redundantly annotate local variables, then readers will be presented with some code that chooses to do that and some code that chooses to not.

This would be a likely outcome if half of the community enables omit_local_variable_type and the other half enables always_specify_types.

What I'm proposing is that the people who agree with me that documenting local types can be good for readability in a few cases can enable omit_local_variable_type, because it won't shout at them when they do want a type annotation.

But in their daily work they'll have lots of little nudges in the direction of omitting that type, because there are so many cases where the type is obvious, and omit_local_variable_type will (rightfully) complain in those cases.

So the few local variables with a type will stand out, and it's a rather safe bet that the developer who wrote it wanted to have the type in that particular case because they needed to think hard about that particular type when they wrote the code, and then it's probably helpful to have it there when we read the code.

So the proposal will very specifically not nudge developers in the direction of writing the types just because that's a habit, each local variable with a type is there because someone actively made that declaration different from most others.

They can opt out of annotating in code they write, but they can't opt out of being presented with code written by others who made a different choice.

Today, those others would simply not enable omit_local_variable_type, and then there's no limit on the number of local variable types in their code.

With this proposal, they might accept enabling omit_local_variable_type because it allows them to use type declarations when they are helpful, and the net result would be that the community as a whole would move in the direction of omitting the declared type most of the time, rather than being more or less divided.

But the lint does reflect what we think leads to the best, most consistently readable code across the ecosystem.

We don't quite agree. ;-)

It just rubs me the wrong way to force developers into lowering the software engineering quality of their code, just because we refuse to give them a little bit of extra freedom to use human judgment when needed.

goderbauer commented 2 years ago

So far, omit_local_variable_type hasn't made the cut for the recommended set of lints in package:lints because some feel it is too strict and removes type annotations that are actually useful in understanding the code (see also https://github.com/dart-lang/lints/issues/24). I have to read a lot of code in environments (e.g. on github) where I don't have an IDE handy that can infer the types for me - and I certainly appreciate the inclusions of types there. I find that reviews of google-internal code are actually harder to do due to the missing types because I often have to context switch to look up the non-obvious return type of a function.

The proposal in this issue sounds like a nice compromise to me between never specifying types and always specifying types since the latter can get tedious in the obvious cases as well. If a middle ground is implemented, it would probably be easier to get that included in package:lints and push people towards removing type annotations from more lines of code, where they are not hindering readability.

pq commented 4 months ago

I think this is worth getting some real world feedback on. Given @goderbauer's endorsement, we might even see some traction in flutter which would be fantastic. I'd support an experimental lint implementation that can be experimented with and iterated on. Ideally with links to the relevant conversations and a caveat in the docs.

eernstg commented 4 months ago

https://dart-review.googlesource.com/c/sdk/+/374601

bwilkerson commented 4 months ago

@johnniwinther

natebosch commented 4 months ago

IIRC at least some of the folks who did not want to use omit_local_variable_types were also opposed to leaving the decision up to authors and reviewers since that can cause churn durning reviews. We should make sure that folks are OK with many type annotations being up to author/reviewer discretion in a lint that only considers a subset of local variables. cc @devoncarew @matanlurey

goderbauer commented 4 months ago

were also opposed to leaving the decision up to authors and reviewers since that can cause churn durning reviews.

Yes, I for one would strongly prefer that we can fully automate where types need to go and where not to avoid back and forth in code reviews about this.

matanlurey commented 4 months ago

@natebosch:

We should make sure that folks are OK with many type annotations being up to author/reviewer discretion

I think if we can get to a point where types are almost always omitted, but not flagged in some cases, that's good enough (I believe that's @eernstg's prototype above). I do wonder if it's just easier to push us on to omit_local_variable_types though.

@goderbauer:

I ... prefer that we can fully automate

There isn't a world where we are going to write the perfect the lint that will be universally adopted with the exact definition of "ambiguous" type, so we either need to be pragmatic (allow code-reviewer/author-driven type annotations) or we need to be dogmatic (just use omit_local_variable_types).

In other words, I understand your request @goderbauer - I don't think it's feasible.

Consider the following cases:

// Should this be OK?
var x = 5;

// Should this be OK?
int x = 5;

T identity<T>(T value) => value;

// Should this be OK?
var x = identity(5);

// Should this be OK?
int x = identity(5);

// Should this be OK?
var x = identity<int>(5);

// Should this be OK?
int x = identity<int>(5);

This is a tiny subset of the cases that will need to be considered. For example in @eernstg's prototype:

  test_10_newGeneric_ok() async {
    await assertNoDiagnostics(r'''
f() {
  A<num> a = A<int>();
}

class A<X> {}
''');
  }

This doesn't generate a diagnostic, but AFAICT neither does this:

  test_10B_newGeneric_withVarTypes() async {
    await assertNoDiagnostics(r'''
f() {
  var a = A<num>();
}

class A<X> {}
''');
  }

In other words, authors/reviewers now have a choice between multiple styles.

eernstg commented 4 months ago

Those points are well taken!

@natebosch wrote:

... leaving the decision up to authors and reviewers .. can cause churn durning reviews

Creation of software has many degrees of freedom, and we already entrust authors and reviewers with a lot of freedom to make good decisions about a huge number of things. That's the way it should be! For example, they make choices all the time about which local variables to create in the first place:

void main() {
  // Approach 1.
  var x = someExpression;
  foo(x);

  // Approach 2.
  foo(someExpression);
}

The difference does matter (it affects the readability of the code, it may change the type of someExpression, etc), and it introduces (going from approach 2 to 1) or eliminates (1 to 2) the question about whether or not to specify a declared type for that local variable.

However, I don't see us introducing rigid rules about preferring approach 1 over 2 or vice versa, based on the complexity of the expression (or indeed based on anything).

Similarly, I think we can and should rely on good human judgment about local variable type annotations. Surely we can handle that!

a lint that only considers a subset of local variables.

We already have a situation where omit_local_variable_types, if enabled, will allow a mixture of local variable declarations with and without a type annotation.

num i = 1; // No complaints from `omit_local_variable_types`.

omit_obvious_local_variable_types will just broaden the set of cases where the choice is left to the developer.

@goderbauer wrote:

I for one would strongly prefer that we can fully automate where types need to go

I agree with @matanlurey here, I honestly don't think that's realistic. We could approach it from above (saying "take away this type annotation" in cases where it is considered obviously unnecessary) and from below (saying "add a type annotation here" for whatever reason, e.g., "the initializing expression has a non-obvious type").

omit_obvious_local_variable_types is intended to perform the approximation from above, in a way that should not be controversial, based on the assumption that we can all (or nearly all ;-) agree that the expressions whose type is considered trivial do actually have a type which is easy to see locally.

I suspect that the approximation from below would be much more difficult to support mechanically. To give a silly example, we could single out local variable declarations whose initializing expressions have at least 3 of the following dependencies:

It is an invocation of a generic function/method from another package.
Type inference has provided an actual type argument somewhere in this expression which is not mentioned explicitly (so foo() is inferred as foo<C>() and a textual search for C in the surrounding code brings up nothing).
The type of this expression depends on the type of one or more promoted variables.
The type of this expression relies on the least upper or lower bound of several expressions.
This expression contains an invocation of an extension member r.m, and (1) the type R of the receiver r was inferred, and (2) there were multiple accessible and available extension members named m (so if R had a slightly different value we might call a completely different piece of code).

There are lots of ways to establish delicate typing related dependencies, and I'm just saying that we shouldn't yell at people who want to narrow the ambiguity a some situations by declaring the type of a local variable.

@matanlurey wrote:

In other words, authors/reviewers now have a choice between multiple styles.

Indeed! But I'd say that this is inherently a property of software construction and maintenance, and we should be able to handle it also in the particular case of type annotations on local variables. Moreover, omit_local_variable_types also permits both A<num> a = A<int>(); and var a = A<num>();.

matanlurey commented 4 months ago

To be clear my comments about choice seeming like a bad thing are in the context of wanting a single enforceable lint that does everything.

(In my own projects I happily use omit_local_variable_types with no issue, but having a omit_local_variable_types_sometimes if it doesn't have opinionated heuristics about when types are required seems like a reasonable workaround for the Dash org).

davidmorgan commented 3 months ago

The fact that the lint permits num i = 1 is a carefully chosen feature :) ... it means that the only use of type annotations on the LHS is for upcasts; because upcasts are statically safe it's nice to have a way to write them that's distinct from the only-checked-at-runtime as.

This lint is pretty much a cornerstone of google3 Dart style after being enforced 5 years ago, if there is a need for a variant with heuristics then I do think a new lint would be better than a change to this one.

Thanks.

eernstg commented 3 months ago

The fact that the lint permits num i = 1 is a carefully chosen feature :)

Certainly, and this property is preserved. Note, however, that omit_local_variable_types does not recognize many situations where a declared type matters, which means that the same feature could be broadened substantially.

List<X> listOfWhatever<X>() => <X>[];

void main() {
  double d = 1; // LINT.
  List<int> xs = listOfWhatever(); // LINT.
}

if there is a need for a variant with heuristics then I do think a new lint would be better than a change to this one.

Right, and https://dart-review.googlesource.com/c/sdk/+/374601 is indeed a separate lint whose name is different (omit_obvious_local_variable_types).

eernstg commented 3 months ago

omit_obvious_local_variable_types is now available in the bleeding edge SDK, with the following analysis_options.yaml:

linter:
  rules:
    - omit_obvious_local_variable_types

Feedback is welcome, of course!

devoncarew commented 3 months ago

FWIW, this is firing in fewer situations that I would expect (i.e., final String name = resultObject['name'] as String; doesn't show a lint).

eernstg commented 3 months ago

final String name = resultObject['name'] as String; doesn't show a lint

Good point, if the initializing expression is a cast then the type is stated explicitly.

The idea is that "obvious" types should be approximated from below, that is, the set of expressions whose type is considered obvious according to omit_obvious_local_variable_types should only include expressions whose type is essentially universally accepted as being obvious. I think the cast expression is a good candidate for inclusion.

One thing to note is that e may be a complex expression in e as T, and the structure of the expression may be such that it isn't obvious that as T applies to e as a whole (for instance, 1 + 1 as double might mean (1 + 1) as double or 1 + (1 as double), and a reader might need to stop and think in order to know which one it is).

natebosch commented 3 months ago

and a reader might need to stop and think in order to know which one it is

In those cases the reader would benefit more from some parenthesis than a redundant type on the left. I think any e as T should be considered an obvious type regardless of the content of e.

eernstg commented 3 months ago

Makes sense!

I'm gathering this kind of input. I'll create a CL to include these generalizations when we have sort of a common understanding that they should be included (that is, we've had a little time to think about it, and nobody is complaining loudly ;-).

Hixie commented 3 months ago

This is great!

One problem I ran into with this is that ambiguous numeric literals (e.g. 1) want to not have an explicit type, but that's quite misleading, e.g.:

var w = 0.8;
var x = 0.9;
var y = 1;
var z = 1.1;

...does not have four variables with the same types. I think we should not consider integer literals to imply a type. It can introduce weirdly subtle bugs.

Hixie commented 3 months ago

Consider for example a case like:

for (var x = 0; x < size.width; x += 8) {
  // what type is x?
}

You want x to be a double in this case, but instead it's an int.

Given that int and var are the same length, it seems especially harmless for us to just use int here.

Hixie commented 3 months ago

I noticed it doesn't ask for the type when you omit it for non-obvious cases. Is that intentional? Is there another lint I should be adding to require types in non-obvious cases?

bwilkerson commented 3 months ago

IMO it would be better for users to have a single lint that enforced the whole style rather than to have two lints. Having two lints would (a) make it too easy to enable one and forget to enable the other, and (b) make it easier for the results to be inconsistent. Unless, of course, we have evidence that enough users want one and not the other, but I don't expect that to be the case.

natebosch commented 3 months ago

Yes, it is intentional that the new lint only restricts some type annotations, and does not require them.

We do not have any proposal for a lint that fully restricts and requires type annotations against any heuristic. We expect that trying to develop such a lint is unlikely to succeed and I'm not aware of anyone putting effort towards it.

Hixie commented 3 months ago

Is there some way to have a lint that's the opposite of "omit_obvious_local_variable_types"? It's not clear to me how I would use "omit_obvious_local_variable_types" if all it does is complain when someone adds a type -- adding a type is really never the problem, it's omitting the type when it's not obvious that's the problem. At least for me.

Such a lint would not require any additional design work on the heuristic, it would just literally be the opposite of what we have here. Like "always_specify_types" but minus the cases that "omit_obvious_local_variable_types" doesn't like.

natebosch commented 3 months ago

I think it's unlikely that require_nonobvious_local_variable_types defined as the inverse of omit_obvious_local_variable_types would get usage. It would be too noisy since omit_obvious favors false negatives over false positives.

I have not seen any specific proposals for a more conservative heuristic requiring types.

Hixie commented 3 months ago

The only use case I'm aware of for this lint is folks who want to require types everywhere, but want to reduce the verbosity this introduces where types are redundantly duplicated. They can't use omit_obvious_local_variable_types as implemented since it doesn't require types, just disallows them some places.

Can you talk more about who would use this lint as designed? What use case is it solving?

natebosch commented 3 months ago

As of today, we don't have any known user for omit_obvious_local_variable_types as designed. AFAIK the plan is to keep it experimental until it finds a user, or we remove it. We're experimenting with lints that may help ease some folks concerns with dropping always_specify_types or other folks concerns with not adopting omit_local_variable_types. We've of course heard the requests for a lint that makes a decision for 100% of local variables (and isn't either of those lints), and we remain skeptical that such a lint design is feasible.

If someone wants to implement a fix for omit_obvious_local_variable_types (the fix for omit_local_variable_types should be a good start) it would be easy to run fixes for this lint across a repo currently using always_specify_types to gather evidence about whether such an inverse lint design would be worth implementing.

bwilkerson commented 3 months ago

The fix for omit_local_variable_types is already associated with omit_obvious_local_variable_types, so there shouldn't be any work required to be able to apply the fix.

Hixie commented 3 months ago

A version of omit_obvious_local_variable_types changed as I described above (disallow obvious types, require unobvious types, consider integer literals unobvious) would be very likely to be adopted by flutter/* across the board as a replacement for always_specify_types.

natebosch commented 3 months ago

The fix for omit_local_variable_types is already associated with omit_obvious_local_variable_types, so there shouldn't be any work required to be able to apply the fix.

Ah, indeed it is. My SDK was a few weeks out of date and I had the lint but not the fix. It's working great on the latest SDK.

would be very likely to be adopted by flutter/* across the board as a replacement for always_specify_types.

It should be easy to apply this lint in any repo you'd like to gather evidence to support such a design. I don't think I've seen other advocating for it so far.

Hixie commented 3 months ago

I've been testing it locally (that's what the feedback above is based on). Can you elaborate on what evidence and advocacy you'd like to see? I'm not sure I understand what you mean.

Currently Flutter uses always_specify_types, but there's a lot of desire within the team to be able to omit "obvious" types to reduce unnecessary verbosity.

natebosch commented 3 months ago

Can you elaborate on what evidence and advocacy you'd like to see?

Anything that would convince more folks that this design is worth pursuing.

Hixie commented 3 months ago

"Flutter would use this" has historically been sufficiently convincing, is that no longer the case?

johnniwinther commented 3 months ago

@Hixie what you're looking for might be what we want for the CFE, a "top-down" approach where types are used unless they are implied by (type) context. This means that you would have types on local variables, but not necessarily on literals, constructor invocations and closure parameters:

class Foo<T> {}
method() {
  int i = 1; // There is no context for the declaration, so the type is required.
  List<int> list = []; // The type argument for the literal is implied by the context `List<int>`.
  Foo<int> foo = Foo(); // Type type argument for the constructor invocation is implied by the context `Foo<int>`.
  list.forEach((e) => print(e)); // The parameter type for `e` is implied to be `int` by the context.
  List<num> numList = list.map<num>((e) => e * 2).toList(); // The parameter type is implied but the type argument is not.
  Set<num> set = <int>{0}; // The implied type argument would be `num` so `int` is required and therefore allowed.

This rule cannot currently be implemented by lints because they currently don't have access to the inference engine. This, though, is a general deficiency of the current linter. I think users think of Dart type in terms of what can/should be inferred, which I think the omit_obvious_local_variable_types is an example of. The proposed heuritic rules for omit_obvious_local_variable_types to me seems like a work-around of this deficiency. Ideally the linter should be able to use the query "would this type (argument) have been inferred if omitted?".

eernstg commented 3 months ago

@Hixie wrote:

I think we should not consider integer literals to imply a type.

Good point.

Here's the current situation: omit_obvious_local_variable_types will not flag double i = 1; (as opposed to omit_local_variable_types which will flag it). This means that it is possible to use integer literals with static type double as local variable initializers. So nothing stops us if we want to document this.

For the converse, omit_obvious_local_variable_types will flag int i = 1;, and developers would then be nudged in the direction of using var i = 1;.

This means that developers would have to use the following rule-of-thought in order to determine the type of a local variable whose initializer is a integer literal: "var means int, double means double".

I do recognize that we could change this to "int means int and double means double" by taking integer literals out of the set of expressions whose type is considered obvious.

What do you all think, do we want to allow int i = 1; because the integer-to-double coercion exists, and var i = 1; doesn't make it explicit that it isn't applied in that particular location?

eernstg commented 3 months ago

@bwilkerson wrote:

IMO it would be better for users to have a single lint that enforced the whole style rather than to have two lints.

Arguably we can (and should?) support several styles:

omit_obvious_local_variable_types alone: Allow var every time, as today with omit_local_variable_types, and push developers in the direction of var by flagging the "obvious" cases. This could be called Avoid declaring the types of local variables unless you really want to do it.
A new specify_nonobvious_local_variable_types alone: Allow explicitly declared types everywhere, and push developers in the direction of explicitly declared types by flagging some "non-obvious" cases. This could be called Declare the types of local variables unless it's obvious.
Both: Push developers in the direction of declaring non-obvious types and using var in obvious cases. A substantial set of cases are left open to human judgment, in which case neither var nor a declared type is flagged. This could be called Declare the types of local variables iff it is non-obvious.

I agree with @natebosch that it is unlikely to be possible to mechanically detect exactly which types are "obvious" and which ones are "non-obvious". Hence the distance between the two lints.

We could of course provide the behavior of both together as a single lint, but I'm not convinced that this would be a better service to the community than offering the option to choose any of the above mentioned variants.

In any case, it's likely to be listed in standardized versions of analysis_options.yaml, which means that developers won't type those lint names over and over again, they'll just use the setup which is standard for their organization. So it probably doesn't make anything less convenient or less safe that they are using two lints rather than one (if they are indeed using both).

bwilkerson commented 3 months ago

... it is unlikely to be possible to mechanically detect exactly which types are "obvious" and which ones are "non-obvious".

I don't understand that claim. If we define "non-obvious" to be the Boolean negation of "obvious", then I think that gives us exactly the dichotomy we need for the options in your list to make sense.

What I can understand, and agree with, is that it's impossible to define "obvious" in a way that all users will agree with it. The best we can hope for here is to have something that is acceptable to a majority of users. Otherwise our only recourse is to allow users to configure what they (individually or corporately) mean by "obvious", which is a path we don't want to go down.

I tend to think that three options is overkill, but if we really think that users want that many options then I'd at least want a single method/function to compute "obvious" and to use the negation of that for the non-obvious case. If we don't do that, then the definitions of "obvious" and "non-obvious" will diverge over time leading to either false negatives or unsatisfiable constraints, neither of which is good for users.

eernstg commented 3 months ago

I don't understand that claim. If we define "non-obvious" to be the Boolean negation of "obvious"

That was not the intention. I think there will be too many situations where the outcome is unnecessarily rigid, so I'm aiming for an approach where some expressions are considered obviously typed, other expressions are considered to have a type which is obviously non-obvious :grinning:, and the remaining expressions are handled by developers who will use human judgment to arrive at a choice.

I don't think it will be too hard to avoid contradictions because the characterization of "obvious" and "obviously non-obvious" will be similar and not very complex. But I do agree that we must consider every contradiction to be a bug, should they ever arise.

That said, it would of course be very easy to eliminate the slack and just require (softly, because it's a lint) that every local variable with an initializing expression must have a declared type unless the initializing expression has an obvious type.

eernstg commented 3 months ago

@johnniwinther wrote:

Ideally the linter should be able to use the query "would this type (argument) have been inferred if omitted?".

Agreed, the linter should have access to more analysis power in many cases, and this particular kind of query would be great to have for omit_local_variable_types.

However, we may not need it for omit_obvious_local_variable_types, because almost every expression whose type is considered obvious has the property that inference makes no difference.

Consider the cases in a hint of a proof by induction:

e is a non-collection literal: An integer literal with static type double is not considered to have an obvious type; an integer literal with static type int has an obvious type, and it will still have type int if the context type is changed to _.
e is a collection literal with actual type arguments: Inference will not change the type of that collection literal.
e is a list literal or a set literal where at least one element has an obvious type, and all elements have the same type: The elements with an obvious type will (by the induction hypothesis) not have a type that depends on type inference. The remaining elements are not covered by this analysis.
e is a map literal where at least one key-value pair has a key with an obvious type and a value with an obvious type, and all keys have the same type, and all values have the same type: Similar to lists and sets.
e is an instance creation expression whose class part is not raw: The class is then non-generic, or it is generic and actual type arguments are specified explicitly; the type of such an expression does not depend on type inference.
e is a cascade whose target has an obvious type: The type of the expression is determined by its target, whose type (by the induction hypothesis) doesn't depend on type inference.

The interesting case is something like the following:

X whatever<X>() {
  print(X);
  throw 0;
}

void main() {
  var xs = [1, whatever()];
}

The invocation of whatever will receive the actual type argument dynamic and the list will have inferred type List<dynamic>. This means that if we change the declared type of xs to List<int> then the invocation of whatever will receive the actual type argument int, and the type of the list will be List<int> (so there is no error).

I'm changing omit_obvious_local_variable_types to be slightly more selective in the collection and map cases in https://dart-review.googlesource.com/c/sdk/+/378941, which will also change e as T to be obviously typed.

With that, we can safely claim that obviously typed expressions have a static type that doesn't depend on inference, so we don't need to answer the question "would this type (argument) have been inferred if omitted?".

bwilkerson commented 3 months ago

Ideally the linter should be able to use the query "would this type (argument) have been inferred if omitted?".

I also agree that I'd like the linter to be able to ask "what if" questions like this. I believe that's largely the impetus behind the "wolf" analysis work that @stereotype441 started.

But in this case that question will only help in terms of being able to flag explicit type annotations that aren't required. It won't help in the opposite case of type annotations that ought to be provided but aren't.

eernstg commented 3 months ago

It won't help in the opposite case of type annotations that ought to be provided but aren't.

True, that's what I'm creating specify_nonobvious_local_variable_types for. I won't get around to it before tomorrow, though.

eernstg commented 3 months ago

Oh, and specify_nonobvious_local_variable_types would also not need to be able to answer the question "would this type (argument) have been inferred if omitted?" because this is a lint that puts the emphasis on detecting that an expression has a non-obvious type—and we don't care whether or not type inference would produce the same type, because that particular class of expressions do not have simple type inference anyway.

natebosch commented 3 months ago

I think users think of Dart type in terms of what can/should be inferred, which I think the omit_obvious_local_variable_types is an example of. The proposed heuritic rules for omit_obvious_local_variable_types to me seems like a work-around of this deficiency. Ideally the linter should be able to use the query "would this type (argument) have been inferred if omitted?".

Querying whether a local variable type would be inferred by the compiler is exactly the intent of omit_local_variable_types. There are implementation gaps, but if we can fix them we should. We want it to work that way because it adds value - local variable types carry more signal than before because only the "meaningful" local types are present in source.

I don't understand the distinction you are making between omit_local_variable_types and omit_obvious_local_variable_types as it relates to inferred types. The heuristics are the entire purpose of the "obvious" lint - to avoid flagging types that are inferred by the compiler but potentially "nonobvious" to a human reader.

What I can understand, and agree with, is that it's impossible to define "obvious" in a way that all users will agree with it. The best we can hope for here is to have something that is acceptable to a majority of users. Otherwise our only recourse is to allow users to configure what they (individually or corporately) mean by "obvious", which is a path we don't want to go down.

I don't think even an individual user can define a set of hard rules for which local variable types are "obvious" and which aren't, in a similar way to how an individual can't define a set of hard rules for what makes a variable name readable.

natebosch commented 3 months ago

I tried running always_specify_types then omit_obvious_local_variable_types on pkg/front_end to see whether that proposal is close to what the CFE project is already doing.

The bulk of the difference is adding redundant type information in:

Generics on collection literals where the variable assignment was the type context (changes to var and moves the generics to the literal)
Generics on collection literals with a type context
Argument types for function literals

johnniwinther commented 3 months ago

To the CFE the types are not just a consequence of the data flow but instead invariants that guide and structure the code. omit_obvious_local_variable_types and omit_local_variable_types use types as a consequence of data - for instance saying that var o = foo as Bar; is an obvious Bar misses the fact that as Bar pertains only to the expression in question and not invariantly. This typing style implies that for now the type is so-and-so, but tomorrow it might be something else.

dart-lang / linter