dart-lang / language

Design of the Dart language
Other
2.67k stars 205 forks source link

Statement metadata #1652

Open lrhn opened 3 years ago

lrhn commented 3 years ago

Dart currently allows metadata annotations (@foo or @Foo(args), or even @Foo<types>(args) soon) on declarations. That makes sense if the primary way to access metadata is through dart:mirrors, and you can't mirror something you can't name. However, annotations are also used by source-processing tools like the analyzer and some code generators. They can find annotations anywhere in the source.

Assume that we would want to allow metadata annotations on statements in general.

Annotations have no semantic effect on the Dart program itself, so it's all about sending signals to tools. Say, to tell the analyzer to disable a lint.

Syntax

The syntax for annotations is currently @id, @SomeClass(args) or (soon) @SomeClass<typeArgs>(args).

The things you want to express for statements are not different from what you want to express for declarations, so it would be nice if all these annotations would be allowed on statements. That's a problem, though, because it introduces grammar ambiguities if we just put the annotation before the statement, because an expression-statement allows almost any expression to follow it. Example:

@foo (a + b) [c];

this can be parsed as the annotation constructor invocation @foo(a+b) on the expression statement [c]; or the annotation identifier @foo on the expression statement (a+b)[c]; Adding more parentheses to the expression is not necessarily a solution which scales (although if we disallow annotations on the empty statement, ;, then @foo((a + b)[c]); can only be parsed in one way).

I'd recommend introducing a separator. Just like statement labels need a :, we could say that statement annotations need a :, and you have to write one of:

@foo:(a + b)[c];
@foo(a + b):[c];

That also suggests a formatting style where the annotation, like the label, is placed above the statement:

@foo(a+b):
[c];

Formatting issues are handled by treating statement annotations exactly like we treat labels.

That might be confusing when compared to existing annotations on local functions (which we allow), which don't need the trailing ::

@foo
int bar(int x) => x;

We may want to allow that annotation to have a : too, when it occurs inside a member body.

This suggests allowing the <metadata-with-colon> production where we could otherwise add a label.

Expression annotations

With a delimiter like @...: for annotations, we could potentially also allow annotations on expressions. It's more complicated because expressions have precedence, so @foo:a + b could mean (@foo: a) + b or (@foo:(a+b)). If we put annotations on the expression itself, so it can contain any expression, then you can always parenthesize the scope you want, (@foo: any+expression). It's an ambiguity with statement annotations for @foo:expr;, but we can just always consider that a statement annotation and make you write (@foo:expr); if you want it on the expression. In practice, you probably always want to parenthesize, so the maybe only allow it just inside parentheses. Change:

<primary> ::= ... 
    | `(' expression `)'
...
primary ::= ...
    | `(' <metadata-with-colon>* <expression> `')

Summary

Allow a metadata production followed by a colon everywhere we allow a statement label. Possibly, allow any number of metadata productions followed by colons (or just one colon?) inside at the start of a parenthesized expression.

I believe the grammar and parsing should be manageable.

lrhn commented 3 years ago

For context, consider lints where you want to mark something as OK, say @unawaited to mark an unawaited_futures lint as OK, instead of using the unawaited function from package:pedantic. Marking a statement, or even an expression, with an annotation is a way to do that, one which doesn't also mean something in the language semantics.

I have a problem with parentheses if we don't do something, like add the required :, because annotations can end in parentheses, and expressions can start with them. We definitely want the annotations before the statement. so @foo (a + b) [c]; is an ambiguous annotated expression statement.

If you want it to mean @foo on the statement (a+b)[c];, you'd use parentheses to force the precedence. But it doesn't get much better by writing it as @foo ((a + b)[c]);, because the parentheses you add to decide the precedence just introduce a new ambiguity (because ; is a valid empty statement).

Another option is to parenthesize the annotation, (@foo) (a+b)[c];. It can work, but looks weird, and if we do want to annotate expressions (which might be necessary for unawaited), it gets complicated. Is (@foo(a))-b; the annotation @foo(a) on the statement -b; or is the @foo on the expression a.

That's why my proposed solution is to use (`@' <metadatum> `:')* and ``(' @' <metadataum>:')* `)' `` as the places where you can introduce metadata on statements or experssions, the former in the same places you can introduce labels, the latter in the "parenthesized expression" production.

lrhn commented 3 years ago

The problem is that (@foo (a+b) [c]) is still ambiguous. We can disambiguate it in two ways:

Whichever we choose as the default interpretation, we must then consider what you must add to get the other interpretation.

For @foo(a+b) and [c], we would write (@foo ((a+b)[c])), and as you say, if there must be some expression, the @foo must be a stand-alone annotation.

For @foo and (a+b)[c], I don't know what I'd do to associate differently. We already didn't combine it with the parentheses before. Adding more parentheses shouldn't matter.

In either case, makes very similar syntax mean something different based on whether there is a far-away expression or not. That's not good for readability. It's bad for readability if you cannot quickly determine how to read the code, and there is no pattern of punctuation here which is unambiguous.

That's why I consider @....: as such a pattern. If you see @ in statement context, look forward for the colon (which is always after a ) if it's not right after the identifier) to see where it stops.

@foo(a + b + c + d ~e): [c];
@bar:(a + b + c + d ~e)[c];

Much easier to distinguish.

leafpetersen commented 3 years ago

cc @jacob314 @davidmorgan

lrhn commented 3 years ago

The parser does not know the types. We only figure out what types are after having parsed the syntax of the program.

jacob314 commented 3 years ago

What if statement level annotations are not allowed to contain parens? This doesn't limit the expressiveness as you can always define a const on a separate line that includes the full expression with parens. This would ensure that statement level annotations are easy to read, parse, and format without the ambiguity for users and parsers. All the use cases I can come up with for statement level annotations do not benefit from annotations with parens. I would be interested if anyone can think of a case where statement level annotations would benefit from parens. If we do want to support statement level annotations with parens, we could require the : syntax only for those statement level annotations giving the best of both worlds at the cost of code that is a bit harder to format and read.

Examples:


@foo (a + b)[c]; // parses as `@foo`

// To write:
@Foo(a+b) d[e]

// You would have to write

const fooAb = @Foo(a+b);
@fooAb d[e];

// or:
@Foo(a+b) : d[e]
bwilkerson commented 3 years ago

The @Deprecated('what to use instead') annotation is probably the best known example of an annotation with arguments that gets applied to methods, but there are others.

I'm not aware of any exiting annotations with arguments that would apply to statements, but that might be because most of our existing annotations are targeted at helping clients of an API use the API correctly, so I'm not sure we have any annotations at the moment that would apply at the statement level.

That said, it seems to me that using an argument to explain why the annotation is there would be a reasonable use case even for statement level annotations (such as why it was ok to not await a future in this context). Although I admit that is seems like a comment would be equally good in all such cases if we choose not to allow the constructor invocation form of annotations on statements.

rakudrama commented 3 years ago

Another common annotation-with-arguments is @pragma.

One might reasonably expect the compiler annotations @pragma('vm:never-inline'), @pragma('dart2js:noInline') etc., to be extended to apply to call expressions.

Consider if we want to encourage inlining of bar in foo(bar(baz())). With an expression annotation

   foo(@pragma('vm:prefer-inline'):bar(baz()))

it is not crystal clear that the annotation applies only to bar and not baz. The scope of an annotation should have a syntactic definition. If the scope of the annotation is left up to a tool, similar tools may implement different scopes, for example

   foo(@pragma('vm:prefer-inline') @pragma('dart2js:tryInline'):bar(baz()))

might inline bar on the VM and bar and baz on dart2js, or vice-versa.

I'm not sure we really need annotations on expressions, though, as most cases of an expression can be rewritten to use an intermediate variable declaration with an initializer expression, and the annotation could be applied to the variable, meaning the corresponding initializer expression. The step-wise breakdown using a current valid annotation context is clearer as there is only one call to inline:

var z = baz();
@pragma('vm:prefer-inline')
var b = bar(z);
... foo(b) ...

If the syntax @...: is adopted for statements and/or expressions, please consider allowing it in the current declaration contexts. Otherwise it becomes another mildly unpleasant syntactic quirk, a colon being required sometimes but an error other times. (I repeatedly find myself typing a semicolon after annotations-with-arguments on methods, perhaps this could be allowed too?)

Levi-Lesches commented 3 years ago

Can we expand a bit on why we'd want annotations on statements in the first place? Especially when comments are available, I'm not sure what annotations would do. If we want tooling to have access to them, we already have special comments like // ignore: that are pretty standardized.

Having a concrete example might help with the reasoning and syntax here.

munificent commented 3 years ago

Can we expand a bit on why we'd want annotations on statements in the first place?

The initial impetus, mentioned briefly here is to take the unawaited() function in package:pedantic and move it to something more fully integrated into the language. That could simply mean moving that function to dart:core or dart:async, but we also discussed making it an annotation instead of a function:

main() {
  @unawaited
  doSomethingAsyncButIgnoreTheReturnedFuture();
}

Related: https://github.com/dart-lang/language/issues/1661

Levi-Lesches commented 3 years ago

I'm still not sold on why these are better than regular comments. I always use an // ignore in favor of unawaited:

main() async {
  // ignore: unawaited_futures
  doSomethingAsyncButIgnoreTheReturnedFuture();
}

More generally than my own preferences, it seems like the use of annotations here would be to explain what we're doing and maybe also why we're doing it:

consider lints where you want to mark something as OK, say @unawaited to mark an unawaited_futures lint as OK

Isn't this exactly what comments are for? I get the feeling we're trying to re-invent comments as "metadata that can go before anything". If tooling is a concern, we can use comments in a standard form, such as the // ignore above.

main() {
  int index = 5000;  // this number is special to our company because...
  myList [index];  // don't worry, this is a REALLY big list
}

The example above might otherwise be two lints, avoid_magic_numbers and avoid_large_indices, that would need annotations to say it's okay. But with comments, you avoid the problem entirely while still conveying your intentions to the reader.

rrousselGit commented 3 years ago

I agree with Levi.

I really dislike the "unawaited" function. It both decreases readability by changing how the line formats, is different from how we typically disable lints, and still require a comment to explain why "unawaited is used"

// ignore: unawaited_future, explanation why we don't await the future
fetch()

is much better than:

// explanation why we don't await the future
unawaited(
  fetch(),
);
om-ha commented 3 years ago

I think unawaited_futures linter rule is very important as it saves developers from forgetting an await. However the ability to selectively ignore this rule is important for a fire-and-forget scenario.

As per the lint docs, there are two options to ignore this rule in a fire-and-forget situation where you know what you're doing:

  1. unawaited from pedantic package. Don't know why this is still recommended on the lint docs, because pedantic package is deprecated.
  2. // ignore or // ignore_for_file

I prefer going with 2. the ignore-comment method as @Levi-Lesches and @rrousselGit suggested. From what I read in discussions i linked below, having a dedicated keyword for this is not easy to achieve with minimal disadvantages.

Here are some resources for this topic:

linter
    `flutter_lints`
linter rule
    `unawaited_futures`
    https://dart-lang.github.io/linter/lints/unawaited_futures.html
discussions
    https://github.com/dart-lang/sdk/issues/46218
    https://github.com/dart-lang/language/issues/1652
    https://github.com/passsy/dart-lint/pull/33
lrhn commented 3 years ago

Thanks for reminding us. There is now an unawaited function in dart:async that you should use instead of package:pedantic.

om-ha commented 3 years ago

@Irhn unawaited from dart:async does the job well, thanks!

Perhaps having this as an auto-fix in VS Code and others is good, just like suggesting adding await or comment-ignore`

Screen Shot 2021-09-23 at 12 06 58 PM