dart-lang / sdk


Invalid `main` accepted by the front end #56122

Open osa1 opened 2 months ago

osa1 commented 2 months ago
void main<T>() {
  print(T);
  print('hi');
}

According to the spec section 19.6, for main to be executed it needs to have zero, one, or two arguments.

If a library exports main but is not a script, it is a compile-time error.

However the program above is accepted by the front end.

Today the VM runs it via a dynamic invocation, which passes dynamic for the missing type arguments: https://github.com/dart-lang/sdk/blob/6535a017aef4a2588470d36ffd5c977d9fa83339/sdk/lib/_internal/vm/lib/isolate_patch.dart#L297

Instead this program should be rejected by the front end, per spec.

mkustermann commented 2 months ago

/cc @johnniwinther

lrhn commented 2 months ago

TL;DR: It's not wrong. It's also not useful, and we should be free to stop supporting it.

The language spec says that a library is a script if it exports "a top-level function declaration named main that has either zero, one or two required arguments."

I remember us making a change to that, making it a compile-time error for any Dart library to declare a top-level main declaration that is not a valid script main ... and found it. Most recent spec of main is the NNBD spec:

Let L be a library that exports a declaration D named main. It is a compile-time error unless D is a non-getter function declaration. It is a compile-time error if D declares more than two required positional parameters, or if there are any required named parameters. It is a compile-time error if D declares at least one positional parameter, and the first positional parameter has a type which is not a supertype of List<String>.

Implementations are free to impose any additional restrictions on the signature of main.

This example satisfies those requirements, but we are free to add more (like "no type parameters").

It goes on to say that:

If main can be called with two positional arguments, it is invoked with the following two actual arguments: ...

If main cannot be called with two positional arguments, but it can be called with one positional argument, it is invoked with an object whose run-time type implements List<String> as the only argument.

If main cannot be called with one or two positional arguments, it is invoked with no arguments.
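
One way to picture these three cases is as a chain of function-type tests. The sketch below is only an illustration under one possible reading of "can be called with"; `invocationShape` and the sample functions are hypothetical names, not SDK code.

```dart
// Hypothetical helper (not SDK code): chooses among the three cases above
// by testing the run-time function type of the entry point.
String invocationShape(Function entry) {
  // A `Never` parameter type makes the test ask only about arity: every
  // non-generic function accepting that many positional arguments matches.
  if (entry is void Function(Never, Never)) return 'two arguments';
  if (entry is void Function(Never)) return 'one argument';
  return 'no arguments';
}

// Sample entry points, named so they do not clash with a real `main`.
void plain() {}
void withArgs(List<String> args) {}
void withMessage(List<String> args, Object? message) {}
void generic<T>() {}
void genericWithArgs<T>(List<String> args) {}
```

Under this reading, `invocationShape(withArgs)` yields 'one argument', while both `generic` and `genericWithArgs` fall through to 'no arguments', because a generic function type is not a subtype of any non-generic function type.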

This is pretty much lifted from the current dartLangSpec.tex, and it doesn't specify what "can be called with one argument" means. It can be any of:

The first two look equivalent, but aren't because of type parameters. The last two probably are equivalent if the type parameter can be instantiated to bounds. We should decide what we want here.

The spec then says:

In each of the above three cases, an implementation is free to provide additional arguments allowed by the signature of main (the above rules ensure that the corresponding parameters are optional). But the implementation must ensure that a dynamic error occurs if an actual argument does not have a run-time type which is a subtype of the declared type of the parameter.

That means it must be able to do a dynamic invocation if the second argument is passed, but it only ever is for scripts run with Isolate.spawnUri. Anything else can just check that the second parameter accepts null.

If we take the spec literally, and assume no further restrictions have been added, the only question is "what happens when you invoke void main<T>() {...} with no arguments", because all choices up to that point are covered precisely. The specification doesn't check whether the function can be called with no arguments; it just does it, assuming that it can, since the function must have zero, one or two required parameters and the prior checks should have ruled out one or two.

Unsurprisingly, given how old this behavior is, the implementation does a dynamic invocation, which means instantiating any type parameters to bounds.
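
The effect of a dynamic invocation on omitted type arguments can be observed directly. This is a minimal sketch with hypothetical function names; it assumes the behavior described above, where a dynamic call that supplies no type arguments instantiates them to their bounds.

```dart
// Hypothetical generic functions that simply report their type argument.
Type unbounded<T>() => T;
Type bounded<T extends num>() => T;

// Calls `f` dynamically with no type arguments, as a stand-in for how a
// generic entry point would be reached through a dynamic invocation.
Type callDynamically(Function f) => (f as dynamic)() as Type;
```

Per that description, `callDynamically(unbounded)` should yield `dynamic` (the default bound), and `callDynamically(bounded)` should yield `num`.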

So what happens if instantiating to bounds fails:

 class C<T extends C<T>> {}
 void main<T extends C<T>>() {
   print(T);
 }

Let's get back to "can be called with one argument". Can void main<T>(List<String> args) be "called with one argument"? Arguably it can, since main([]) works, and so does (main as dynamic)(["a"]).

The argument it's called with will be a List<String>, and if the parameter doesn't accept that, the spec doesn't say anything, so presumably a runtime error occurs. In reality, the compilers reject the script early, and the analyzer warns about it, so we do check that. (If we did not, that would be consistent with the call being a dynamic invocation, since there is no other way in the language to invoke a function with an argument of an invalid type. But we don't do that for the first argument.)

Behavior is more fun here:

So we can conclude that a void main<T>(List<String> args) {} doesn't work in any of our compilers, whereas void main<T>() {} does because it falls through to the "no checks made" case, then does instantiate to bounds (or not).

We probably don't want to require the invocation of main to be a dynamic invocation. Dynamic invocations are slow, and start-up should be fast. That suggests doing type checking (which is likely what we do), which again suggests not allowing type parameters.

I suggest we follow the spec, in particular:

Implementations are free to impose any additional restrictions on the signature of main.

and disallow generic main functions.

If the type of main is not a subtype of one of void Function(), void Function(List<String>) or Function(List<String>, Never), the library is not a script. If a script is compiled as a program entry point (not an entry point for Isolate.spawnUri), the last type can be narrowed to Function(List<String>, Null), because it will be invoked with null as second argument.
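
The proposed rule can be approximated with ordinary `is` tests on the function type. This is a sketch only: `looksLikeScriptMain` and the candidate functions are hypothetical, and an explicit `void` return type stands in for the omitted one in `Function(List<String>, Never)`.

```dart
// Hypothetical check (not SDK code): true when the candidate's type is a
// subtype of one of the three function types named in the proposal.
bool looksLikeScriptMain(Function candidate) =>
    candidate is void Function() ||
    candidate is void Function(List<String>) ||
    candidate is void Function(List<String>, Never);

// Candidate entry points, named so they do not clash with a real `main`.
void noArgs() {}
void argsOnly(List<String> args) {}
void argsAndMessage(List<String> args, Object? message) {}
void wrongFirstParameter(int code) {}
void genericMain<T>() {}
```

With these definitions, the first three candidates pass the check, while `wrongFirstParameter` and `genericMain` do not: the former cannot accept a List<String>, and a generic function type is not a subtype of any of the three non-generic types.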

We can update the specification too, to say "not generic", so that it's documented, but since the spec gives a blanket permission to restrict, we don't technically have to.

eernstg commented 2 months ago

I'd prefer to specify that being generic is an error for the main function that makes a script a script. It's a disservice to users of the language and tool chain if there are many tool specific compilation errors. So let's just avoid them when possible, especially if they are identical for all tools.

bwilkerson commented 2 months ago

I'd prefer to specify that being generic is an error for the main function that makes a script a script.

It sounds like there's an implication that there are main functions that aren't entry points (don't make a script a script). If that's not the case then everything below this can be ignored.

Compilers and the VM are typically handed the defining compilation unit of the library containing the main function that makes a script a script.

That isn't the case for the analyzer. It isn't clear to me how the analyzer would know which main functions are entry points and which aren't. Today the analyzer assumes that every main function is potentially an entry point and places the same requirements on them all.

So I'm hoping you mean that it should be an error for any top-level function named main to be generic.

eernstg commented 2 months ago

Right, it should be an error for any top-level function named main to be generic.