dotnet / fsharp

The F# compiler, F# core library, F# language service, and F# tooling integration for Visual Studio
https://dotnet.microsoft.com/languages/fsharp
MIT License
3.89k stars 783 forks source link

Parenthesized/double unit literal `()` → `(())` required in certain scenarios #16254

Open brianrourkeboll opened 11 months ago

brianrourkeboll commented 11 months ago

A parenthesized unit literal ()(()) is required as an argument pattern when overriding/implementing a generic method where unit is the generic type argument.

Repro steps

Provide the steps required to reproduce the problem:

  1. Define an abstract method whose single argument is a generic type (abstract M : 'T -> …).
  2. When implementing or overriding the method when the generic type parameter is unit, the argument pattern must be (()); using only () yields error FS0768: The member 'M' does not accept the correct number of arguments. 1 argument(s) are expected, but 0 were given.

Single () usually means unit

type C = abstract M : unit -> unit
let _ = { new C with override _.M () = () }

Double (()) is required when overriding/implementing a generic method where unit is the generic type argument

Double (()) compiles:

type C<'T> = abstract M : 'T -> unit
let _ = { new C<unit> with override _.M (()) = () }

but not ():

let _ = { new C<unit> with override _.M () = () }
------------------------------------^^^^^^

stdin(3,37): error FS0768: The member 'M' does not accept the correct number of arguments. 1 argument(s) are expected, but 0 were given. The required signature is 'C.M: 'T -> unit'.

even though this is fine:

type C<'T> = abstract M : 'T * 'T -> unit
let c = { new C<unit> with override _.M ((), ()) = () }

as is

type C<'T> = abstract M : 'T -> 'T -> unit
let c = { new C<unit> with override _.M () () = () }

Expected behavior

() should mean unit everywhere.

Actual behavior

(()) is required to represent unit in certain scenarios.

Known workarounds

N/A.

Related information

.NET SDK 8.0.100-rc.2.23502.2 (and probably going far back—I remember running into this in years past).

brianrourkeboll commented 11 months ago

@psfinaki Hmm, it looks like by "linking" this to #16262, this issue will be closed when that PR is merged:

image

image

But (as you know) that PR doesn't actually address the inconsistency in the compiler; it just works around it for the unnecessary parentheses analyzer and code fix. Same with #16257.

0101 commented 11 months ago

Since () also means calling a method with no arguments, this could be complicated to fix. Since this is quite the corner case maybe it's enough to improve the error message.

brianrourkeboll commented 11 months ago

Hmm, it looks like this is where it happens:

https://github.com/dotnet/fsharp/blob/75cd8c7a3925cfb15cc03afcac243903fef20800/src/Compiler/Checking/CheckExpressions.fs#L6752-L6794

specifically

https://github.com/dotnet/fsharp/blob/75cd8c7a3925cfb15cc03afcac243903fef20800/src/Compiler/Checking/CheckExpressions.fs#L6760

and

https://github.com/dotnet/fsharp/blob/75cd8c7a3925cfb15cc03afcac243903fef20800/src/Compiler/Checking/CheckExpressions.fs#L6790-L6791

My guess is that some kind of tweak could be made to the filter above or in here (the callsite of the above) to account for unit:

https://github.com/dotnet/fsharp/blob/75cd8c7a3925cfb15cc03afcac243903fef20800/src/Compiler/Checking/CheckExpressions.fs#L6912-L6927

Although we'd obviously need to make sure any change there didn't break anything else.

vzarytovskii commented 11 months ago

@brianrourkeboll What's the proposed solution?

Specifically, what happens in the following scenario if we decide that () is equivalent to () in case of generic method.

type C<'T> =
    abstract M : unit -> unit
    abstract M : 'T -> unit
let _ =
  { 
    new C<unit> with
      override _.M () = ()
      override _.M (()) = ()
  }
brianrourkeboll commented 11 months ago

@brianrourkeboll What's the proposed solution?

Specifically, what happens in the following scenario if we decide that () is equivalent to () in case of generic method.

type C<'T> =
    abstract M : unit -> unit
    abstract M : 'T -> unit
let _ =
  { 
    new C<unit> with
      override _.M () = ()
      override _.M (()) = ()
  }

@vzarytovskii That's a good callout, although the same problem already exists for any other type, as in

> type C<'T> =
-     abstract M : unit -> unit
-     abstract M : 'T -> unit
- let _ =
-   {
-     new C<int> with
-       override _.M (_ : int) = ()
-       override _.M (_ : int) = ()
-   };;

    {
  --^

stdin(14,3): error FS0359: More than one override implements 'M: int -> unit'

The only way to disambiguate between the two overloads in such a case would be to use named arguments, but that applies regardless of argument type.

vzarytovskii commented 11 months ago

@brianrourkeboll What's the proposed solution? Specifically, what happens in the following scenario if we decide that () is equivalent to () in case of generic method.

type C<'T> =
    abstract M : unit -> unit
    abstract M : 'T -> unit
let _ =
  { 
    new C<unit> with
      override _.M () = ()
      override _.M (()) = ()
  }

@vzarytovskii That's a good callout, although the same problem already exists for any other type, as in

> type C<'T> =
-     abstract M : unit -> unit
-     abstract M : 'T -> unit
- let _ =
-   {
-     new C<int> with
-       override _.M (_ : int) = ()
-       override _.M (_ : int) = ()
-   };;

    {
  --^

stdin(14,3): error FS0359: More than one override implements 'M: int -> unit'

The only way to disambiguate between the two overloads in such a case would be to use named arguments, but that applies regardless of argument type.

I guess the only thing I'm concerned about is that if we change how method resolution works, we need to make sure that something which was compiling before (and choosing a correct overload) still compiles and works as before.

brianrourkeboll commented 11 months ago

@brianrourkeboll What's the proposed solution? Specifically, what happens in the following scenario if we decide that () is equivalent to () in case of generic method.

type C<'T> =
    abstract M : unit -> unit
    abstract M : 'T -> unit
let _ =
  { 
    new C<unit> with
      override _.M () = ()
      override _.M (()) = ()
  }

… I guess the only thing I'm concerned about is that if we change how method resolution works, we need to make sure that something which was compiling before (and choosing a correct overload) still compiles and works as before.

Hmm, you are right that your example currently compiles and might not if this oddity were "fixed," depending on the fix. It is a pretty unique scenario, though, since the () versus (()) disambiguation technique only works for this specific generic–non-generic overload pair.

It looks like the mismatch happens here:

image

Where M in bindKeys has arity 0, but M in virtNameAndArityPairs has arity 1.

The 0 appears to come from here:

image

Specifically:

I think in theory we could say that 1 ≈ 0 when the binding's argument type was instantiated to unit (via the type instantiation in implTy), although it would take me a bit to figure out the best way to tie that info together from the values in context.

We could even keep the special example you gave compiling by searching for the existence of a non-generic method with the same signature and keeping the existing behavior for that... But that would admittedly be pretty ugly.

brianrourkeboll commented 11 months ago

Basically the existing code that does (or does not do) "single unit elimination" just doesn't handle the generic scenario:

https://github.com/dotnet/fsharp/blob/9e357ca5f82f5e800b259cbda7d91507b41d2e9a/src/Compiler/SyntaxTree/SyntaxTreeOps.fs#L352-L354

() alone is actually parsed as SynPat.Paren (SynPat.Const (SynConst.Unit, _), _) in method argument position, so (()) (or ((()))…) is not converted to SynSimplePats.SimplePats ([], [], m) there, which means that it is not eliminated as an argument here:

https://github.com/dotnet/fsharp/blob/9e357ca5f82f5e800b259cbda7d91507b41d2e9a/src/Compiler/SyntaxTree/SyntaxTreeOps.fs#L627-L635

That's why the double (()) does currently work here for abstract M : 'T -> unit where 'T is instantiated to unit, because the following code is not aware of the instantiation of 'T to unit, and unit is never eliminated from the slot:

https://github.com/dotnet/fsharp/blob/9e357ca5f82f5e800b259cbda7d91507b41d2e9a/src/Compiler/TypedTree/TypedTreeOps.fs#L2689-L2701

Indeed, an override of abstract M : 'T -> unit, when 'T is instantiated to unit, is ultimately compiled as a method taking a unit argument, not as a parameterless method:

internal sealed class o@4 : C<Unit>
{
    void C<Unit>.M(Unit _arg1)
    {
    }
}

The reason single () in the implementation does not work to fill the abstract M : 'T -> unit slot because the unit argument of the implementation is eliminated earlier in the code. That means that the arity of the slot and of the implementation don't line up here:

https://github.com/dotnet/fsharp/blob/9e357ca5f82f5e800b259cbda7d91507b41d2e9a/src/Compiler/Checking/CheckExpressions.fs#L6913-L6929

It is perhaps noteworthy that the compiler actually avoids this problem altogether when the return type is generic and you attempt to instantiate 'T to unit—the following doesn't compile:

> type C<'T> = abstract member M : 'T -> 'T
- let _ = { new C<unit> with member _.M (()) = () }
- ;;

  let _ = { new C<unit> with member _.M (()) = () }
  ------------------------------------^

stdin(2,37): error FS0017: The member 'M: unit -> unit' is specialized with 'unit' but 'unit' can't be used as return type of an abstract method parameterized on return type.

Anyway, I think in theory this could be worked around, but I'm not personally sure it's worth the work or added complexity in the compiler, unless we planned to take a fresh approach to unit-elimination where it was handled more consistently throughout.