dotnet / fsharp

The F# compiler, F# core library, F# language service, and F# tooling integration for Visual Studio
https://dotnet.microsoft.com/languages/fsharp
MIT License
3.92k stars 786 forks source link

Potential Set equality regression #14598

Closed nojaf closed 1 year ago

nojaf commented 1 year ago

Hello, I believe I may have a encountered a quite recent regression with Sets.

Repro steps

Create script:

#r "nuget: NUnit, 3.13.3"

open System.Collections.Generic
open NUnit.Framework

let a : int Set = set (HashSet(0))
Assert.AreEqual(Set.empty, a)

Expected behaviour

Running this with the 7.0.102 SDK works out fine. No exception is thrown.

Actual behaviour

When I run this with my latest local compiler, I got:

dotnet C:\Users\nojaf\Projects\fsharp\artifacts\bin\fsi\Debug\net7.0\fsi.dll .\set-problem.fsx

NUnit.Framework.AssertionException:   Expected is <Microsoft.FSharp.Collections.FSharpSet`1[System.IComparable]>, actual is <Microsoft.FSharp.Collections.FSharpSet`1[System.Int32]>

   at NUnit.Framework.Assert.ReportFailure(String message)
   at NUnit.Framework.Assert.ReportFailure(ConstraintResult result, String message, Object[] args)
   at NUnit.Framework.Assert.That[TActual](TActual actual, IResolveConstraint expression, String message, Object[] args)
   at NUnit.Framework.Assert.AreEqual(Object expected, Object actual)
   at <StartupCode$FSI_0002>.$FSI_0002.main@() in C:\Users\nojaf\Projects\set-problem.fsx:line 7
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodInvoker.Invoke(Object obj, IntPtr* args, BindingFlags invokeAttr)

Related information

@T-Gro maybe this is related to https://github.com/dotnet/fsharp/commit/a9c194aac6ef8951ad0526aad237207838a8351f but that is a wild guess at this point.

Provide any related information (optional):

nojaf commented 1 year ago

Oh and the workaround is to be explicit about the type in Set.empty. Assert.AreEqual(Set.empty<int>, a)

T-Gro commented 1 year ago

at NUnit.Framework.Assert.AreEqual(Object expected, Object actual)

This does not end up calling the existing equality from F#, you can see that the stack trace is fully "at Nunit". In your workaround, you actually force nunit to use the generic overload, which then calls the implemented equality.

The "object,object" overload is doing some magic. Also, doing any of the "collection asserts" works fine here, it is really only this one.

My PR introduced implementation of IStructuralEquatable (but it is not getting called), my guess is that nunit is trying to reveal that on the "object,object" code path and then does it's own thing.

T-Gro commented 1 year ago

Some more details. Initially I saw the decision to take the AreEqual(obj,obj) overload as a bug. But testing this in isolation, and reading more about it, I think it makes sense - F# tries to delay the specialization. So if there is an overload which restricts the type argument, and one that doesn't => it is preferred. Doing it other way around could break code below it by being too restricted already. Also, explicit annotation is always possible:

type MyWeirdness =
    static member AreSame(o1:obj,o2:obj) = $"Obj overload chosen, o2 is {o2.GetType().FullName}"
    static member AreSame<'T>(o1:'T,o2:'T) = "'T overload chosen"

let testThisOut() =
    let o1 = Set.empty<int>
    let o2 = Set.empty
    printfn "%s" (MyWeirdness.AreSame(o1,o2))

testThisOut() // Prints Obj overload chosen, o2 is Microsoft.FSharp.Collections.FSharpSet`1[[System.IComparable,

This also corresponds to explanation here: https://stackoverflow.com/a/14623807

I will now look and nunit's source and see how it trets the obj,obj case.

T-Gro commented 1 year ago

Ok found it.

nunit has a chain of comparers for the non-generic AreEqual(obj,obj) case. Each of them attempts some casting and if it succeeds, the chain stops.

With this PR, IStructuralEquatable was introduced and has preference over IEnumerable https://github.com/nunit/nunit/pull/3646/files#diff-d2c6f18471d8cdb14177803c4c966701818b5004bd832c8dd63050a181c4381b

When F# set did not have IStructuralEquatable interface, IEnumerable -based comparison kicked in and for that two empty sequences are equal even if they are of different T. Since it now has IStructuralEquatable , that is applied first.

We could alter the IStructuralEquatable implementation in F# to also allow success in case of different types, i.e. allow comparing e.g. Set and Set even. The interface works with a passed comparer, and it would be up to the comparer to compare objects even of unrelated types.

Checking the codebase though, this would not be very F# idiomatic. Also in other collections, we only allow successful Equals (=true) for collections of same types.

Only exception is array, where the following three indeed equal each other when compared using the IStructuralEquatable interface with System.Collections.Generic.EqualityComparer.Default provided.

      (Array.empty) :> System.Collections.IStructuralEquatable
      (Array.empty<string>) :> System.Collections.IStructuralEquatable
      (Array.empty<int>) :> System.Collections.IStructuralEquatable

Note that the same does not hold for Set,List,Map,.. or any other F#-generated implementation.

T-Gro commented 1 year ago

@dsyme , @vzarytovskii :

At least for Set<> I could bring in an implementation that would try to find equality (in the IStructuralEquatable codepath) even for collections of different types, like it works for Arrays. Would be defintely more difficult to do it in the compiler generated code for all DUs, incl. recursive ones like list).

The most obvious examples are like the one above - should empty collections of different typar be structurally equal to each other or not?

T-Gro commented 1 year ago

To have another use case besides empty collections (as in - should empty collections for different typarg be considered structurally equal or not), here is slightly less trivial scenario:

Should collections holding one value: 2.0 , 2.0f and 2.0m respectively be considered structurally equal.

let areEqualInNunit (o1,o2) =
    try
        Assert.AreEqual(o1,o2)
        true
    with _ -> false

printfn $"Float<>double in an array= {areEqualInNunit([|2.0f|], [|2.0|])}" 
printfn $"Float<>decimal in an array= {areEqualInNunit([|2.0f|], [|2.0m|])}" 
printfn $"Double<>decimal in an array= {areEqualInNunit([|2.0|], [|2.0m|])}" 

printfn $"Float<>double in a list = {areEqualInNunit([2.0f], [2.0])}" 
printfn $"Float<>decimal in a list = {areEqualInNunit([2.0f], [2.0m])}" 
printfn $"Double<>decimal in a list= {areEqualInNunit([2.0], [2.0m])}" 

Right now for Array they are, for List (and for Set after my latest change) they are not because F# comparison code typically takes a shortcut after typecheck, and returns false if the types being compared are not the same.

T-Gro commented 1 year ago

( this is the "typical F# code" scenario I am talking about. Affects both Fsharp.Core types where this is implemented by hand, as well as compiler generated code for DUs like list: When types do not match, false is returned. image

The opposing alternative is to do best-attempt equality, staying untyped (working with obj) and calling into the supplied comparer with untyped objects )

charlesroddie commented 1 year ago

Intuitively, I wouldn't expect the code to compile, since Set.empty: obj cannot infer the type parameter. If it must compile, I would expect it to be unlikely to evaluate to true, and more likely to evaluate to false, since the type parameter is unlikely to evaluate to int.

Edit: On testing this, Set.empty: obj does compile, which seems like a bug, and the type inside the obj is set<IComparable>. Given that it compiles, the behaviour seems conditionally correct since set<IComparable> is not identical to set<int>.

It would be nice to enable a warning when anything is inferred to be of type obj, or passed into a function as an obj. In the absence of this, we use a test that bans the string "Assert.AreEqual" in .fs files.

Smaug123 commented 1 year ago

It would be nice to enable a warning when anything is inferred to be of type obj

:sob: I've been trying to fix https://github.com/dotnet/fsharp/pull/13298 on-and-off for ages and ages, but I couldn't get the tests passing.

nojaf commented 1 year ago

Is this one still on the radar to get fixed in dotnet 8?

T-Gro commented 1 year ago

There is no clear/correct path for "fixing" the behavior apart from the warning of obj being infered, which is the real root cause. Now that the optional warning is in, it should trigger for the code snippet you posted (that is, Set.empty being infered to be Set.empty<IComparable> since Assert.AreEqual does have an overload working with objects).

Could you try that please?

(another idea would be a project-specified way of banning certain dangerous APIs, e.g. via a text file. In this case the untyped Assert.AreEqual(obj,obj). Right now we do not have an vehicle for that though)

Smaug123 commented 1 year ago

For what it's worth, at work we usually write strongly typed wrappers around this stuff (basically extending what FsUnitTyped does).

T-Gro commented 1 year ago

The FS3559 to warn about 'obj' has landed, but unfortunately it does not work for Set.empty. It does fork List or Array, there I tested it and then warning is properly raised.

However, when using Set<>, the infered type is not obj, but rather an "a' which supports comparison" ;; which is a different thing and not covered by the warning.

Sample code for testing:


module Assert =
    let AreEqual (expected : obj, actual : obj) = ()

let a = Set.empty
Assert.AreEqual(Set.empty, a)

Conceptually, I am on the side of @Smaug123 for not using the 'obj' APIs, that clearly mitigates the problem. A tooling-supported solution might also be a generic "banned APIs" analyzer where a project could specify a .txt file with banned-API per line.

I do not think that "a' with comparison" should deserve a special treatment in the compiler inference, also the product of all possible constraints might be too large.

What do you all think?

T-Gro commented 1 year ago

That being said, having the warning for List.empty in the same context is already a big win!

Smaug123 commented 1 year ago

Ah of course, I hadn't thought of that 🤦 yeah, maybe this is in the realm of analysers rather than the compiler.

T-Gro commented 1 year ago

Tracking in a new 'for analyzers` bag, closing this here.