Open EBrown8534 opened 6 years ago
I think this might be an improvement, it seems reasonable; but I don't know the full intentions of why we do a direct cast rather than check.
I guess these code assume the F# usage only (somewhat reasonable for F# records): "If a type is defined in F#, null is not permitted as a regular value unless the AllowNullLiteral attribute is applied to the type. If a type is defined in some other .NET language, null is a possible value, and when you are interoperating with such types, your F# code might encounter null values." [1]
Also value types (struct record) will not require null checks as null is not a valid value for a value type (unless represented as Nullable<'T>
but that's a different story)
In the future could these non-nullable requirement also be encoded in a more interoperable way too eg.: as a modreq?
[1] https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2012/dd233197(v=vs.110)
@TIHan That may be the case, but it's also worth noting that the end result of this code is the cast being performed 3 times in the function, when one will suffice. Granted, casting is usually quick, but there's no reason to do it that often, especially when I assume this function is used a lot by the compiler.
Looking at the docs, it looks like even the Microsoft support docs do a direct cast, not sure why, but it still feels wrong:
The role of IComparable is to provide a method of comparing two objects of a particular type. This is necessary if you want to provide any ordering capability for your object. Think of IComparable as providing a default sort order for your objects. For example, if you have an array of objects of your type, and you call the Sort method on that array, IComparable provides the comparison of objects during the sort. When you implement the IComparable interface, you must implement the CompareTo method, as follows:
// Implement IComparable CompareTo method - provide default sort order. int IComparable.CompareTo(object obj) { car c=(car)obj; return String.Compare(this.make,c.make); }
Even their example has no type-safety check. I wonder if that's because IComparable
is typically only used by collections.
I still don't like it, but I like it less now that I see that the support docs use it.
Is there a chance this can be addressed? You don't even need to call from another language for this to be a problem:
type Animal =
abstract member Name: string
inherit System.IComparable
type Dog =
{ numWoofs: int }
interface Animal with
member this.Name = "dogbert"
type Cat =
{ numMeows: int }
interface Animal with
member this.Name = "catsy"
let dog = { numWoofs = 3 }
let cat = { numMeows = 5 }
let map = Map<Animal, string> [ dog, "woof"; cat, "bahhh" ]
Results in:
System.InvalidCastException: Unable to cast object of type 'Dog' to type 'Cat'.
Granted, the key of a map is usually something simple, but I don't think this is necessarily all that uncomon a use case.
It seems to me the above is a bug different from the one in the OP. The map constructor is
new(elements: seq<_>) =
let comparer = LanguagePrimitives.FastGenericComparer<'Key>
new Map<_, _>(comparer, MapTree.ofSeq comparer elements)
and 'Key
is Animal
. The code, however, throws in Cat.CompareTo(Object obj, IComparer comp)
.
So, my conclusion is that there is something wrong in FastGenericComparer
.
But I may be completely off. I have a problem debugging right now, but will try to further look into this. Unless somebody tells me this is the wrong track.
@Martin521 Is that not because Animal
's IComparable.CompareTo
implementation just delegates to the concrete type's implementation, i.e., when IComparable.CompareTo
is ultimately called, it will walk up the inheritance tree to the implementation on Cat
, whose body will throw if its obj
argument is not in fact another Cat
(and is instead a Dog
upcast to Animal
upcast to obj
).
I.e., a call logically analogous to this is made
((IComparable)(Animal)cat).CompareTo((object)(Animal)dog)
which then reaches an implementation like this
int CompareTo(object obj, IComparer comp)
{
// `obj` may actually be a boxed `Dog` or anything else that
// implements `Animal` when invoked inside of a `Map<Animal, _>`.
Cat cat = (Cat)obj;
…
}
One thing to consider in @reinux's example is what exactly the compiler would do instead.
Manually overriding the implementation of IComparable
helps to illustrate this point:
type Animal =
abstract member Name: string
inherit System.IComparable
[<CustomComparison; CustomEquality>]
type Dog =
{ numWoofs: int }
override this.Equals obj =
match obj with
| :? Dog as other -> this.numWoofs = other.numWoofs
| _ -> false
override this.GetHashCode () = this.numWoofs
interface System.IComparable with
member this.CompareTo obj =
match obj with
| :? Dog as other -> compare this.numWoofs other.numWoofs
| _ -> failwith "What should happen here?? Cat is hierarchically unrelated to this type, and it is not even defined yet, anyway."
interface Animal with
member this.Name = "dogbert"
[<CustomComparison; CustomEquality>]
type Cat =
{ numMeows: int }
override this.Equals obj =
match obj with
| :? Cat as other -> this.numMeows = other.numMeows
| _ -> false
override this.GetHashCode () = this.numMeows
interface System.IComparable with
member this.CompareTo obj =
match obj with
| :? Cat as other -> compare this.numMeows other.numMeows
| _ -> failwith "What should happen here??"
interface Animal with
member this.Name = "catsy"
let dog = { numWoofs = 3 }
let cat = { numMeows = 5 }
let map = Map<Animal, string> [ dog, "woof"; cat, "bahhh" ]
In both cases above, what is a meaningful comparison value for the compiler to return if the other object is of a completely unrelated type? 1
? -1
? Surely not 0
?
How would that work if Cat
always returned 1
when obj
was not a Cat
, and Dog
always returned 1
when obj
was not a Dog
?
That would mean that a Cat
value and a Dog
value would have different relative orderings depending on which one you asked.
So yes, when the compiler is implementing CompareTo
for a record type, it is definitely not considering unrelated types that happen to implement the same interfaces, and it doesn't seem like it would make much sense for it to do so.
That said, the compiler does still seem to be emitting more unbox.any
s than it needs to:
.method public final hidebysig virtual
instance int32 CompareTo (
object obj,
class [System.Runtime]System.Collections.IComparer comp
) cil managed
{
.custom instance void [System.Runtime]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
01 00 00 00
)
// Method begins at RVA 0x20e4
// Code size 56 (0x38)
.maxstack 5
.locals init (
[0] class _/Dog,
[1] int32,
[2] int32
)
IL_0000: ldarg.1
IL_0001: unbox.any _/Dog
IL_0006: stloc.0
IL_0007: ldarg.0
IL_0008: brfalse.s IL_002c
IL_000a: ldarg.1
IL_000b: unbox.any _/Dog
IL_0010: brfalse.s IL_002a
IL_0012: ldarg.0
IL_0013: ldfld int32 _/Dog::numWoofs@
IL_0018: stloc.1
IL_0019: ldloc.0
IL_001a: ldfld int32 _/Dog::numWoofs@
IL_001f: stloc.2
IL_0020: ldloc.1
IL_0021: ldloc.2
IL_0022: cgt
IL_0024: ldloc.1
IL_0025: ldloc.2
IL_0026: clt
IL_0028: sub
IL_0029: ret
IL_002a: ldc.i4.1
IL_002b: ret
IL_002c: ldarg.1
IL_002d: unbox.any _/Dog
IL_0032: brfalse.s IL_0036
IL_0034: ldc.i4.m1
IL_0035: ret
IL_0036: ldc.i4.0
IL_0037: ret
} // end of method Dog::CompareTo
[CompilerGenerated]
public sealed override int CompareTo(object obj, IComparer comp)
{
Dog dog = (Dog)obj;
if (this != null)
{
if ((Dog)obj != null)
{
int num = numWoofs@;
int num2 = dog.numWoofs@;
return ((num > num2) ? 1 : 0) - ((num < num2) ? 1 : 0);
}
return 1;
}
if ((Dog)obj != null)
{
return -1;
}
return 0;
}
@brianrourkeboll Thanks for looking into this.
@Martin521 Is that not because
Animal
'sIComparable.CompareTo
implementation just delegates to the concrete type's implementation, i.e., whenIComparable.CompareTo
is ultimately called, it will walk up the inheritance tree to the implementation onCat
, whose body will throw if itsobj
argument is not in fact anotherCat
(and is instead aDog
upcast toAnimal
upcast toobj
).
Yes, that's what's happening. But what should we conclude?
a) Do nothing. Accept that a runtime exception is thrown by correct F# code.
b) Disallow (at compile time) Map<'Key, 'Value> when 'Key is an interface.
c) Have FastGenericComparer
return some unique ordering, not relying on delegated CompareTo
.
I don't know if b) or c) are possible, but a) seems wrong to me.
a) Do nothing. Accept that a runtime exception is thrown by correct F# code.
Hmm, I'm not sure that the example is correct code, though.
For it to be meaningful, the types need to have some kind of specified relation to each other, e.g.,
type IAnimal = abstract Name : string
type Animal =
| Dog of Dog
| Cat of Cat
and Dog =
{ numWoofs : int }
interface IAnimal with
member _.Name = "dogbert"
and Cat =
{ numMeows : int }
interface IAnimal with
member _.Name = "catsy"
let dog = Dog { numWoofs = 3 }
let cat = Cat { numMeows = 5 }
// The key must be `Animal`, not `IAnimal`, because the types are not related to each other through `IAnimal`.
let map = Map<Animal, string> [ dog, "woof"; cat, "bahhh" ]
Otherwise, it's equivalent to doing something like this — which compiles, but which likewise fails at runtime, because each type's IComparable
implementation has no sensible way to give a meaningful result when presented with a value of a completely unrelated type:
open System
let s = Set<IComparable> [1; "1"; DateOnly (2024, 07, 30)]
// System.ArgumentException: Object must be of type String.
// at System.String.CompareTo(Object value)
// at Microsoft.FSharp.Collections.SetTreeModule.add[T](IComparer`1 comparer, T k, SetTree`1 t) in D:\a\_work\1\s\src\FSharp.Core\set.fs:line 139
// at Microsoft.FSharp.Collections.SetTreeModule.ofSeq[a](IComparer`1 comparer, IEnumerable`1 c) in D:\a\_work\1\s\src\FSharp.Core\set.fs:line 694
// at Microsoft.FSharp.Collections.FSharpSet`1..ctor(IEnumerable`1 elements) in D:\a\_work\1\s\src\FSharp.Core\set.fs:line 970
b) Disallow (at compile time) Map<'Key, 'Value> when 'Key is an interface.
I don't think that would make sense, because of course (using the same example) all of the objects implementing a given interface could be of the same type. It's just that unifying types at compile time to (non-generic) IComparable
is inherently unsafe (unlike unifying two values of compatible types to IComparable
at runtime, which is fine, and which is what happens if just Cat
is used as a map key, etc.).
open System
// Safe, and the compiler will ensure that it is.
let s = Set<int> [1; 2; 3]
// Happens to be safe, but the compiler cannot and will not help us if it isn't.
let s' = Set<IComparable> [1; 2; 3]
// Not safe... But the compiler has no choice.
// We're literally telling it: "trust us;
// the values may be different types, but they
// implement the same interface, and we swear
// that the interface implementation for each
// individual type knows how to handle this situation
// in a sensible way."
// (It does not.)
let s'' = Set<IComparable> [1; 2; "3"]
I guess there are a couple things that make this less obvious in the animal example:
IComparable
for Cat
and Dog
is invisible.IComparable
inside of Map
is invisible. The user is focused on the upcast to the Animal
interface, not on the fact that there will then be another upcast to IComparable
— even though the inherit IComparable
declaration literally implies that any Animal
may be treated directly as an IComparable
.Otherwise, it's equivalent to doing something like this — which compiles, but which likewise fails at runtime, because each type's
IComparable
implementation has no sensible way to give a meaningful result when presented with a value of a completely unrelated type
I came to that conclusion when I thought about implementing IComparable myself -- the only thing that stopped me was the length of the resulting code, as demonstrated by @brianrourkeboll in his comment.
My plan was to use System.Type.GUID
, which does implement IComparable.
Semantically, this doesn't make all that much sense since it's just a random ID, but I do think it would enable using data structures that require comparison. A maliciously-crafted assembly with a colliding GUID type might be cause for concern, so that might warrant some investigation first.
For my current use case, I ended up just using List.find
instead, since I don't anticipate O(n) being too much of an issue. I hope!
@reinux
Semantically, this doesn't make all that much sense since it's just a random ID, but I do think it would enable using data structures that require comparison.
Since it doesn't sound like you actually need meaningful comparison for your use-case, what about just using a hash/equality-based dictionary? The compiler-generated implementation of equality works fine for this.
open System.Linq
open System.Collections.Generic
type Animal =
abstract member Name: string
type Dog =
{ numWoofs: int }
interface Animal with
member this.Name = "dogbert"
type Cat =
{ numMeows: int }
interface Animal with
member this.Name = "catsy"
let dog = { numWoofs = 3 }
let cat = { numMeows = 5 }
let map : Dictionary<Animal, string> = [ struct (dog :> Animal, "woof"); struct (cat :> Animal, "bahhh") ].ToDictionary ()
Or, using a little library I made:
#r "nuget: FSharp.Collections.Builders"
open FSharp.Collections.Builders
type Animal =
abstract member Name: string
type Dog =
{ numWoofs: int }
interface Animal with
member this.Name = "dogbert"
type Cat =
{ numMeows: int }
interface Animal with
member this.Name = "catsy"
let dog = { numWoofs = 3 }
let cat = { numMeows = 5 }
let map = dictionary<Animal, _> { dog, "woof"; cat, "bahhh" }
I am still not convinced.
Yes, looking from inside the system, one can explain why cats and dogs cannot be compared.
But as a user, I think it is reasonable to expect that Set<Animal>
(where Animal
fulfills the required comparison constraint) is what is says: a set of Animals, and you should be able to put any animal into it.
I therefore still wonder if option c) of my comment above is such a bad idea. (Here is a very naive but working version.)
(Here is a very naive but working version.)
Interesting. Some comments about that specific approach:
I appreciate the outside-the-box thinking, but I think that the current behavior is the only correct behavior, in spite of the unfortunate unsafety it enables. Once the comparison is no longer based purely on the type's structural components, it is no longer really structural comparison.
compare ``🍎`` ``🍊``
Ultimately, I believe that comparisons like this must fail, because no meaningful comparison exists, unless the developer implements IComparable
explicitly and either:
To put it in more proverbial terms — is there really anything meaningful that the compiler could do to make the following code not throw?[^2]
open System
type IFruit = inherit IComparable
type Apple = Apple with interface IFruit
type Orange = Orange with interface IFruit
compare<IFruit> Apple Orange
> If a code file does not begin with a top-level module declaration or a namespace declaration, the whole contents of the file, including any local modules, becomes part of an implicitly created top-level module that has the same name as the file, without the extension, with the first letter converted to uppercase.
[^2]: Contrast that with the following, which will run and produce false
. Just because we can say that two objects are not equal does not mean we can give them a meaningful relative ordering.
box Apple = box Orange
Ok, I think I have to give in regarding c). Thanks for your persistance and all the details.
Still, the current behavior (a runtime exception when putting apples and oranges into a Set
comparison
constraint stronger (requiring an implemented IComparable at the level of the type argument). (I think this would have been a good solution in F# 1.0)@Martin521 Hmm, it's probably theoretically possible to write an analyzer to emit such a warning in at least some cases by inspecting the typed tree.
It may not be easy, though, especially if you want to avoid false positives.
For example, you'd probably want to allow this, since it's perfectly valid:
let xs : Set<IComparable> = set [1; 2; 3]
Whereas you'd want to warn on this:
let xs : Set<IComparable> = set [1; 2; "3"]
But what about these?
let f (xs : Set<IComparable>) = xs |> Set.add "3"
let xs : Set<IComparable> = set [1; 2]
f xs // This throws.
f (Set<IComparable> [1; 2]) // This throws.
f (Set<IComparable> ["1"; "2"]) // This is fine.
I guess it would probably be easier to just write an analyzer that always warns whenever a type parameter with the comparison
constraint is instantiated to an interface type, and people could choose to enable it if they wanted the warning.
I guess it would probably be easier to just write an analyzer that always warns whenever a type parameter with the
comparison
constraint is instantiated to an interface type, and people could choose to enable it if they wanted the warning.
Yes, that's what I had in mind. Question is, of course, if the effort is well spent on this rare case. Anyway, I might tackle it at some point just to learn about analyzers.
NOTE TO READERS: all of the above discussion is about this comment. The original issue is still open.
I might be missing something, but I'm not sure why something like c) wouldn't be the best approach.
I think apples and oranges can be meaningfully compared -- just that all apples are infinitely distant from all oranges in any given dimension, which is a notion that's well-captured by comparing any comparable, unique identifier of the types themselves prior to comparing the actual values.
For my use case, a list was adequate. For others, ImmutableDictionary may work just as well. But for crank-the-world type applications, e.g. Elmish models, you really want to be able to do incremental updates, which means trees, which means comparisons.
With that approach, changing any part of a type's fully-qualified name could affect its runtime behavior, whether renaming the type or cutting some code out of one module and pasting it into another.
This is why I was considering System.Type.Guid
, actually. Although, to be fair, I'm not sure what the lifetime of the GUID is either.
At the same time, I can't think of a practical consequence if it does turn out to be ephemeral. If it's semantically nonsensical to compare Dogs to Cats, then there almost certainly isn't any code that would depend on any persistence either.
Not that it's directly pertinent, but record comparisons in F#, if I understand correctly, are already nominal before they're structural. You can't compare, say, a value of type A = { x: int }
to another of type B = { x: int }
. So I'm not sure there's any damage done by deviating from theory somewhat. Compile-time warnings/errors would be one thing, but there isn't really anything theoretically sound about throwing an exception anyway.
When generating the
CompareTo(object, IComparer)
for a record type, there is no type-safety for casting. Instead, theCompareTo(object, IComparer)
simply does a direct-cast to the record type. This means callers from C#/VB.NET toCompareTo(object, IComparer)
will get anInvalidCastException
if they send an object that is not the correct type.Repro steps
F# Code:
C# Code:
The generated ILASM for the
TestType::CompareTo
:The obvious problem here is the unconditional unboxing of the argument to the
FsLibrary.TestType
. There is no attempt made to test the type for compatibility.Additionally, this unboxing is always done, is done several times, and is not saved until the branch for which it is needed (IL_001e).
This can be more easily seen on
sharplab.io
:https://sharplab.io/#v2:DYLgZgzgPgLgngBwKYAIAqSIzY1BeAWACgVSUBvFAOQEMBbVEFLAJwEsA7AcxQF9igA=
The generated C# for
CompareTo(object, IComparer)
is:The culprit here can be easily identified as
TestType testType = (TestType)obj
, as well as the location of that cast, the fact that we castobj
toTestType
in two other locations (rather than using the previously-casted version).Expected behavior
The CompareTo(object, IComparer) should be generated with a type-check, then if the type is compatible it should be casted. It should also do the cast only when necessary, rather than as the first step.
Ideally, a more-proper C# version would look like:
Actual behavior
The generated CompareTo(object, IComparer) is not properly compatible with other .NET languages.
Known workarounds
None. The C# version must implement it's own comparison.
Related information