dotnet / csharplang

The official repo for the design of the C# programming language

Champion "Support for == and != on tuple types" (15.7) #190

Open gafter opened 7 years ago

gafter commented 7 years ago

Summary

Support == and != on tuples. For example (x, y) == (1, 2) (which would be equivalent to x == 1 && y == 2).

See also https://github.com/dotnet/roslyn/issues/13155

LDM history:

Started prototype at tuple-equality

Thaina commented 7 years ago

Should just support extension operator on IEquatable<T>

gafter commented 7 years ago

@Thaina No, == and .Equals are different things. Try double.NaN for example. Also, the == operator would convert the values in a tuple, elementwise, to a common type for comparison purposes, while .Equals requires that they be the same type.
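A minimal sketch of the two differences mentioned above (NaN handling and operand conversion), for readers who want to try them. The class and method names are hypothetical, chosen only for illustration.

```csharp
using System;

static class EqualityDemo
{
    // Operator ==: for doubles this is the IEEE 754 comparison.
    public static bool ByOperator(double x, double y) => x == y;

    // Double.Equals(double): .NET equality, which treats NaN as equal to NaN.
    public static bool ByEquals(double x, double y) => x.Equals(y);

    // == first converts both operands to a common type (int to long here).
    public static bool MixedOperator(int i, long j) => i == j;

    // Binds to int.Equals(object); the argument is a boxed long, so the result is false.
    public static bool MixedEquals(int i, long j) => i.Equals(j);

    static void Main()
    {
        Console.WriteLine(ByOperator(double.NaN, double.NaN)); // False
        Console.WriteLine(ByEquals(double.NaN, double.NaN));   // True
        Console.WriteLine(MixedOperator(1, 1L));               // True
        Console.WriteLine(MixedEquals(1, 1L));                 // False
    }
}
```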

DavidArno commented 7 years ago

Retracting my up-vote for this proposal. As I read "No, == and .Equals are different things", I do not just witness the principle of least astonishment being violated, it's being smashed into tiny pieces and ground into the dirt.

Yes, for historical reasons, we have ended up with multiple ways of testing for equality and this is one of those "we are stuck with it" things that can't be undone. But perpetuating this messy situation by having new types like ValueTuple<...> implement different behaviour for == and .Equals is pure madness IMO.

We would be better off with the team just leaving this topic well alone and implementing the "extension everything" proposal instead. Then 3rd party libraries can implement == and .Equals as the same thing and the great disturbance in the force will fade away...

ig-sinicyn commented 7 years ago

@DavidArno

We would be better off with the team just leaving this topic well alone and implementing the "extension everything" proposal instead. Then 3rd party libraries can implement == and .Equals as the same thing and the great disturbance in the force will fade away...

So, double.NaN == double.NaN should be false but (double.NaN, double.NaN) == (double.NaN, double.NaN) will be true, false, NaN or a tiny pink elephant depending on a developer's preferences? Pure evil 👍

HaloFour commented 7 years ago

Tuples being just a loosely bound sequence of values I think that it makes more sense to be able to compare those values irrespective of the container.

(int, int) tuple1 = ...;
(int, int) tuple2 = ...;

if (tuple1 == tuple2) { ... }
// the same as
if (tuple1.Item1 == tuple2.Item1 && tuple1.Item2 == tuple2.Item2) { ... }

DavidArno commented 7 years ago

@ig-sinicyn,

In the following code, result should be true:

var x = double.NaN;
var y = double.NaN;
var result = (x == y) == x.Equals(y);

But it's false as double implements == and .Equals differently.

You appear to be proposing that, just because the CLR and language teams couldn't agree on whether NaN is the same as NaN, tuple equality should perpetuate this nonsense. Having (double.NaN, double.NaN) == (double.NaN, double.NaN) evaluate to true would be a tiny price to pay for equality sanity in tuples.

DavidArno commented 7 years ago

@HaloFour,

struct S { }

(S, S) tuple1 = ...;
(S, S) tuple2 = ...;

if (tuple1.Item1 == tuple2.Item1 && 
    tuple1.Item2 == tuple2.Item2) // oh dear: compiler error

ValueTuple<T1, T2> cannot implement == as above, because T1 and T2 are not constrained to be classes.

HaloFour commented 7 years ago

@DavidArno

Yes, and I have no problem with that. I don't think that the fact that the two values are contained within a tuple should matter.

orthoxerox commented 7 years ago

@DavidArno Did you know that comparison operators and CompareTo produce different results on doubles as well? That's not something that .Net invented either. Comparison and total order operations are defined differently on doubles by IEEE. Equals matches the behavior of CompareTo.

DavidArno commented 7 years ago

@orthoxerox, @HaloFour, @ig-sinicyn,

So you have all done a lot of "no, no, it can't be .Equals()", @gafter has ruled out the IEquatable<T>/EqualityComparer route, and == on T1 etc. can't be used. So what solution is left?

HaloFour commented 7 years ago

@DavidArno

The solution is that the developer has to use Equals manually rather than relying on == or !=, which is the same answer as when tuples aren't involved.

orthoxerox commented 7 years ago

@DavidArno the compiler has to rewrite the left == right expression as (left.Item1 == right.Item1 && left.Item2 == right.Item2). If one of the types is a struct without ==, it will be a compiler error.

DavidArno commented 7 years ago

@orthoxerox,

The following would have to be a compiler error therefore, as == in tuples would be a compiler trick, rather than a static operator on the type and so it cannot determine the type at compile-time:

bool F<T1,T2>((T1, T2) x, (T1, T2) y) => x == y;

orthoxerox commented 7 years ago

@DavidArno it's an error now and I don't see how it is related to equality of tuples.

DavidArno commented 7 years ago

@orthoxerox,

Apologies: the code was wrong. I've updated it to what I meant to say.

HaloFour commented 7 years ago

@DavidArno

So, how about the compiler first attempts to resolve == operators defined on ValueTuple<...> via extensions and if one isn't found it instead falls back to evaluating the elements individually?

I don't know that I like that, though. If you needed the container of those two values to matter, enough to define their equality, then you probably shouldn't be using tuples and you should define a proper type instead.

orthoxerox commented 7 years ago

@DavidArno Yes, it looks like it will have to be a compiler error, just like this function is right now:

bool F<T1,T2>(T1 x1, T2 x2, T1 y1, T2 y2) => x1 == y1 && x2 == y2;

P.S. I've just spent an hour enumerating all the equality mechanisms of the CLR and I've yet to cover the interfaces.

CyrusNajmabadi commented 7 years ago

You appear to be proposing that, just because the CLR and language teams couldn't agree on whether NaN is the same as NaN

That's not what's happening.

== implements IEEE 754 semantics. .Equals implements .Net equals semantics. IEEE requires that == not be reflexive for NaN. However, .Equals is required to be reflexive, i.e. object.Equals states that "a.Equals(a)" must be true for all instances.

If we broke IEEE semantics that would be a problem for tons of customers. If we broke .net semantics that would be a problem for tons of .net use cases (for example, you could not use a double effectively as a key inside a hashtable). This approach gives both sets of customers the semantics they require.

CyrusNajmabadi commented 7 years ago

ValueTuple<...> implement different behaviour for == and .Equals is pure madness IMO.

No, it really isn't. For example, today, if i have:

byte b = 1;
int i = 1;

Console.WriteLine(b == i);
Console.WriteLine(b.Equals(i));

It will print "True", then "False".

== and .Equals are already very different.

The problem compounds with tuples. Tuples just aggregate data. If you aggregate data that used to compare equal with == but now compares differently (or vice versa), then tuples no longer act properly as a composition concept.

CyrusNajmabadi commented 7 years ago

Using the above example, if we do not support == on tuples then you can get the following badness:

byte b = 1;       // any values will do; locals must be assigned before use
string s = "a";

(byte, string) t1 = (b, s);
(int, string) t2 = (b, s);

Console.WriteLine(t1.Item1 == t2.Item1 && t1.Item2 == t2.Item2);
Console.WriteLine(t1 == t2); // doesn't currently compile.  But we would like it to.  Can't be implemented through .Equals
Console.WriteLine(t1.Equals(t2));

The top and bottom lines will print 'true' and 'false' respectively. I strongly believe the second line should print 'true' and that it should just be exactly transformed as:

t1 == t2 
// becomes
t1.Item1 == t2.Item1 && ... && t1.ItemN == t2.ItemN

This, of course, then means that nested tuples work properly (something that definitely does not work with .Equals).
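A minimal sketch of the transformation described above, assuming a C# 7.3 or later compiler (the version in which == on tuples shipped). The class and method names are hypothetical.

```csharp
using System;

static class TupleLowering
{
    // Hand-written form of the proposed lowering for a 2-tuple.
    public static bool ElementWise((byte, string) t1, (int, string) t2) =>
        t1.Item1 == t2.Item1 && t1.Item2 == t2.Item2;

    // Tuple == itself; requires C# 7.3 or later.
    public static bool TupleOperator((byte, string) t1, (int, string) t2) => t1 == t2;

    // Binds to ValueTuple<byte, string>.Equals(object); the boxed argument
    // is a different tuple type, so the result is false.
    public static bool TupleEquals((byte, string) t1, (int, string) t2) => t1.Equals(t2);

    static void Main()
    {
        (byte, string) t1 = (1, "a");
        (int, string) t2 = (1, "a");

        Console.WriteLine(ElementWise(t1, t2));   // True
        Console.WriteLine(TupleOperator(t1, t2)); // True
        Console.WriteLine(TupleEquals(t1, t2));   // False
    }
}
```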

DavidArno commented 7 years ago

Ha, ha, thanks @CyrusNajmabadi, the idea of someone trying to use a double as a key in a hashtable is going to keep me chuckling all day! 😁 Even JavaScript, where it'll let black equal white every other Thursday except on leap years doesn't treat NaN as equal. The idea that the .NET team's dogma was so strong that they'd break an international standard just to enable doubles to be used as Dictionary keys is mind-boggling.

Anyway, that aside, you didn't address the important point from above. If a value tuple's types are structs, they may not implement == themselves. So how would you propose this feature work?

CyrusNajmabadi commented 7 years ago

the idea of someone trying to use a double as a key in a hashtable is going to keep me chuckling all day!

Using a double as a key is completely fine to do.

The idea that the .NET team's dogma was so strong that they'd break an international standard just to enable doubles to be used as Dictionary keys is mind-boggling.

It's interesting that you complain about things being inconsistent, but then complain when .net insisted that .Equals behave consistently :)

Anyway, that aside, you didn't address the important point from above. If a value tuple's types are structs, they may not implement == themselves. So how would you propose this feature work?

The exact way that == works for types that don't have an available == operator. I'm stating that == for tuples is defined as:

(x1, ..., xn) == (y1, ..., yn)   <==>  x1 == y1 && ... && xn == yn

== on the tuple will work in precisely all the circumstances that == works for the constituent elements.

just to enable doubles to be used as Dictionary keys is mind-boggling.

I never said that. Please do not put words in my mouth. I simply gave an example of how the .Equals behavior is desirable and consistent. I did not say that it was done "just" for this reason.

--

Also, to be clear, you never addressed the other points i made. For example, that == and .Equals are not equivalent today for ordinary numeric types other than double. Today == will return true when .Equals does not. == follows the rules of C# while .Equals follows the rules of .Net. Making it so that we have a type in C# which composes other C# types, but which does not compose == appropriately just leads to the composition not actually happening fully.

CyrusNajmabadi commented 7 years ago

Note: having .Equals be reflexive is nice for lots of reasons. After all, while it would be very weird to have "a.Equals(a)" be false, it would be even more confusing and difficult to use .net effectively if doubles didn't support this concept. After all, the following could then be false:

list.Add(d);
Console.WriteLine(list.Contains(d));

Effectively you would make it so that doubles could not actually work effectively as any sort of object in any sort of container. You could not use them effectively in linq. They would always be something that never meshed well with the rest of .Net. So .Net implements .Net semantics. If you want IEEE semantics, you can use the simple operators.
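A small sketch of the container behavior described above, assuming the standard List<T> and Dictionary<TKey, TValue> types with their default equality comparers. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class NaNInContainers
{
    // List<T>.Contains uses the default equality comparer, which calls Equals.
    public static bool ListFindsNaN()
    {
        var list = new List<double> { double.NaN };
        return list.Contains(double.NaN);
    }

    // Dictionary lookup relies on GetHashCode plus Equals, so NaN works as a key.
    public static bool DictionaryFindsNaN()
    {
        var map = new Dictionary<double, string> { [double.NaN] = "not a number" };
        return map.ContainsKey(double.NaN);
    }

    static void Main()
    {
        Console.WriteLine(ListFindsNaN());       // True
        Console.WriteLine(DictionaryFindsNaN()); // True
    }
}
```

Both lookups would fail if Equals followed the IEEE rule that NaN is never equal to itself.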

CyrusNajmabadi commented 7 years ago

I'd like to table the Double discussion, as the interesting and esoteric behaviors of Double are not what is important here. All the reasons for tuples needing == support can be explained with simpler types like int/long.

For people who don't think == should be lifted to tuples, can i ask why you think the following should behave differently:

int i = 0;
long j = 0;

Console.WriteLine(i == j); // what do you think this should print?

(int, int) t1 = (i, i);
(long, long) t2 = (j, j);
Console.WriteLine(t1 == t2); // what do you think this should print?

If you think the answer should be 'true' then 'false', can you explain why you think that two values which previously compared equal with == should now not compare the same just because they have been packaged up into a pair of values?

DavidArno commented 7 years ago

@CyrusNajmabadi

int i = 0;
long j = 0;

Console.WriteLine(i == j); // what do you think this should print?

I think it should print exactly the same as for:

int i = 0;
long j = 0;

Console.WriteLine(i.Equals(j)); 

Whether it's true or false isn't important - that's just convention.

As Eric Lippert says in his Sharp Regrets: Top 10 Worst C# Features article, equality in C# is one of its top ten worst features. But what is done is done and we must live with it. However, watching you try to defend it, rather than just admitting it was a mistake that we can all learn from, is more than a little frustrating.

I'm still confused as to how you think == will work in tuples though, based on your expression:

(x1, ..., xn) == (y1, ..., yn)   <==>  x1 == y1 && ... && xn == yn

As per my previous example, struct S {}, how will the following work?

var x = (new S(), new S());
var y = (new S(), new S());
if (x == y) ...

Thaina commented 7 years ago

@DavidArno What you proposed might be ideal but it could be breaking change so that's a problem. Sometimes we need to deal with legacy

CyrusNajmabadi commented 7 years ago

I think it should print exactly the same as for:

Ok. Given that that's not how C# has behaved since v1, that seems bad. You are complaining about inconsistency, but would now like C# vNext to be inconsistent with C# v1-v7.

As per my previous example, struct S {}, how will the following work?

var x = (new S(), new S());
var y = (new S(), new S());
if (x == y) ...

it's exactly as i stated, it would be equivalent to:

if (x.Item1 == y.Item1 && x.Item2 == y.Item2)

If S has no == operator then i would expect an error. precisely as i would if i wrote:

var v1 = new S();
var v2 = new S();
Console.WriteLine(v1 == v2);

DavidArno commented 7 years ago

@CyrusNajmabadi,

If S has no == operator then i would expect an error

Good. Right so then on to my next example from above:

bool F<T1,T2>((T1, T2) x, (T1, T2) y) => x == y;

T1 or T2 might be S here, so there's going to have to be an error somewhere. When will the error occur? At compile time or run time?

CyrusNajmabadi commented 7 years ago

However, watching you try to defend it, rather than just admitting it was a mistake that we can all learn from, is more than a little frustrating.

I don't believe it was a mistake.

And i'm not even really trying to defend it (not much to defend IMO). What i'm trying to do is explain to you why it's valuable for == on tuples to retain the semantics that == has today for C#. Actually breaking that is confusing and breaks composition. Making values composable was a direct goal for tuples, and this really interferes with that.

CyrusNajmabadi commented 7 years ago

When will the error occur? At compile time or run time?

At compile time. Precisely the same as if you wrote:

bool F<T1>(T1 x, T1 y) => x == y;

You can't write this today in C#, so i don't see why the presence of tuples would change anything.
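For reference, a sketch of what can be written for an unconstrained type parameter today. These are not what the proposal does for tuples; they only illustrate why a generic == has no single obvious meaning. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class GenericEquality
{
    // bool Bad<T>(T x, T y) => x == y;
    // error CS0019: operator == cannot be applied to operands of type T and T

    // Compiles for any T, but follows Equals semantics, not operator == semantics.
    public static bool ViaComparer<T>(T x, T y) => EqualityComparer<T>.Default.Equals(x, y);

    // Compiles with a class constraint, but == is then a reference comparison,
    // even if T turns out to be a type that overloads ==, such as string.
    public static bool ViaReference<T>(T x, T y) where T : class => x == y;

    static void Main()
    {
        Console.WriteLine(ViaComparer(double.NaN, double.NaN)); // True, unlike ==

        string a = new string('a', 3);
        string b = new string('a', 3);
        Console.WriteLine(a == b);             // True: string's own ==
        Console.WriteLine(ViaReference(a, b)); // False: two distinct objects
    }
}
```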

CyrusNajmabadi commented 7 years ago

I think it should print exactly the same as for:

To be clear, the behavior you desire would require massively breaking either the majority of C# code out there (if we change the semantics of == to match .Equals for disparate types), or the majority of .net code out there (if we change .Equals to match C#'s ==)

Neither of these is going to fly. And insisting that we go with a completely non-viable approach for equality, and then that we also make our new features fit this non-existent reality, is just not going to go anywhere. Neither .Net nor C# is going to break the majority of programs out there. And as such there are going to be different semantics for == and .Equals.

The question then becomes "should == on a tuple behave like == on the constituent parts, or like .Equals?" As you yourself have mentioned many times, the 'principle of least surprise' should be followed. In this case, that principle would lead to == (for tuples) behaving like == (for elements), just as .Equals (for tuples) behaves like .Equals (for elements).

DavidArno commented 7 years ago

@CyrusNajmabadi,

At compile time. Precisely the same as if you wrote...

Ok. So bool F<T1>(T1 x, T1 y) => x == y; gives a compile time error because the compiler knows that T1 might be a struct and that it therefore may not implement ==. But this proposal is to implement == for value tuples. So they aren't the same at all, let alone precisely the same.

Now the compiler has to know that, whilst ValueTuple<T1,T2> implements == (and we'll come to how it's to do that in a second), it must emit an error at compile time if the types of T1 and T2 aren't known, or if they are known to be structs that do not implement ==. That's a big burden you are now laying on the compiler and you are making those tuple types a very special case for the compiler to handle. Is that really something the compiler should be doing?

OK, so we have implemented == in these value tuple types. How exactly? You can't implement it in (T1, T2) as:

public static bool operator ==((T1, T2) left, (T1, T2) right) => 
    left.Item1 == right.Item1 && left.Item2 == right.Item2;

as T1 and T2 aren't constrained to classes, so the above would be a compilation error. But you want == in tuples to invoke == on its items. How are you going to do that?

DavidArno commented 7 years ago

@CyrusNajmabadi & @Thaina,

To be clear, you recommend massively breaking either the majority of C# code out there

To be very clear: I'm doing no such thing. That's the way it should have been implemented. But it wasn't and as I said: "what is done is done and we must live with it". Yes we need to deal with legacy. Thus this whole discussion as there's more than one way of dealing with it, each with their pros and cons. Though I'd argue that just burying our heads and believing no mistake was made, has no pros.

CyrusNajmabadi commented 7 years ago

Now the compiler has to know that, whilst ValueTuple<T1,T2> implements ==

I'm not sure where you're getting that idea from. There is no way for ValueTuple to effectively implement ==. To begin with, it wouldn't know how to statically dispatch the appropriate == operator on all the elements.

As i stated earlier, using == on a tuple is exactly a transformation of:

(x1, ..., xn) == (y1, ..., yn)   <==>  x1 == y1 && ... && xn == yn

That's a big burden you are now laying on the compiler

It's not a burden at all. It is trivial for the compiler to do this check. If the user writes "someTuple == someOtherTuple" then we will create a node for this. Then, during typechecking if we see that these are tuples on both sides we'll first check that the arity is the same. Then we'll go determine if we can find an applicable == operator for each pair of element types between the two tuples. This is extremely simple to do. Emitting would also be just as simple as we already can emit boolean expressions+field comparisons with ease.

OK, so we have implemented == in these value tuple types.

== is not implemented in ValueTuple. That's why this isn't a bug on CoreFx. This is about C# understanding == and emitting appropriate code for it. Just as we do already for things like == between a whole host of other types that are special to the language.

But you want == in tuples to invoke == on its items. How are you going to do that?

I have mentioned this a few times now :). If you write:

t1 == t2

Then that is exactly the same as writing:

t1.Item1 == t2.Item1 && ... && t1.ItemN == t2.ItemN

It is the == operator composed over all the constituent elements of the tuple.

CyrusNajmabadi commented 7 years ago

Thus this whole discussion as there's more than one way of dealing with it, each with their pros and cons.

Can you explain why you think it is a pro for == (for tuples) to behave like .Equals (for tuple elements) instead of like == (for tuple elements)? It seems to violate the 'principle of least surprise' that you've desired in the past.

Again, we're past the point about what == means for the language. In C#:

byte b = 1;
int i = 1;
b == i; // this is true

Given that, why is it not surprising to you that the following would be false:

var t1 = (b, b);
var t2 = (i, i);
t1 == t2;  // this is now false??

If 'b' is == to 'i', then why is composing two 'b's together not == to composing two 'i's together?

DavidArno commented 7 years ago

@CyrusNajmabadi,

Ah, my apologies. I misunderstood what you were saying. I hadn't appreciated you were proposing what @orthoxerox suggested: that the compiler transform (x1, ..., xn) == (y1, ..., yn); into x1 == y1 && ... && xn == yn; and then apply type checking to see if it were allowed. So it will be a compiler trick, not an operator. Makes more sense now.

Can you explain why you think it is a pro for == (for tuples) to behave like .Equals (for tuple elements) instead of like == (for tuple elements)?

Sure:

  1. It would work for all tuples, not just for those with items of types that implement ==,
  2. Because it works for all tuples, it could be used with generics,
  3. It would be a simple and consistent approach to teach developers: == on tuples uses eg EqualityComparer against each item to determine equality of the tuple. Usual rules of "beware comparisons in C#, for they are screwy" applies.
  4. By implementing it as an operator, those creating their own versions of value tuples can implement their own behaviour for equality.

orthoxerox commented 7 years ago

I must say David/Cyrus spats are always the highlight of my day. 😁

@DavidArno there aren't that many types where == and Equals produce different results and most of them are numerics, where the difference is more or less justified.

DavidArno commented 7 years ago

@orthoxerox,

Glad to have been of service. It was lucky that Cyrus didn't read your comment and just say "we'll implement it as a compiler trick as orthoxerox suggests" for the entire spat could have been avoided then and you would have missed out on your fun 😆

jnm2 commented 7 years ago

I don't consider it a compiler trick. The compiler trick is that ValueTuple is involved at all :D tuples != ValueTuple and I wish people would stop equating the two.

CyrusNajmabadi commented 7 years ago

tuples != ValueTuple

tuples are not necessarily ValueTuples. But ValueTuples corresponding to a certain shape are always tuples. As such, any of the == stuff we're talking about would work properly even if you had ValueTuple<int, string> x, y; ... x == y.

jnm2 commented 7 years ago

That's a good way to put it. Stealing that later.

GeirGrusom commented 7 years ago

@DavidArno

What about reference types? The usual assumption is that == for reference types is a reference comparison. Tuples would suddenly break that assumption. I think your proposed behavior could cause quite a lot of late night cursing.

DavidArno commented 7 years ago

@GeirGrusom,

Maybe it's just me, but I cannot begin to understand the thought processes of someone who, for reasons other than ignorance, chooses to override .Equals on a class, but doesn't override == to have exactly the same effect.

GeirGrusom commented 7 years ago

@DavidArno that's not what this is about.

DavidArno commented 7 years ago

@GeirGrusom,

That's odd, as that is exactly what it is about, as far as I'm aware. What do you feel it's about?

GeirGrusom commented 7 years ago

You're applying some standard of behavior among developers and implying that anyone that doesn't share your opinion are poor developers and trying to use that as a point in the discussion. But this isn't about what you think C# developers should be doing. It's about how C# behaves as a language, and suddenly making == act like .Equals in certain conditions is not consistent behavior. You can't defend an odd behavior change because you feel that developers should write code in a certain way because you would totally do it that way.

DavidArno commented 7 years ago

@GeirGrusom,

You're applying some standard of behavior among developers and implying that anyone that doesn't share your opinion are poor developers and trying to use that as a point in the discussion.

I'm sorry to hear that you have drawn that conclusion from what I've said.

It's about how C# behaves as a language, and suddenly making == act like .Equals in certain conditions is not consistent behavior.

There is no consistent behaviour when it comes to C# and equality and we've all had to try and deal with that since v1. As a result, there is no solution that can be adopted for == on value tuples that will result in consistency. The team have to choose between a number of solutions that each introduce their own, different, inconsistencies:

  1. Do nothing and do not support == and != for those tuples. This is actually the most consistent of the options as many structs do not implement those operators. I think this is the option I'm starting to favour.
  2. Have == call .Equals. This provides consistency between the two ways of testing for equality in value tuples and it will work for all of those tuples. Its downside is that two tuples may be equal, even if their properties aren't equal if directly tested.
  3. Implement a more clever system of testing for IComparable, IEquatable, determining if structs are involved etc and using different ways of testing the properties of the tuple. This will result in fewer cases of tuple equality/item inequality than option 2, but will still lead to those situations at times.
  4. Implement it as a compiler trick as the team appear to be proposing. This will ensure == on a tuple always results in == being used for its properties. This will create consistency in that regard. But it'll be a special case type that doesn't use operators for ==. That special case will only work for some tuples and won't work at all with generics.

Option 4 is by far the worst option in my view. It's sad you think me saying so implies I'm judging the abilities of other developers.

Edited to remove comment about nested tuples due to @jcouv's explanation below

HaloFour commented 7 years ago

@DavidArno

I advocate option 4, not option 1. The elements of the tuple should be directly compared to one another using whatever existing equality semantics the compiler can resolve between the types of their elements. I believe that is the most intuitive and most internally consistent option (consistent in their inconsistencies). Tuples aren't proper containers; the compiler should pretend that they don't exist.

The fact that C# and the CLR allows operator == and Equals to be different is a boat that has long since sailed. I agree that they generally should be the same, but there are reasons for that to not be the case, such as with double. We can't fix that now, and we shouldn't be trying to drag tuples along for that ride.

DavidArno commented 7 years ago

@HaloFour,

My apologies. I could have sworn you'd said as much early in this discussion. But I can't find where I thought you said that, so I'm imagining it. I've removed that false assertion therefore.

jcouv commented 7 years ago

@DavidArno Thanks for summarizing the options.

But it'll be a special case type that doesn't use operators for ==.

Yes, option 4 introduces special compiler behavior for tuples. We've had to add such special handling for type inference, overload resolution, and many other cases already. The guiding principle was to distribute the behavior onto the elements. For instance, picking an overload for M((1, 2), 3); should behave like a flattened non-tuple analog M(1, 2, 3);. Option 4 keeps that approach, which seems sensible.

I don't expect people to write their own ValueTuple types, so the fact that the == on tuples doesn't map to the == on ValueTuple shouldn't really be noticeable. This isn't different to accessing .Item10, which doesn't exist in any ValueTuple.

That special case will only work for some tuples

Yes, option 4 would only work on tuples whose elements can be compared with ==. I'm not sure why that is a negative, as the same rule applies to individual elements (by definition).

won't work at all with generics

I think you're referring to the bool F<T1,T2>((T1, T2) x, (T1, T2) y) => x == y; example. I don't see this example as a downside either, as it behaves the same as the non-tuple case bool F<T1>(T1 x, T1 y) => x == y; (as Cyrus pointed out).

and potentially creates weird effects that'll confuse people when eg tuples of tuples are used.

I looked up the thread and didn't see any examples of nested tuples. Can you provide an example that is confusing?

It seems to me that option 4 works fine with nested tuples. (a1, (b1, c1)) == (a2, (b2, c2)) will mean the same as a1 == a2 && (b1, c1) == (b2, c2) which means the same as a1 == a2 && b1 == b2 && c1 == c2 (which would be the lowered form).
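A minimal sketch of the nested case, assuming a C# 7.3 or later compiler and deliberately mixing element types so that conversions are exercised at both levels. The class and method names are hypothetical.

```csharp
using System;

static class NestedTuples
{
    // Tuple == applied to nested tuples with different element types.
    public static bool Nested((int, (byte, string)) left, (long, (int, string)) right) =>
        left == right;

    // The flattened, element-by-element form it is expected to mean.
    public static bool Flattened((int, (byte, string)) left, (long, (int, string)) right) =>
        left.Item1 == right.Item1
        && left.Item2.Item1 == right.Item2.Item1
        && left.Item2.Item2 == right.Item2.Item2;

    static void Main()
    {
        (int, (byte, string)) x = (1, (2, "a"));
        (long, (int, string)) y = (1L, (2, "a"));

        Console.WriteLine(Nested(x, y));    // True
        Console.WriteLine(Flattened(x, y)); // True
        Console.WriteLine(x.Equals(y));     // False: different tuple types
    }
}
```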