Open mairaw opened 7 years ago
Might be something for @GuardRex to work on next. We need to define where we're going to link that from with @ianhays.
@ianhays @mairaw @JonHanna
[@JonHanna :rocket: Hope all is well with you!]
@ianhays @mairaw -- See if this looks good, or let me know what you want to do:
IComparer
and IComparer<T>
API reference topics?IComparer
and IComparer<T>
? If so, I'll draft a list to float here for discussion.Comparison and sorting methods that rely upon
IComparer
orIComparer<T>
must obey the properties of a total order, which define object equality relationships that make comparisons and sorting possible. Failure to do so can result in a failure to compare or sort the objects, an exception, or an infinite loop. For more information, see the [Total Order](LINK_TOTAL_ORDER_TOPIC) topic.
@mairaw @BillWagner This one has run out beyond 7/31. Do u want to re-assign?
Do we want to link from methods relying on IComparer and IComparer
?
I think so. There's sometimes a reasonable reason to not have total order in a comparer, after all, but a method that needs it, and doesn't get it, is where bad things can happen to good people. The Sort
methods on lists and arrays, and Linq's OrderBy
method and its cousins are all methods that need it. Linq's Min
and Max
overloads that take a comparer can also give confusing results without total ordering too (but won't go into an infinite loop as could potentially happen with some inputs to sorts).
Tagging area owners @eiriktsarpalis @layomia to continue the discussion.
@dbjorge commented on Tue Dec 06 2016
The documentation for
IComparer
(@docs.microsoft.com, @msdn) andIComparer<T>
(@docs.microsoft.com, @msdn) do not note whether any particular ordering properties are expected of a valid implementation, but at leastList<T>.Sort
assumes some (antisymmetry, per this Connect feedback circa .NET 4.5).The interface documentation should note what sort of ordering properties the rest of the standard library will be assuming/demanding of a
Compare
implementation. I'd expect this to be the usual properties of a total order.@CIPop commented on Tue Dec 06 2016
cc: @rpetrusha, @AArnott
@karelz commented on Tue Dec 06 2016
cc: @mairaw
@JonHanna commented on Tue Dec 06 2016
Is that a responsibility of
IComparer<T>
or of the caller ofSort
(orOrderBy
etc.). If one was going to implement an order that wasn't total (e.g. XmlSchema's rules on comparing datetimes without time zones, would doing so in anIComaprer<T>
be disallowed?@dbjorge commented on Tue Dec 06 2016
I would consider it reasonable to either:
IComparer
interfacesIComparer
use sites may require certain ordering properties (with examples).I think either of those would be a big improvement over not documenting it either way. I could be convinced that the former would be too big a breaking change to be acceptable.
In either case, I think it would improve clarity to also document it on the individual
IComparer
-based framework methods.@JonHanna commented on Tue Dec 06 2016
I think the warning is a good idea. I think the idea of documenting the relevant methods, even more so.
@mairaw commented on Tue Dec 06 2016
I own these docs so I'd be the one making the updates. Just let me know who I should be working with to figure out what's the guidance we want to provide there. /cc @karelz
@karelz commented on Wed Dec 07 2016
@dbjorge @JonHanna can you please suggest which text to change to what? (ideally mention the diff or old & new text) Thanks!
@JonHanna commented on Wed Dec 07 2016
Could an extra page be added describing total ordering? A concise statement linking to it would probably be clearer and with less text added to the individual pages affected.
@mairaw commented on Wed Dec 07 2016
@JonHanna yes, it's possible to add a separate article explaining the concepts and then link to it from the affected pages.
@JonHanna commented on Mon Jan 09 2017
Total Order
A total order is a binary relation that has the following properties:
a <= b && b <= a
thena == b
a <= b && b <= c
thena <= c
a <= b || b <= a
In other words:
Exceptions to Total Order
It can perhaps be easier to think of total order by considering cases that don’t have total order.
Floating point operators.
Because
double.NaN <= double.NaN
returnsfalse
,double.NaN < 0
returns false and0 < double.NaN
returns false, we cannot use the arithmetic operators to determine a total order on a set of floating point numbers that includesNaN
.XML Schema Datetimes without Time Zones
XML Schema defines some datatypes that are only partially ordered. One example is
dateTime
which represents a date and time without a timezone (comparable toDateTime
in .NET). By the rules of XML Schema, 2001-03-29T12:00:57 is always sorted before 2001-04-23T09:17:39 and 2001-04-23T11:31:09 is always sorted before 2001-04-29T23:09:10 but 2001-04-23T09:17:39 can be considered neither before nor after 2001-04-23T11:31:09.Not knowing the time zone, it could be e.g. 2001-04-23T09:17:39 in Copenhagen compared to 2001-04-23T11:31:09 in Dublin (earlier), or 2001-04-23T09:17:39 in Montreal compared to 2001-04-23T11:31:09 in Bangalore (later).
To offer a dependable ordering within the international context XML Schema time-related data types consider, one
dateTime
can only be considered as sorting before another if that holds for any of a possible range of timezones from -14:00 to +14:00 to have full confidence that the ordering would hold.Importance or Total Order to Sort operations.
Many methods, especially sorting methods, that take an
IComparer
orIComparer<T>
require that theCompare()
methods of those objects give a total order. Failure to do so can result in the results not being sorted at all, exceptions, or even infinite loops. Examples includeList<T>.Sort()
,Array.Sort()
andEnumerable.OrderBy()
.Other methods may have similar issues, such as those that find minima and maxima, such as Linq's
Min()
andMax()
extension methods.Providing a Total Order on Partially Ordered values.
If you want to sort values that are not normally totally ordered, you need to define a total ordering for it. For example, as mentioned above
double
is not totally ordered becauseNaN
cannot be placed into any meaningful position by the<
,<=
,=
,>
and>=
operators.Comparer<double>.Default
,Enumerable.Min()
andEnumerable.Max()
all solve this problem by adding their own rule that sortsNaN
before any other value. This results in the three properties mentioned at the start of this article applying to the ordering.An approach that works for many cases, is to add a tie-breaker rule onto a partial ordering. This is used here, with most comparisons being the same as with
<
but the "NaN equal to NaN and before everything else" providing that tie-breaking rule that makes it a total order.@karelz commented on Wed Dec 14 2016
@ianhays @Priya91 can you please review the text as area experts? (or pull in appropriate folks please)
@ianhays commented on Thu Dec 15 2016
@JonHanna that's a great doc on total order. 👍very concise and informative. The DateTime section is a bit verbose but explains the reasoning behind the exception to total order well.
@JonHanna commented on Thu Dec 15 2016
I split up the sentences and hopefully clarified a bit (sometimes attempts at concision end up reading more verbose than otherwise) and added a sentence at the end about using tie-breaking rules in creating total orders out of parial.
@JonHanna commented on Thu Dec 15 2016
Also fixed the copy-pasting in the example dates, as I had Montreal and Bangalore in time zones over a week apart :grin:
@svick commented on Fri Dec 16 2016
@JonHanna
Did you mean
Comparer<double>.Default
?EqualityComparer
does not define ordering.@JonHanna commented on Fri Dec 16 2016
@svick I did indeed. Fixed.
@ianhays commented on Mon Feb 13 2017
@mairaw did you get a chance to take a look at this?
@mairaw commented on Mon Feb 13 2017
Not yet @ianhays. I'm completely swamped at the moment with the .NET Core release.
@ianhays commented on Mon Feb 13 2017
Gotcha, thanks for the update :)