Open eatdrinksleepcode opened 9 years ago
Sort() is not a good name, because there already is List<T>.Sort() (and also static Array.Sort()) which does in-place sort. This means the proposed Sort() would be confusing and also some of its overloads couldn't be called as extension methods: e.g. list.Sort() would call the instance method List<T>.Sort(), not the extension method Enumerable.Sort().
I'm not sure using a custom Comparison<T> happens often enough to be worth making the API more complicated.
Not arguing one way or the other, just pointing out that for such situations where you have a Comparison<T> (or lambda that converts to it) and need an IComparer<T>, you can use Comparer<T>.Create, e.g.
var result = strings.OrderBy(s => s, Comparer<string>.Create((s1, s2) => ...));
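To make the Comparer<T>.Create suggestion concrete, here is a minimal sketch. The sample strings and the length-based comparison are assumptions chosen only for illustration; they are not from the proposal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] strings = { "pear", "fig", "banana" };

// A Comparison<string>-shaped lambda (illustrative): shorter strings sort first.
Comparison<string> byLength = (s1, s2) => s1.Length.CompareTo(s2.Length);

// Comparer<T>.Create wraps the delegate in an IComparer<T>.
IComparer<string> comparer = Comparer<string>.Create(byLength);

// Identity key selector plus the wrapped comparer.
string[] result = strings.OrderBy(s => s, comparer).ToArray();

Console.WriteLine(string.Join(", ", result)); // fig, pear, banana
```

The identity selector s => s is exactly the boilerplate that the proposed overloads would remove.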
@stephentoub Indeed, the proposal shows an example of using Comparer
@svick That's a good point about collisions with the other Sort methods. I chose Sort because OrderBy implies that the thing you are ordering by is part of calling the method, which is not the case for these overloads. I wonder if we can find another appropriate name?
Regarding Comparison [...] myComparer.Equals) than to convert from a Comparison to an IComparer (Comparer<string>.Create(myComparison)).
It looks like the major purpose of this proposal is to
The main question is - why do we need so many new API entries? Can we still have most of the benefits without bloating the API surface too much? What would be the minimum necessary API addition?
It seems the following API additions would be sufficient:
public static IOrderedEnumerable<T> Sort<T>(this IEnumerable<T> source) where T : IComparable<T>
public static IOrderedEnumerable<T> Sort<T>(this IEnumerable<T> source, Func<T, T, int> comparison)
I do not find "Descending" or "ThenBy" methods very useful when the same key is used and comparer lambda can be trivially reversed.
For the Comparer/IComparer use in Linq in general.
Linq generally avoids pre-Func specialized delegate types and tries to standardize on Func/Action. Also note that lambdas are used mostly to extract/project/wrap the elements of the sequence, while the algorithms are parameterized via interfaces (IComparer, IEqualityComparer, ...).
public static IOrderedEnumerable<T> Sort<T>(this IEnumerable<T> source) where T : IComparable<T>
public static IOrderedEnumerable<T> Sort<T>(this IEnumerable<T> source, IComparer<T> comparer)
Based on the above points the new API should really use IComparer, but I see how Func<T,T,int> instead could be more useful.
However, I think we should have only one or the other. Having both variants would seem to be unnecessary duplication.
The idea seems reasonable, but we need to settle on the appropriate shape of the additional API.
@VSadov Did you mean Func<T, T, int> (delegate version of IComparer<T>), instead of Func<T, T, bool> (delegate version of IEqualityComparer<T>)?
@svick - right, Func<T, T, int> of course
public static IOrderedEnumerable<T> Sort<T>(this IEnumerable<T> source) where T : IComparable<T>
Call that on a variable typed as List<T> for some T and it will call List<T>'s instance method instead of this.
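The binding behaviour described here can be seen in a small sketch. SortSketch is a hypothetical stand-in for the proposed Enumerable.Sort(), not an existing API; it assumes the extension would simply defer to OrderBy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<int> { 3, 1, 2 };

// Binds to the instance method List<T>.Sort(), which returns void and
// reorders the list in place; the extension method is never considered.
list.Sort();
Console.WriteLine(string.Join(", ", list)); // 1, 2, 3

// The extension is only reachable through an explicit static call
// (or through a variable typed as IEnumerable<int>).
var source = new List<int> { 3, 1, 2 };
IEnumerable<int> sorted = SortSketch.Sort(source);
Console.WriteLine(string.Join(", ", source)); // 3, 1, 2
Console.WriteLine(string.Join(", ", sorted)); // 1, 2, 3

// Hypothetical stand-in for the proposed extension method.
static class SortSketch
{
    public static IOrderedEnumerable<T> Sort<T>(this IEnumerable<T> source)
        => source.OrderBy(x => x);
}
```

C# prefers an applicable instance method over any extension method, so the same call syntax silently changes meaning depending on the static type of the receiver.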
Even if there were no signature matches, "Sort is in-place, OrderBy produces a new object" is the explanation often given, so the names would match. So I don't think Sort is a good name.
It seems to me that these are really overloads of OrderBy and so should be named as such.
I do not find "Descending" or "ThenBy" methods very useful when the same key is used and comparer lambda can be trivially reversed.
Unless the lambda can produce int.MinValue…
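The int.MinValue caveat is easy to miss, so here is a small sketch of it. The comparison lambda is a hypothetical one that returns extreme values rather than -1/0/1; it is not taken from the proposal.

```csharp
using System;

// Hypothetical comparison that returns extreme values instead of -1/0/1.
Func<int, int, int> comparison =
    (a, b) => a < b ? int.MinValue : (a > b ? int.MaxValue : 0);

// Naive reversal: negate the result. -int.MinValue overflows back to int.MinValue.
Func<int, int, int> negated = (a, b) => unchecked(-comparison(a, b));

// Safe reversal: swap the arguments instead of negating.
Func<int, int, int> swapped = (a, b) => comparison(b, a);

Console.WriteLine(negated(1, 2) == int.MinValue); // True: still negative, so not reversed
Console.WriteLine(swapped(1, 2) == int.MaxValue); // True: positive, correctly reversed
```

So "trivially reversed" holds only if the reversal swaps arguments; negating the result is not safe for every comparison.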
I'm inclined though to think they should be included in this because:
OrderBy and adding comparable overloads to ThenBy etc. is more consistent and in fact less to learn that they all have overloads of this form than that only some do.
IOrderedEnumerable is a promise to play nicely with ThenBy, and if we're going to play nicely with ThenBy then we'll have the work done for the rest of this.
The form with no lambda (the first in either of your two alternatives) can be implemented very efficiently by having a form of ComputeKeys that does _keys = (TKey[])(object)elements and then the rest of OrderBy will just work.
The no-lambda form can also be done for IQueryable<T> easily by returning source.OrderBy(x => x), and I think should (perhaps detecting EnumerableQuery<T> and deferring to its optimised version) and then other providers will get it for free.
The form with a Comparison<T>-like Func can be done easily with an IComparer<T> implementation that takes such a Func. And/or such a type could just be made part of the API.
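Such an adapter could look roughly like the sketch below. FuncComparer<T> is a hypothetical name, not an existing type; Comparer<T>.Create already offers the same service for Comparison<T> delegates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] numbers = { 3, 1, 2 };

// Descending order expressed as a plain Func<int, int, int>.
var descending = new FuncComparer<int>((a, b) => b.CompareTo(a));

int[] result = numbers.OrderBy(x => x, descending).ToArray();
Console.WriteLine(string.Join(", ", result)); // 3, 2, 1

// Hypothetical adapter from a Comparison<T>-like Func to IComparer<T>.
sealed class FuncComparer<T> : IComparer<T>
{
    private readonly Func<T, T, int> _compare;

    public FuncComparer(Func<T, T, int> compare)
        => _compare = compare ?? throw new ArgumentNullException(nameof(compare));

    // Delegates every comparison to the wrapped function.
    public int Compare(T x, T y) => _compare(x, y);
}
```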
@JonHanna
It seems to me that these are really overloads of OrderBy and so should be named as such.
They are, except the By part does not fit for the identity sort. As said previously by @eatdrinksleepcode:
I chose Sort because OrderBy implies that the thing you are ordering by is part of calling the method, which is not the case for these overloads.
I agree with that. For example, employees.OrderBy(e => e.Name) reads "order employees by name", numbers.OrderBy() is just "order numbers", there is no "by" part. Because of that, I think Order would be a good name for that "overload".
I could happily live with Order, though in the SQL world the analogous SELECT id FROM table ORDER BY id means it wouldn't be that weird to SQL-familiar people while those used to the other overloads would think of it as yet another overload.
Two thoughts:
Some of the proposed APIs have a constraint where T : IComparable<T>. This seems inconsistent with LINQ's general OrderBy approach, where any TKey is allowed. In general, these constraints are unnecessary because Comparer<T>.Default is available for any T (although some calls to Compare() may fail at runtime). Despite the possibility for runtime failures, I think LINQ's more permissive approach is the right one. For example, it allows for non-generic IComparables.
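The trade-off can be sketched as follows. NotComparable and LegacyComparable are hypothetical types made up for this example, and the ArgumentException reflects the behaviour of the default comparer on the runtimes I am aware of; treat the exact exception type as an assumption.

```csharp
using System;
using System.Collections.Generic;

// A type implementing only the non-generic IComparable still works.
int legacy = Comparer<LegacyComparable>.Default.Compare(
    new LegacyComparable(1), new LegacyComparable(2));
Console.WriteLine(legacy < 0); // True

// A type with no comparison support compiles (no constraint) but fails at runtime.
bool threw = false;
try
{
    Comparer<NotComparable>.Default.Compare(new NotComparable(), new NotComparable());
}
catch (ArgumentException)
{
    threw = true; // neither object implements IComparable
}
Console.WriteLine(threw); // True

// Hypothetical type with no comparison support at all.
sealed class NotComparable { }

// Hypothetical type that implements only the non-generic IComparable.
sealed class LegacyComparable : IComparable
{
    public int Value { get; }
    public LegacyComparable(int value) => Value = value;
    public int CompareTo(object obj) => Value.CompareTo(((LegacyComparable)obj).Value);
}
```

A where T : IComparable<T> constraint would reject both types at compile time, including the one that sorts correctly.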
With regards to the name, OrderBy() with no arguments doesn't make much sense. However, OrderBy(Comparer) makes a lot of sense.
So it's been a little while, but I think this idea is still worth pursuing. If I can sum up the feedback:
List<T>.Sort (and to a lesser extent, the static Array.Sort). "OrderBy" doesn't read correctly without a key selector. Is "Order" the best alternative?
IComparer<T> and Comparison<T> creates a lot of overloads. Accepting only IComparer<T> makes the usage more verbose. Accepting only Comparison<T> is inconsistent with the rest of LINQ. I am personally still inclined to provide both sets of overloads, since I don't find the overloads confusing, and they keep consuming code clean for both kinds of usage.
IComparable<T> on some overloads disallows non-generic comparables, and is inconsistent with existing OrderBy methods. While I would prefer that the case of attempting to sort non-comparable items be handled by the type system instead of by a runtime exception, as a practical matter the constraint may have to be removed.
OrderBy overloads, and so should be done to maintain IEnumerable/IQueryable parity.
Does anyone care to weigh in any further on any of these issues? I am happy to make the necessary changes to the proposal. Are there any other issues that need to be addressed?
/cc @joshfree @VSadov
One implication of adding Queryable.Order<T>(this IQueryable<T>) is that existing query providers (e.g. EF) will not support this new query operator. Queryable.Order<T>(this IQueryable<T>, IComparer<T>) would also not be supported, but since most query providers don't support custom comparers anyway that would be less surprising to consumers.
One argument for using Sort over Order would be consistency with the existing Reverse, which also is hidden by List<T>.Reverse.
A couple of thoughts:
x.Sort() might imply that it's an in-place sort, even though that is not the case. Perhaps we could call it something like x.Order()?
ThenBy), I just project to a tuple which already supports comparison.
While exposing an OrderBy overload that doesn't require any arguments is valuable for simple scenarios, I think we could certainly trim down the size of the proposed API additions:
IComparable<T> constraint since it doesn't apply to TKey types.
Comparison<T> overloads since that type isn't used at all inside existing System.Linq APIs.
ThenBy overloads can be omitted since they express more complex scenarios that should be covered by the existing APIs.
So I was thinking something like the following:
public static class Enumerable
{
    public static IOrderedEnumerable<TSource> OrderBy<TSource>(this IEnumerable<TSource> source, IComparer<TSource> comparer);
    public static IOrderedEnumerable<TSource> OrderByDefaultComparer<TSource>(this IEnumerable<TSource> source);
    public static IOrderedEnumerable<TSource> OrderByDescending<TSource>(this IEnumerable<TSource> source, IComparer<TSource> comparer);
    public static IOrderedEnumerable<TSource> OrderByDescendingDefaultComparer<TSource>(this IEnumerable<TSource> source);
}
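For illustration only, the signatures above could be given bodies along these lines. OrderingSketch is a hypothetical class name (the real proposal targets Enumerable), and delegating to the existing operators with an identity selector is an assumption, not necessarily how an actual implementation would do it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] numbers = { 3, 1, 2 };

int[] ascending = numbers.OrderByDefaultComparer().ToArray();
int[] descending = numbers.OrderByDescendingDefaultComparer().ToArray();
int[] explicitComparer = OrderingSketch.OrderBy(numbers, Comparer<int>.Default).ToArray();

Console.WriteLine(string.Join(", ", ascending));        // 1, 2, 3
Console.WriteLine(string.Join(", ", descending));       // 3, 2, 1
Console.WriteLine(string.Join(", ", explicitComparer)); // 1, 2, 3

// Hypothetical sketch: each method delegates to an existing operator
// with an identity key selector.
static class OrderingSketch
{
    public static IOrderedEnumerable<TSource> OrderBy<TSource>(
        this IEnumerable<TSource> source, IComparer<TSource> comparer)
        => Enumerable.OrderBy(source, x => x, comparer);

    public static IOrderedEnumerable<TSource> OrderByDefaultComparer<TSource>(
        this IEnumerable<TSource> source)
        => Enumerable.OrderBy(source, x => x);

    public static IOrderedEnumerable<TSource> OrderByDescending<TSource>(
        this IEnumerable<TSource> source, IComparer<TSource> comparer)
        => Enumerable.OrderByDescending(source, x => x, comparer);

    public static IOrderedEnumerable<TSource> OrderByDescendingDefaultComparer<TSource>(
        this IEnumerable<TSource> source)
        => Enumerable.OrderByDescending(source, x => x);
}
```

Since every method returns IOrderedEnumerable<TSource>, the existing ThenBy overloads keep working on the result without any additions.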
Another possibility for the no-arguments version could be InOrder/InDescendingOrder. This does a good job of communicating that this returns a sequence which is sorted rather than sorting the sequence (FWIW I think Enumerable.Reverse might have been better off as InReverse).
var sorted = items.InOrder();
I think this reads very naturally except when followed by ThenBy. Some of the other proposals like Sort have the same issue. That said, it should be rare to follow a default sort with ThenBy since I would guess that most default comparers break all ties.
Summary
There are two scenarios where the existing Enumerable OrderBy methods are not ideal:
ints.Sort(x => x)
I propose adding various overloads of Sort to address these situations. The name Sort is chosen specifically to distinguish these methods, which do not take a key selector, from the OrderBy methods, which do. If this is not considered an important distinction, the names could be changed to OrderBy without colliding with the existing methods.
API
Usage
Questions