Closed AlexRadch closed 4 years ago
It's not clear to me what value the new Sort
overloads have. You can already do anything you want using the existing Sort(Comparison<T>)
overload. The new overload is also likely to have worse performance due to the fact that a single compare requires 2 delegate calls and possibly another interface call for the actual comparison (via IComparer<TKey>
).
To make things worse, the underlying sort implementation is currently built around IComparer<T>
. So, short of duplicating the existing implementation, the only way to implement this overload is to make a custom comparer that wraps the key selector and the key comparer and have the existing sort implementation make yet another interface call to that custom comparer.
The only time such overloads are likely to be useful is when you happen to already have a selector method and want to reuse it. Which you can do anyway using the existing Sort(Comparison<T>)
overload.
The BinarySearch
overloads are more interesting because BinarySearch
is rather crippled as it is:
Sort
it lacks Comparison<T>
overloadsBinarySearch
overloads.These overloads would allow us to write:
int index = list.BinarySearch(42, item => item.Id);
However, these overloads have the same performance issues that the new Sort
overloads have. Each compare requires a delegate call (keySelector) and an interface call (keyComparer). And like the Sort
implementation the existing BinarySearch
implementation is based on IComparer<T>
so there's yet another interface call. At least the binary search implementation is much simpler and duplicating the existing implementation shouldn't be a problem.
For the record: this is derived from https://github.com/dotnet/coreclr/issues/2188 where other BinarySearch
overloads have been proposed. Unfortunately those introduce ambiguity in overload resolution so they're likely not an option.
It's not clear to me what value the new Sort overloads have. You can already do anything you want using the existing Sort(Comparison
) overload.
Write compassion for next code
var keySelector = Person p => Tuple.Create(p.FirstName, p.LastName);
persons.Sort(keySelector);
var i = persons.BinarySearch(Tuple.Create("Alex", ""), keySelector);
BinarySearch(comparison)
was rejected because it have conflict with BinarySearch(IComparable)
. See closed PR.
The new overload is also likely to have worse performance due to the fact that a single compare requires 2 delegate calls and possibly another interface call for the actual comparison (via IComparer
).
Why are you think that selectKey will worse performance here but it does not worse performance in Linq?
I clear understand you suggestions, and I wrote PR to add BinarySearch(comparison)
but as you wrote: it was rejected
Unfortunately those introduce ambiguity in overload resolution so they're likely not an option.
Write compassion for next code
It's doable with existing overloads but it's a bit verbose:
Comparison<Person> comparison = (Person x, Person y) => ((IComparable)Tuple.Create(x.FirstName, x.LastName)).CompareTo(Tuple.Create(y.FirstName, y.LastName));
list.Sort(comparison);
list.BinarySearch(new Person { FirstName = "Alex", LastName = "" }, Comparer<Person>.Create(comparison));
It's better with C# 7:
Comparison<Person> comparison = (Person x, Person y) => (x.FirstName, x.LastName).CompareTo((y.FirstName, y.LastName));
list.Sort(comparison);
list.BinarySearch(new Person { FirstName = "Alex", LastName = "" }, Comparer<Person>.Create(comparison));
BinarySearch(comparison) was rejected because it have conflict with BinarySearch(IComparable). See closed PR.
Yes, I know that. But perhaps there are other alternatives, for example considering changing the order of parameters so the ambiguity is avoided.
Why are you think that selectKey will worse performance here but it does not worse performance in Linq?
I don't care what LINQ does, each API should stand on its own performance wise. In any case, if memory serves me well LINQ builds an array of keys and sorts that. So the key selector is called only once per element. Doing the same for Sort
isn't feasible as it require allocating an array of keys and people don't expect Sort
to allocate. The selector will then be called 2 N log N times instead of N like LINQ does.
I'm not against to add API with Comparison, I tried but it was rejected. I do not understand why you against keySelector.
I don't care what LINQ does, each API should stand on its own performance wise. In any case, if memory serves me well LINQ builds an array of keys and sorts that. So the key selector is called only once per element. Doing the same for Sort isn't feasible as it require allocating an array of keys and people don't expect Sort to allocate. The selector will then be called 2 N log N times instead of N like LINQ does.
How many times does your comparison Comparison<Person> comparison = (Person x, Person y) => ((IComparable)Tuple.Create(x.FirstName, x.LastName)).CompareTo(Tuple.Create(y.FirstName, y.LastName));
create tuples? I think about 2 N log N times.
I thinks that tuples does not allocated on heap. It is structure. Compiler allocates them on stack like int.
How many times does your comparison Comparison
comparison = (Person x, Person y) => ((IComparable)Tuple.Create(x.FirstName, x.LastName)).CompareTo(Tuple.Create(y.FirstName, y.LastName)); create tuples? I think about 2 N log N times.
Only N log N times.
Only N log N times.
Comparison creates 2 Tuples on each compare. With keySelector you create one tuple for middle element and one for each compare (QuickSort). So selector is faster.
Comparison creates 2 Tuples on each compare. With keySelector you creates one tuple for middle element and one for each compare (QuickSort). So selector is faster.
Your version does the same thing unless you are considering a sort implementation that works like the LINQ one - call the selector for each person, store the tuples in an array and then sort the array.
Also, the C# 7 version doesn't allocate any objects, C# tuples use ValueTuple
instead of Tuple
.
Since this hasn't gotten much interest since December, I'm going to close it for now. If anyone else wants to take control and restart the discussion, feel free to re-open this issue.
Personally I agree with @mikedn that these overloads increase the API footprint but don't provide much value since the existing Comparer overload for Sort is mostly adequate.
Now we can sort Array
and List by keySelector Func<TValue, TKey> by using Enumerable extension methods, but this will create in memory copy of array or list.
How could we? I tried searching BinarySearch
in the API doc but none of the overloads contains keySelector
...
How could we? I tried searching BinarySearch in the API doc but none of the overloads contains keySelector …
The text you quote refers to sorting but you're looking for BinarySearch
functionality. As mentioned in a previous comment, such functionality does not exist.
Now we can sort
Array<T>
andList<T>
by keySelectorFunc<TValue, TKey>
by using Enumerable extension methods, but this will create in memory copy of array or list. I suggest to add API for in place Sort and BinarySearch with keySelector functionFunc<TValue, TKey>
.API proposal
The same API for Array class.