Open hickford opened 8 years ago
Suggestion, inherit to a new interface and implement that.
ISortedList<T> : IList<T>
{
int BinarySearch(T item)
...
}
@benaadams What's the point? Shouldn't binary search be the same for each implementation of IList<T>
?
Not if someone has implemented the IList<T>
interface on their class, but not the BinarySearch
function then you've just broken their code.
Adding a new interface that derives from IList<T>
and adding it to, for example List<T>
and Array
means its a compatible addition that doesn't break any upstream code.
@benaadams I was assuming it would be an extension method. That way, it's easy to use and you don't need to reimplement it for every collection.
@svick ah, yeah, that could work :D
Probably would want a matching Sort function to get it in right state?
Perhaps it could be useful to add a collection independent helper methods:
Perhaps
int BinarySearch(int start, int length, Func<int, int> compare)
where compare
maps an index to a comparison result.
Or perhaps
int BinarySearch<T>(Func<int, T> getElementAt, int index, int length, T value, IComparison<T> comparison)
I do like the idea of having extension methods for sorting and doing a binary search through the IList
Ideally, you would only need a single implementation of an algorithm like binary search for collections that provide random access. The problem is that System.Array and System.Collections.Generic.List
If someone has a good idea so that Array and List
If someone has a good idea so that Array and List can share the same generic implementation as an extension method for IList, I'd be interested in hearing it. Otherwise, I wonder if people would just want to bite the bullet and duplicate the code in some other assembly.
List< T > uses the Array.BinarySearch implementation on it's internal array. There isn't a way to have the IList extension method do the same since we don't have access to any underlying array, just the item getters. Therefore, it would need to be distinct (i.e. duplicated) from the implementation in Array.BinarySearch regardless of where it lives.
Otherwise, I wonder if people would just want to bite the bullet and duplicate the code in some other assembly.
I don't think it should go in mscorlib, so the next logical place would be in the System.Collections partial facade in CoreFX.
Probably would want a matching Sort function to get it in right state?
Imo this is an overdue addition. There are a ton of people who want to perform sorting on an IList but their options are limited to:
I did the implementation of BinarySearch for IList<T>
which basically mimics binary search in Array.cs
. Points to note here are
TrySZBinarySearch
is not available so we are missing the faster C++ search. Any thoughts on this?IList
. IList<T>
does not implement IList
and I was not able to get a good way to convert between them. The only option I could get is to create a wrapper class to convert between them. For now I used Func as in below snippet. Will the Func work or should we go with some kind of wrapper? I hope using Func does not have any performance impact.IListExtensions.cs
in System.Collections
in corefx. private static int BinarySearch(int index, int count, Func<int, int> compare)
{
...
while (low <= hi)
{
int i = GetMedian(low, hi);
int c;
try
{
c = compare(i);
... //Full Binary Search Logic
}
public static int BinarySearch<T>(this IList<T> list, int index, int count, T item, IComparer<T> comparer)
{
... //Validation
return BinarySearch(index, count, i => comparer.Compare(list[i], item));
}
public static int BinarySearch(this IList list, int index, int count, object item, IComparer comparer)
{
... //Validation
return BinarySearch(index, count, i => comparer.Compare(list[i], item));
}
To maintain consistency I was planning to mimic/reuse same test cases which are used to test List BinarySearch. I see there are two versions List_List_BinarySearchOv1
and List_List_BinarySearchOv2
. What are the differences between them?
- TrySZBinarySearch is not available so we are missing the faster C++ search. Any thoughts on this?
This is unavoidable. TrySZBinarySearch requires an array to access but from the IList interface all we have is getters - we do not have any underlying array to pass a reference to.
- I assume we would also want same capability on IList. IList
does not implement IList and I was not able to get a good way to convert between them. The only option I could get is to create a wrapper class to convert between them.
Why do you want to convert between them? Since there's no guarantee that a non-generic IList will contain all elements of type T, the two may not be used interchangeably.
For now I used Func as in below snippet. Will the Func work or should we go with some kind of wrapper? I hope using Func does not have any performance impact.
I'm not too well-versed in how delegates get compiled, but I'd imagine there is some overhead since they're subject to change during execution and thus cannot be inlined. BinarySearch only takes like 10 lines of code so imo a little bit of duplication is preferable to adding unnecessary overhead/complication via the use of delegates to wrap the comparers.
Also where would this go? I was planning to keep this in IListExtensions.cs in System.Collections in corefx.
Seems reasonable.
To maintain consistency I was planning to mimic/reuse same test cases which are used to test List BinarySearch. I see there are two versions List_List_BinarySearchOv1 and List_List_BinarySearchOv2. What are the differences between them?
Those tests are ported to xunit from an old framework that was last used years ago. I do not know the difference, nor would I suggest you spend time figuring it out and replicating them as they will be removed in dotnet/runtime#14117.
I did the implementation for binary search however I am unable to write test cases. The extension method just doesn't show up in System.Collections.Tests project. Digging further I found that the definition of IList<T>
in System.Collections is at mscorlib
#region Assembly mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e
// D:\net\corefx-master\packages\Microsoft.TargetingPack.Private.CoreCLR\1.0.0-rc2-23520\ref\dnxcore50\mscorlib.dll
#endregion
and in System.Collections.Tests it is at System.Runtime
#region Assembly System.Runtime, Version=4.0.20.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
// D:\net\corefx-master\packages\System.Runtime\4.0.20\ref\dotnet\System.Runtime.dll
#endregion
Since both the definitions are not same, the implemented extension methods are not available in test projects. What is the workaround?
I have intentionally not added sort logic. It seems there are multiple open issues regarding sort such as dotnet/runtime#15803 and dotnet/runtime#14800 . Isn't it better we resolve them first before duplicating the code again? I really like idea of abstracting the sort functionality out and make the underlying algorithm interchangeable.
Ditto.
I'm partial to what I've been using:
public enum BinarySearchType
{
/// <summary>
/// The standard BCL binary search. Returns the index of the first equal item found,
/// or if there are no equal items, the two's complement of index where it would be inserted.
/// </summary>
Fast = 0,
/// <summary>
/// Return the index of the first equal item, or if there are no equal items, the index where it would be inserted.
/// </summary>
PrependIndex,
/// <summary>
/// Return the index following the last equal item, or if there are no equal items, the index where it would be inserted.
/// </summary>
AppendIndex
}
public static class BinarySearchExtensions
{
private interface IInlineComparer<T>
{
int Compare(T value);
}
private static int BinarySearchInline<T, TCalculator>(this IReadOnlyList<T> sortedList, TCalculator calculator, BinarySearchType searchType)
where TCalculator : IInlineComparer<T>
{
if (calculator == null) throw new ArgumentNullException(nameof(calculator));
var high = sortedList.Count - 1;
var low = 0;
switch (searchType)
{
case BinarySearchType.Fast:
while (low <= high)
{
var mid = (high + low) >> 1;
var comparison = calculator.Compare(sortedList[mid]);
if (comparison == 0) return mid;
if (comparison < 0)
low = mid + 1;
else
high = mid - 1;
}
return ~low;
case BinarySearchType.AppendIndex:
while (low <= high)
{
var mid = (high + low) >> 1;
var comparison = calculator.Compare(sortedList[mid]);
if (comparison <= 0)
low = mid + 1;
else
high = mid - 1;
}
return low;
case BinarySearchType.PrependIndex:
while (low <= high)
{
var mid = (high + low) >> 1;
var comparison = calculator.Compare(sortedList[mid]);
if (comparison < 0)
low = mid + 1;
else
high = mid - 1;
}
return low;
default:
throw new InvalidEnumValueException(searchType);
}
}
And then I've got:
public static int BinarySearch<T, TComparable>(this IReadOnlyList<T> sortedList, TComparable comparableValue, BinarySearchType searchType = BinarySearchType.Fast)
where T : IComparable<TComparable>
{ /* ... */ }
public static int BinarySearch<T, TOrder>(this IReadOnlyList<T> sortedList, Func<T, TOrder> orderSelector, TOrder value, IComparer<TOrder> comparer = null, BinarySearchType searchType = BinarySearchType.Fast)
{ /* ... */ }
public static int BinarySearch<T>(this IReadOnlyList<T> sortedList, T value, IComparer<T> comparer = null, BinarySearchType searchType = BinarySearchType.Fast)
{ /* ... */ }
}
Entire thing: https://gist.github.com/jnm2/9bbdc74c5f2ec835a2af2a88ca8c100d
Triage: I would assume that efficient binary search on an IList<T>
instance implies that the indexing operation is O(1), however we have classes such as ImmutableSortedSet<T>
that do implement the interface, however indexing in that case is O(log n). In that sense I would personally not be in favour of implementing this as an extension method.
With the new C#8 feature of providing default implementations of methods on interfaces, would it be worth considering adding a "efficient" default implementation on IList<T>
which would allow classes implementing this interface to have the option to override the default implementation if it can be done more efficiently?
@eiriktsarpalis This means that a binary search would be O(log n * log n), which doesn't seem like the worst thing in the world? Even so, I think most people would reasonably assume that list indexing is an O(1) operation, which means that ImmutableSortedSet<T>
is the odd man out here. I don't think we'd want to punish all consumers (the majority of which are using T[]
or List<T>
) just because a small fraction of consumers might be using ImmutableSortedSet<T>
.
To me running binary search over a list implies that we care about performance (otherwise one might as well use a dedicated data structure like SortedList). Having an extension method over IList<T>
seems to somewhat negate that motivation, personally I would just use Array.BinarySearch
. Exposing it as a DIM might be better but it's unlikely we'd ever add DIMs to existing interfaces due to the potential for breaking derived types in user apps.
It seems that there is some demand for this API. The following StackOverflow question has 45 upvotes:
How to perform a binary search on IList<T>
?
API proposal, targeting the IReadOnlyList<T>
interface:
namespace System.Collections.Generic
{
public static class CollectionExtensions
{
public static int BinarySearch<T>(this IReadOnlyList<T> list, T item);
public static int BinarySearch<T>(this IReadOnlyList<T> list, T item, IComparer<T>? comparer);
public static int BinarySearch<T>(this IReadOnlyList<T> list, int index, int count, T item, IComparer<T>? comparer);
}
}
The overloads and the arguments are the same with the List<T>.BinarySearch
method, as well as the Array.BinarySearch<T>
method.
Adding the same extension method to the IList<T>
interface results in CS0121 compiler errors for collections that implement both interfaces.
Actually I am not sure if the IList<T>
or the IReadOnlyList<T>
is the correct interface to target. My incentive for this proposal was to use it for searching in the Keys
of the SortedList<TKey,TValue>
collection, which has a type of IList<T>
. Currently this collection lacks the binary search functionality (#24022), and adding it to the IList<T>
interface would fill this gap.
SortedList<string, string> list = new(StringComparer.OrdinalIgnoreCase);
int index = list.Keys.BinarySearch(key, list.Comparer);
I faced with the same issue - binary search over SortedList.Keys
.
In .NET 9 IList<T>
will implement IReadOnlyList<T>
#31001, so maybe it is time to implement binary search for IReadonlyList<T>
?
We have List.BinarySearch and Array.Search for searching sorted lists and arrays. But what about other collections that implement
IList<T>
orIReadOnlyList<T>
? It would be useful to have similar functions (extension methods) to search them. Binary search is a fundamental algorithm, so it's helpful to have a fast reliable well-documented implementation in the standard library. Otherwise developers waste time reinventing the wheel.A workaround is calling
.ToList()
or.ToArray()
and then searching the new object, but that's not ideal because it copies the collection.