dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.12k stars 4.7k forks source link

[API Proposal]: BinarySearch on SortedList<TKey, TValue> #99448

Open Rob-Hague opened 7 months ago

Rob-Hague commented 7 months ago

Background and motivation

SortedList<TKey, TValue> exposes int IndexOfKey(TKey key) which performs a binary search on the sorted keys array and returns "The zero-based index of key within the entire SortedList<TKey,TValue>, if found; otherwise, -1."

List<T> exposes int BinarySearch(T item) which when a value is not found, returns "a negative number that is the bitwise complement of the index of the next element that is larger than item or, if there is no larger element, the bitwise complement of Count."

Often I want to find the index in my SortedList representing the highest key <= my TKey, or the smallest key >= my TKey etc. There are similar asks in #24022 and on stackoverflow

This proposal is to expose int BinarySearch (TKey key) which has the same return behaviour as that on List<T>.

API Proposal

namespace System.Collections.Generic;

public class SortedList<TKey, TValue>
{
    /// <returns>
    /// The zero-based index of <paramref name="key"/> in the <see cref="SortedList{TKey, TValue}"/>,
    /// if <paramref name="key"/> is found; otherwise, a negative number that is the bitwise complement
    /// of the index of the next key that is larger than <paramref name="key"/> or, if there is no
    /// larger element, the bitwise complement of <see cref="Count"/>.
    /// </returns>
    public int BinarySearch(TKey key);
}

API Usage

var list = new SortedList<DateTime, string>
{
    { new DateTime(2024, 3, 1), "thefirst" },
    { new DateTime(2024, 3, 4), "thefourth" },
    { new DateTime(2024, 3, 5), "thefifth" },
};

int index = list.BinarySearch(new DateTime(2024, 3, 2));
if (index < 0)
{
    index = ~index;
}
Console.WriteLine(list.Values[index] == "thefourth");

Alternative Designs

Strongly-type SortedList<TKey, TValue>.Keys as returning SortedList<TKey, TValue>.KeyList and make the latter public (note: it is public in src but not in ref), then expose the method on there.

namespace System.Collections.Generic;

public class SortedList<TKey, TValue>
{
-   public IList<TKey> Keys { get; }
+   public KeyList Keys { get; }

-   class KeyList : IList<TKey>
+   public class KeyList : IList<TKey>
    {
+       public int BinarySearch(TKey key);
    }
}

Usage:

int index = list.Keys.BinarySearch(new DateTime(2024, 3, 2));

Risks

No response

eiriktsarpalis commented 7 months ago

It seems unlikely that we would expose a second binary search method just to make this behaviour possible. We might consider exposing it as an overload:

public int IndexOfKey(TKey key, bool encodeNearestMatchAsBitwiseComplement);

Or we could just update the existing IndexOfKey documentation & implementation to just return what the underlying Array.BinarySearch gives back. Changing the contract of the return value to be negative instead of specifically -1 sounds like an acceptable breaking change to me. @bartonjs @stephentoub thoughts?

Clockwork-Muse commented 7 months ago

.... I guarantee that people have:

public const int NOT_FOUND = -1;

somewhere....