dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.35k stars 4.74k forks source link

[API Proposal]: Missing constructor for SortedList #95131

Open micampbell opened 11 months ago

micampbell commented 11 months ago

Background and motivation

SortedList still has value in many scenarios (despite overlap with PriorityQueue and Linq sorting methods). SortedList is preferred over other collections: when most of the data is provided upfront (at the constructor), when performing common list indexing and when duplicate keys are unavoidable. However, no constructor allows for receiving keys and values outside of the one receiving a dictionary (which does not allow for duplicate keys). The current solution is to add the key and value through the Add method, which is slower than the single Array.Sort performed in the constructor. Therefore, one or two new constructors are recommended for SortedList that can receive the keys and values without the need for a dictionary as input.

API Proposal

This is an easy modification since the existing constructor (https://github.com/dotnet/runtime/blob/bed122bb15c5136d4f1aa2e8e9c44608dbbebc8f/src/libraries/System.Collections/src/System/Collections/Generic/SortedList.cs#L149) only references the dictionary to get the count and to copy the keys and values arrays. This could call another constructor which either 1) receives the keys and values as separate arguments, 2) receives them as ICollection.

So, it is proposed to replace the above-referenced constructor with the following:


        public SortedList(IDictionary<TKey, TValue> dictionary, IComparer<TKey>? comparer)
             : this(dictionary.Keys, dictionary.Values, comparer)
        { }

        public SortedList(ICollection<KeyValuePair<TKey, TValue>> keyValuePairs, IComparer<TKey>? comparer)
        : this(keyValuePairs?.Count ?? throw new ArgumentNullException(nameof(keyValuePairs)), comparer)

        {
            int count = keyValuePairs.Count;
            if (count != 0)
            {
                this.keys = keyValuePairs.Select(kvp=> kvp.Key).ToArray();
                this.values = keyValuePairs.Select(kvp => kvp.Value).ToArray();
                Debug.Assert(count == this.keys.Length);
                if (count > 1)
                {
                    comparer = Comparer; // obtain default if this is null.
                    Array.Sort<TKey, TValue>(keys, values, comparer);
                    for (int i = 1; i != keys.Length; ++i)
                    {
                        if (comparer.Compare(keys[i - 1], keys[i]) == 0)
                        {
                            throw new ArgumentException(SR.Format(SR.Argument_AddingDuplicate, keys[i]));
                        }
                    }
                }
            }

            _size = count;
        }

        public SortedList(ICollection<TKey> keys1, ICollection<TValue> values1, IComparer<TKey>? comparer)
            : this(keys1?.Count ?? throw new ArgumentNullException(nameof(keys1)), comparer)
        {
            int count = keys1.Count;
            if (count != values1.Count)
            {
                //I'm currently writing this outside of the the repo and do not have access to the SR class
                //throw new ArgumentException(SR.Format(SR.Argument_AddingDuplicate, keys[i]));
            }
            if (count != 0)
            {
                TKey[] keys = this.keys;
                keys1.CopyTo(keys, 0);
                values1.CopyTo(values, 0);
                Debug.Assert(count == this.keys.Length);
                if (count > 1)
                {
                    comparer = Comparer; // obtain default if this is null.
                    Array.Sort<TKey, TValue>(keys, values, comparer);
                    for (int i = 1; i != keys.Length; ++i)
                    {
                        if (comparer.Compare(keys[i - 1], keys[i]) == 0)
                        {
                            throw new ArgumentException(SR.Format(SR.Argument_AddingDuplicate, keys[i]));
                        }
                    }
                }
            }

            _size = count;
        }

API Usage

As an example, I may have a list of items that I will evaluate to a metric upon which I would like to sort.

var items = new List<object> { 'a', 'b', 'c' };
var keys = items.Select(i => Evaluate(i)).ToList();

// now send both collections to a SortedList. As stated earlier, one compelling use-case is when keys are unavoidably duplicate. 
var sortedList = new SortedList(keys, items, new NoEqualSortComparer());

Alternative Designs

n/a

Risks

The risk is that the collection of keys and values may not line up directly. The alternative is to allow the existing constructor to receive ICollection as opposed to IDictionary. This is implemented in the new code above but requires that some changes be made to the constructor body to separate the KeyValuePair. This is done by using Linq's Select() and ToArray() methods.

ghost commented 11 months ago

Tagging subscribers to this area: @dotnet/area-system-collections See info in area-owners.md if you want to be subscribed.

Issue Details
### Background and motivation SortedList still has value in many scenarios (despite overlap with PriorityQueue and Linq sorting methods). SortedList is preferred over other collections: when most of the data is provided upfront (at the constructor), when performing common list indexing and when duplicate keys are unavoidable. However, no constructor allows for receiving keys and values outside of the one receiving a dictionary (which does not allow for duplicate keys). Therefore, one or two new constructors are recommended for SortedList that can recieve the keys and values without the need for a dictionary as input. ### API Proposal This is an easy modification since the existing constructor (https://github.com/dotnet/runtime/blob/bed122bb15c5136d4f1aa2e8e9c44608dbbebc8f/src/libraries/System.Collections/src/System/Collections/Generic/SortedList.cs#L149) only references the dictionary to get the count and to copy the keys and values arrays. This could call another constructor which either 1) receives the keys and values as separate arguments, 2) receives them as ICollection. So, it is proposed to replace the above-referenced constructor with the following: ```csharp public SortedList(IDictionary dictionary, IComparer? comparer) : this(dictionary.Keys, dictionary.Values, comparer) { } public SortedList(ICollection> keyValuePairs, IComparer? comparer) : this(keyValuePairs?.Count ?? throw new ArgumentNullException(nameof(keyValuePairs)), comparer) { int count = keyValuePairs.Count; if (count != 0) { this.keys = keyValuePairs.Select(kvp=> kvp.Key).ToArray(); this.values = keyValuePairs.Select(kvp => kvp.Value).ToArray(); Debug.Assert(count == this.keys.Length); if (count > 1) { comparer = Comparer; // obtain default if this is null. Array.Sort(keys, values, comparer); for (int i = 1; i != keys.Length; ++i) { if (comparer.Compare(keys[i - 1], keys[i]) == 0) { throw new ArgumentException(SR.Format(SR.Argument_AddingDuplicate, keys[i])); } } } } _size = count; } public SortedList(ICollection keys1, ICollection values1, IComparer? comparer) : this(keys1?.Count ?? throw new ArgumentNullException(nameof(keys1)), comparer) { int count = keys1.Count; if (count != values1.Count) { //I'm currently writing this outside of the the repo and do not have access to the SR class //throw new ArgumentException(SR.Format(SR.Argument_AddingDuplicate, keys[i])); } if (count != 0) { TKey[] keys = this.keys; keys1.CopyTo(keys, 0); values1.CopyTo(values, 0); Debug.Assert(count == this.keys.Length); if (count > 1) { comparer = Comparer; // obtain default if this is null. Array.Sort(keys, values, comparer); for (int i = 1; i != keys.Length; ++i) { if (comparer.Compare(keys[i - 1], keys[i]) == 0) { throw new ArgumentException(SR.Format(SR.Argument_AddingDuplicate, keys[i])); } } } } _size = count; } ``` ### API Usage As an example, I may have a list of items that I will evaluate to a metric upon which I would like to sort. ```csharp var items = new List { 'a', 'b', 'c' }; var keys = items.Select(i => Evaluate(i)).ToList(); // now send both collections to a SortedList. As stated earlier, one compelling use-case is when keys are unavoidably duplicate. var sortedList = new SortedList(keys, items, new NoEqualSortComparer()); ``` ### Alternative Designs n/a ### Risks The risk is that the collection of keys and values may not line up directly. The alternative is to allow the existing constructor to receive ICollection as opposed to IDictionary. This is implemented in the new code above but requires that some changes be made to the constructor body to separate the KeyValuePair. This is done by using Linq's Select() and ToArray() methods.
Author: micampbell
Assignees: -
Labels: `api-suggestion`, `area-System.Collections`
Milestone: -
eiriktsarpalis commented 11 months ago

Wouldn't an IEnumerable parameter suffice in this case? We can always use runtime checks to infer the input size if available.

micampbell commented 11 months ago

I thought of that as well, but was trying to minimize change to constructor body. It's definitely something that could be added. But mainly, I'm pleased to hear of your quick reply and (potential) confirmation that this is a valid request.