dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Collection<T> and ObservableCollection<T> do not support ranges #18087

Open robertmclaws opened 7 years ago

robertmclaws commented 7 years ago

Update 10/04/2018

@ianhays and I discussed this and we agree to add these 6 APIs for now:

    // Adds a range to the end of the collection.
    // Raises CollectionChanged (NotifyCollectionChangedAction.Add)
    public void AddRange(IEnumerable<T> collection) => InsertItemsRange(Count, collection);

    // Inserts a range
    // Raises CollectionChanged (NotifyCollectionChangedAction.Add)
    public void InsertRange(int index, IEnumerable<T> collection) => InsertItemsRange(index, collection);

    // Removes a range.
    // Raises CollectionChanged (NotifyCollectionChangedAction.Remove)
    public void RemoveRange(int index, int count) => RemoveItemsRange(index, count);

    // Allows replacing a range with fewer, equal, or more items.
    // Raises CollectionChanged (NotifyCollectionChangedAction.Replace)
    public void ReplaceRange(int index, int count, IEnumerable<T> collection)
    {
         RemoveItemsRange(index, count);
         InsertItemsRange(index, collection);
    }

    #region virtual methods
    protected virtual void InsertItemsRange(int index, IEnumerable<T> collection);
    protected virtual void RemoveItemsRange(int index, int count);
    #endregion

We chose these because they are the most commonly used across collection types; the Predicate-based ones can be achieved through LINQ and seem like edge cases.

To answer @terrajobst's questions:

Should the methods be virtual? If no, why not? If yes, how does eventing work and how do derived types work?

Yes, we would like to introduce 2 protected virtual methods, sticking with the pattern we already follow for the other Insert/Remove APIs, to give people the ability to add their own custom behavior (such as filtering items on a certain condition).

Should some of these methods be pushed down to Collection?

Yes, and then ObservableCollection could just call the base implementation and then trigger the necessary events.

Let's keep the final speclet at the top for easier search

Speclet (Updated 9/23/2016)

Scope

Modernize Collection<T> and ObservableCollection<T> by allowing them to handle operations against multiple items simultaneously.

Rationale

The ObservableCollection is a critical collection when it comes to XAML-based development, though it can also be useful when building API client libraries as well. Because it implements INotifyPropertyChanged and INotifyCollectionChanged, nearly every XAML app in existence uses some form of this collection to bind a set of objects against UI.

However, this class has some shortcomings. Namely, it cannot currently handle adding or removing multiple objects in a single call. Because of that, it also cannot manipulate the collection in such a way that the PropertyChanged events are raised at the very end of the operation.

Consider the following situation:

This behavior is unnecessary, especially considering that NotifyCollectionChangedEventArgs already has the components necessary to handle firing the event once for multiple items, but that capability is presently not being used at all.
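
To illustrate the point about NotifyCollectionChangedEventArgs, here is a minimal sketch of how a derived class could add several items and raise a single Add notification. The class name RangeObservableCollection is hypothetical and this is not the implementation from the PR; it only relies on the protected members that ObservableCollection<T> already exposes. Whether a particular binding layer accepts multi-item notifications is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.ComponentModel;

// Hypothetical subclass, for illustration only.
public class RangeObservableCollection<T> : ObservableCollection<T>
{
    public void AddRange(IEnumerable<T> collection)
    {
        if (collection == null) throw new ArgumentNullException(nameof(collection));
        CheckReentrancy();

        // Copy first so the source can be enumerated safely, even if it is this collection.
        var added = new List<T>(collection);
        if (added.Count == 0) return;

        int startIndex = Count;
        foreach (T item in added)
        {
            // Items is the underlying list; adding to it raises no events.
            Items.Add(item);
        }

        // Raise the property notifications once, then one Add event carrying all items.
        OnPropertyChanged(new PropertyChangedEventArgs(nameof(Count)));
        OnPropertyChanged(new PropertyChangedEventArgs("Item[]"));
        OnCollectionChanged(new NotifyCollectionChangedEventArgs(
            NotifyCollectionChangedAction.Add, added, startIndex));
    }
}
```

With this sketch, a handler attached to CollectionChanged would see one event whose NewItems holds every added item, instead of one event per item.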

Implementing this properly would allow for better performance in these types of apps, and would negate the need for the plethora of replacements out there (here, here, and here, for example).

Usage

Given the above scenario as an example, usage would look like this pseudocode:

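    var observable = new ObservableCollection<SomeObject>();
    var client = new HttpClient();
    // GetStringAsync returns a Task<string>, so await it to get the JSON payload.
    var result = await client.GetStringAsync("http://someapi.com/someobject");
    // Deserialize into a list so there are multiple items to add.
    var results = JsonConvert.DeserializeObject<List<SomeObject>>(result);
    // Proposed API: adds every item and raises CollectionChanged once.
    observable.AddRange(results);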

Implementation

This is not the complete implementation, because other *Range functionality would need to be implemented as well. You can see the start of this work in PR dotnet/corefx#10751.

    // Adds a range to the end of the collection.
    // Raises CollectionChanged (NotifyCollectionChangedAction.Add)
    public void AddRange(IEnumerable<T> collection);

    // Inserts a range
    // Raises CollectionChanged (NotifyCollectionChangedAction.Add)
    public void InsertRange(int index, IEnumerable<T> collection);

    // Removes a range.
    // Raises CollectionChanged (NotifyCollectionChangedAction.Remove)
    public void RemoveRange(int index, int count);

    // Allows replacing a range with fewer, equal, or more items.
    // Raises CollectionChanged (NotifyCollectionChangedAction.Replace)
    public void ReplaceRange(int index, int count, IEnumerable<T> collection);

    // Removes any item that matches the search criteria.
    // Raises CollectionChanged (NotifyCollectionChangedAction.Remove)
    // RWM: Excluded for now, will see if possible to add back in after implementation and testing.
    // public int RemoveAll(Predicate<T> match);

Obstacles

Doing this properly, and having the methods intuitively named, could potentially have the side effect of breaking existing classes that inherit from ObservableCollection to solve this problem. A good way to test this would be to make the change, compile something like Template10 against this new assembly, and see if it breaks.


So the ObservableCollection is one of the cornerstones of software development, not just in Windows, but on the web. One issue that comes up constantly is that, while the OnCollectionChanged event has a structure and constructors that support signaling the change for multiple items being added, the ObservableCollection does not have a method to support this.

If you look at the web as an example, Knockout has a way to be able to add multiple items to the collection, but not signal the change until the very end. The ObservableCollection needs the same functionality, but does not have it.

If you look at other extension methods to solve this problem, like the one in Template10, they let you add multiple items, but do not solve the signaling problem. That's because the ObservableCollection.InsertItem() method overrides Collection.InsertItem(), and all of the other methods are private. So the only way to fix this properly is in the ObservableCollection itself.

I'm proposing an "AddRange" function that accepts an existing collection as input, optionally clears the collection before adding, and then raises the CollectionChanged event AFTER all the objects have been added. I have already implemented this in PR dotnet/corefx#10751 so you can see what the implementation would look like.

I look forward to your feedback. Thanks!

robertmclaws commented 7 years ago

@joshfree @Priya91 Since I already have a PR that addresses this issue, is there any way this could be moved up to 1.1?

LanceMcCarthy commented 7 years ago

While you're in there adding an AddRange() method, can you throw an OnPropertyChanged() into the Count property's setter? Thanks :)

thomaslevesque commented 7 years ago

A long time ago I had implemented a RangeObservableCollection with AddRange, RemoveRange, InsertRange, ReplaceRange and RemoveAll. But it turned out that the WPF binding system didn't support CollectionChanged notifications with multiple items (I seem to remember it has been fixed since then, but I'm not sure).

joshfree commented 7 years ago

@Priya91 can you help shepherd this through the API review process http://aka.ms/apireview with @robertmclaws ?

/cc @terrajobst

Priya91 commented 7 years ago

@Priya91 can you help shepherd this through the API review process http://aka.ms/apireview with

Sure.

Priya91 commented 7 years ago

@robertmclaws Can you create an API speclet on this issue, outlining the API syntax, like this? We're mainly interested in usage scenarios.

svick commented 7 years ago

@robertmclaws

Doing this properly, and having the methods intuitively named, could potentially have the side effect of breaking existing classes that inherit from ObservableCollection to solve this problem.

In what situation could it be a breaking change? The only issue I can think of is that it would cause a warning that tells you to use new if you meant to hide a base class member, which would be actually an error with warnings as errors enabled. Is this what you meant? Or is there another case I'm missing?

robertmclaws commented 7 years ago

@svick Could possibly be a runtime problem. If you just upgraded the framework w/o recompiling, I'm not sure exactly how the runtime execution would react. We'd need to test it just to make sure.

svick commented 7 years ago

@robertmclaws I think that could only be a problem if you don't recompile, but you do upgrade a library with the custom type inheriting from ObservableCollection<T>, which removed its version of AddRange() in the new version. But that would be the fault of that library.

Otherwise, adding a new member won't affect how old binaries behave.

Priya91 commented 7 years ago

+1 The API sounds good to me. For manipulating multiple items, does it provide value to add InsertRange, RemoveRange, and GetRange along with AddRange for the specified usage scenarios?

cc @terrajobst

robertmclaws commented 7 years ago

@svick You are probably right. I personally would want to test the behavior just to be sure we're not breaking anyone... otherwise this would move to a 2.0 release item.

@Priya91 I'm not sure if a GetRange() would be necessary, but InsertRange() and RemoveRange() would be, along with ReplaceRange(), and possibly a Clear() method if one is not currently available.

So if we're comfortable with the API, what's the next step? :)

Priya91 commented 7 years ago

Clear is already available. We still haven't gotten the shape of apis to add, if RemoveRange and InsertRange are to be added, then we need these apis added to the speclet. And then we'll mark api-ready-for-review, to be discussed in the next api-review meeting either on tuesday or friday.

robertmclaws commented 7 years ago

OK, I made changes to the speclet. Note that the parameters might change for the actual implementation, but those are what makes the most sense at this particular second. Please LMK if I need to do anything else. Thanks!

Priya91 commented 7 years ago

RemoveRange(int index, int count) instead of RemoveRange(ICollection)? How does RemoveRange behave when the ICollection elements are duplicated in the ObservableCollection?

Priya91 commented 7 years ago

count instead of endIndex..

    public void ReplaceRange(IEnumerable<T> collection, int startIndex, int count)

Priya91 commented 7 years ago

    public void AddRange(IEnumerable<T> collection, bool clearFirst = false) { }
    public void InsertRange(IEnumerable<T> collection, int startIndex) { }
    public void RemoveRange(int startIndex, int count) { }
    public void ReplaceRange(IEnumerable<T> collection, int startIndex, int count) { }

thomaslevesque commented 7 years ago

Basically the signatures should be the same as in List<T>.

I don't think the clearFirst parameter in AddRange is useful, and anyway optional parameters should be avoided in public APIs.

A RemoveAll method would be useful as well, for consistency with List<T>:

    public int RemoveAll(Predicate<T> match)

robertmclaws commented 7 years ago

I think RemoveRange(IEnumerable<T> collection) should remain. It would cycle through collection, call IndexOf(item) and then call RemoveAt(index). Duplicates of the same item would also be removed.

@thomaslevesque I have the clearFirst parameter in there specifically because it IS useful, as in I'm using it in production code right now. Consider in UWP apps when you are resetting a UI... if you call Clear() first, it will fire another CollectionChanged event, which is not always desirable.

I'm not against a RemoveAll function.

thomaslevesque commented 7 years ago

Also, the index parameter usually comes first in existing APIs, so InsertRange, RemoveRange and ReplaceRange should be updated accordingly.

And I don't think ReplaceRange needs a count parameter; what should the method do if the count parameter doesn't match the number of items in the replacement collection?

Here's the API as I see it:

    public void AddRange(IEnumerable<T> collection) { }
    public void InsertRange(int index, IEnumerable<T> collection) { }
    public void RemoveRange(int index, int count) { }
    public void ReplaceRange(int index, IEnumerable<T> collection) { }
    public int RemoveAll(Predicate<T> match)

thomaslevesque commented 7 years ago

@thomaslevesque I have the clearFirst parameter in there specifically because it IS useful, as in I'm using it in production code right now. Consider in UWP apps when you are resetting a UI... if you call Clear() first, it will fire another CollectionChanged event, which is not always desirable.

I'm not sold on it, but hey, it's your proposal, not mine :wink:. At the very least, I think it should be a separate overload, rather than an optional parameter.

thomaslevesque commented 7 years ago

@thomaslevesque I have the clearFirst parameter in there specifically because it IS useful, as in I'm using it in production code right now. Consider in UWP apps when you are resetting a UI... if you call Clear() first, it will fire another CollectionChanged event, which is not always desirable.

This makes me think... there are lots of possible combinations of changes you might want to make to the collection without triggering events for each one. So instead of trying to think of each case and introduce a new method for each, perhaps we should lean toward a more generic solution. Something like this:

    using (collection.DeferCollectionChangedNotifications())
    {
        collection.Add(...); // no event raised
        collection.Add(...); // no event raised
        // ...
    } // event raised here for all changes
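
One way the deferral idea above could be sketched is with a subclass that swallows CollectionChanged while a deferral token is alive and raises a single Reset when the last token is disposed. The class name DeferrableObservableCollection and the choice of Reset as the summary event are assumptions made for illustration; the thread does not specify either. PropertyChanged notifications are not deferred in this sketch.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Collections.Specialized;

// Hypothetical subclass, for illustration only.
public class DeferrableObservableCollection<T> : ObservableCollection<T>
{
    private int _deferCount;             // number of deferral tokens still alive
    private bool _changedWhileDeferred;  // whether any change was swallowed

    public IDisposable DeferCollectionChangedNotifications()
    {
        _deferCount++;
        return new Deferral(this);
    }

    protected override void OnCollectionChanged(NotifyCollectionChangedEventArgs e)
    {
        if (_deferCount > 0)
        {
            // Remember that something changed, but do not notify yet.
            _changedWhileDeferred = true;
            return;
        }
        base.OnCollectionChanged(e);
    }

    private void EndDefer()
    {
        if (--_deferCount == 0 && _changedWhileDeferred)
        {
            _changedWhileDeferred = false;
            // Assumption: a single Reset summarizes all deferred changes.
            base.OnCollectionChanged(new NotifyCollectionChangedEventArgs(
                NotifyCollectionChangedAction.Reset));
        }
    }

    private sealed class Deferral : IDisposable
    {
        private DeferrableObservableCollection<T> _owner;

        public Deferral(DeferrableObservableCollection<T> owner) => _owner = owner;

        public void Dispose()
        {
            // Guard against double disposal.
            _owner?.EndDefer();
            _owner = null;
        }
    }
}
```

A Reset forces listeners to re-read the whole collection, so this trades event count for precision; a fuller design might record the individual changes and coalesce them instead.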

robertmclaws commented 7 years ago

@thomaslevesque Overload vs optional parameter makes no practical difference to the end user. It's just splitting hairs. Having overloads just adds unnecessary lines of code.

ReplaceRange with a count would remove all items in the given range, and then insert the new items at that point. The counts not matching would be irrelevant.

If the index comes first in existing APIs, then I'm fine with this:

    public void AddRange(IEnumerable<T> collection, bool clearFirst = false) { }
    public void InsertRange(int index, IEnumerable<T> collection) { }
    public void RemoveRange(int index, int count) { }
    public void ReplaceRange(int index, int count, IEnumerable<T> collection) { }
    public int RemoveAll(Predicate<T> match)

thomaslevesque commented 7 years ago

@thomaslevesque Overload vs optional parameter makes no practical difference to the end user. It's just splitting hairs.

It's not. Optional parameters can cause very real issues when used in public APIs. Read this blog post by @haacked for details.

shmuelie commented 7 years ago

I'm actually liking @thomaslevesque's idea about using a batching class. It's a common pattern, well understood, and makes complex workflows easier.

thomaslevesque commented 7 years ago

ReplaceRange with a count would remove all items in the given range, and then insert the new items at that point. The counts not matching would be irrelevant.

That would be quite inefficient. Removing items would cause all following items to be moved backwards, and inserting new ones would cause them to be moved forward again. The implementation I have in mind would replace each item in-place, without moving anything.

robertmclaws commented 7 years ago

So instead of trying to think of each case and introduce a new method for each, perhaps we should lean toward a more generic solution.

The point of this proposal was to fill in the gaps on the existing implementation, not coming up with a new pattern for people to deal with. I'm not against that proposal, but that's an entirely new piece of functionality that I don't believe should be a part of this discussion.

shmuelie commented 7 years ago

@robertmclaws but since there is no way to currently do bulk operations there isn't a "new pattern"

robertmclaws commented 7 years ago

That would be quite inefficient. Removing items would cause all following items to be moved backwards, and inserting new ones would cause them to be moved forward again. The implementation I have in mind would replace each item in-place, without moving anything.

Why does that matter? Is it a memory allocation issue?

thomaslevesque commented 7 years ago

that's an entirely new piece of functionality

I agree that it should probably be a separate proposal, but it does solve the initial problem you were having.

robertmclaws commented 7 years ago

@robertmclaws but since there is no way to currently do bulk operations there isn't a "new pattern"

There are existing patterns, in about a dozen custom collections across different NuGet packages and what-not that inherit from ObservableCollection to fill in these gaps. The point of this proposal was simply to bring that functionality back into the native class.

thomaslevesque commented 7 years ago

Why does that matter? Is it a memory allocation issue?

That's one of the issues, yes. I'd have to check to be sure, but I think the underlying array is trimmed down when you remove items, and reallocated with a larger size (if necessary) when you add items.

Even if that's not the case, items after the range you're replacing would have to be moved twice, which has no impact on allocations, but makes the CPU do more work than necessary.

robertmclaws commented 7 years ago

It's not. Optional parameters can cause very real issues when used in public APIs. Read this blog post by @haacked for details.

I'm not against it being an overload. But in reality a) this class hasn't changed in years, b) this method would not likely change once implemented, and c) having an overload means you would need an AddRangeInternal that pushes the reentrancy check up to the calling functions, so that you're not checking for it twice or duplicating functionality. I personally would rather have one function that is not likely to change than play XXXRangeInternal gymnastics so the code stays DRY... but that is just me.

Regarding ReplaceRange(start, end, collection), I would expect that could also be an overload that would allow people to replace a range of a different size if they so desire. If that is the behavior the developer wants, then the memory/CPU allocations can't really be avoided.

shmuelie commented 7 years ago

There are existing patterns, in about a dozen custom collections across different NuGet packages and what-not that inherit from ObservableCollection to fill in these gaps. The point of this proposal was simply to bring that functionality back into the native class.

Fair point. Thinking on it a bit I think what we should do is implement the bulk grouping like @thomaslevesque suggested and then implement the methods you suggested on top of that (possibly as extension methods)

This actually has me thinking about a different proposal I might make, hmm

thomaslevesque commented 7 years ago

That's one of the issues, yes. I'd have to check to be sure, but I think the underlying array is trimmed down when you remove items, and reallocated with a larger size (if necessary) when you add items.

I just checked, and I was mistaken. The underlying collection is always a List<T>, which doesn't automatically trim down capacity when items are removed. So, your approach wouldn't cause additional allocations (unless you insert more items than you remove, of course).

robertmclaws commented 7 years ago

@thomaslevesque OK, cool. The more I think about it, you can't really replace a range unless you have a source range to replace. That's either going to come with indexes, or a group of existing items. Otherwise, it's just an InsertRange call.

thomaslevesque commented 7 years ago

Regarding ReplaceRange(start, end, collection), I would expect that could also be an overload that would allow people to replace a range of a different size if they so desire.

Yes, our approaches to ReplaceRange are not functionally equivalent.

If that is the behavior the developer wants, then the memory/CPU allocations can't really be avoided.

Indeed. Not sure which approach should be preferred (maybe both could be done, but it might make the API more confusing).

robertmclaws commented 7 years ago

Ok, I think I came up with a way to meet everyone's requests. Please give me a few minutes to amend my speclet.

UPDATE: I was thinking about a boolean flag that let you buffer events, but that would be a pretty significant change to existing functionality, and I don't know if that is a good idea in this proposal. I think we should get these core functions added first, and THEN see if it makes sense to create a way to let the collection buffer events until you ask it specifically to flush them.

robertmclaws commented 7 years ago

Alright, I've updated the speclet with the other functions, implementing clearFirst and ReplaceRange options as overloads. As I just mentioned, I think buffering events would need to be a different proposal, because it would affect how every existing method fires events, which is not a bad idea, but is outside the scope of this proposal.

thomaslevesque commented 7 years ago

    public void ReplaceRange(IEnumerable<T> source, IEnumerable<T> replacement) { } //Might not be plausible, but should be attempted.

I'm not sure about this one. It would have to do a linear search for each item in source, which would result in O(m*n) complexity.

The more I think about it, the less I like the idea of ReplaceRange. We already realized that we had very different ideas of how it should work, and anyway, I don't think replacing items by batch like this is a common scenario (at least I can't think of a single time where I needed that).

robertmclaws commented 7 years ago

OK, I removed that signature from the speclet. Everyone happy with what's left?

thomaslevesque commented 7 years ago

OK, I removed that signature from the speclet. Everyone happy with what's left?

LGTM :+1:

Priya91 commented 7 years ago

@robertmclaws What's the purpose of ReplaceRange(int, int, ICollection), we don't have a ReplaceRange on any of the other collection types. Is there a use case for this? If not, then we can defer on this.

robertmclaws commented 7 years ago

I only added it because @thomaslevesque said he had one in his implementation when he first commented, so I wanted to be inclusive. I can't think of an active use case, so we can leave it out. I updated the speclet.

terrajobst commented 7 years ago

We're fine with this API shape:

    // Inserts a range at the end.
    public void AddRange(IEnumerable<T> collection);

    // Inserts a range
    public void InsertRange(int index, IEnumerable<T> collection);

    // Removes a range.
    public void RemoveRange(int index, int count);

    // Allows replacing a range with fewer, equal, or more items.
    public void ReplaceRange(int index, int count, IEnumerable<T> collection);

    // Raises event with Reset action
    public int RemoveAll(Predicate<T> match);

The add with the boolean seems quite weird and like a deviation from the rest of the BCL. If this is needed, you should use collection.ReplaceRange(0, collection.Count, items).
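
To make the ReplaceRange(0, collection.Count, items) suggestion concrete, here is a minimal sketch of a ReplaceRange that removes the old range, inserts the new items at the same index, and raises one Replace event. The class name ReplaceableObservableCollection is hypothetical, and the remove-then-insert strategy is just one of the two approaches debated in this thread, not an agreed implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.ComponentModel;

// Hypothetical subclass, for illustration only.
public class ReplaceableObservableCollection<T> : ObservableCollection<T>
{
    public void ReplaceRange(int index, int count, IEnumerable<T> collection)
    {
        if (collection == null) throw new ArgumentNullException(nameof(collection));
        if (index < 0 || count < 0 || index > Count - count)
            throw new ArgumentOutOfRangeException(nameof(index));
        CheckReentrancy();

        var newItems = new List<T>(collection);
        var oldItems = new List<T>(count);

        // Remove the old range, remembering what was removed for the event args.
        for (int i = 0; i < count; i++)
        {
            oldItems.Add(Items[index]);
            Items.RemoveAt(index);
        }

        // Insert the replacement items where the old range started.
        for (int i = 0; i < newItems.Count; i++)
        {
            Items.Insert(index + i, newItems[i]);
        }

        if (oldItems.Count == 0 && newItems.Count == 0) return;

        if (oldItems.Count != newItems.Count)
            OnPropertyChanged(new PropertyChangedEventArgs(nameof(Count)));
        OnPropertyChanged(new PropertyChangedEventArgs("Item[]"));
        OnCollectionChanged(new NotifyCollectionChangedEventArgs(
            NotifyCollectionChangedAction.Replace, newItems, oldItems, index));
    }
}
```

Calling ReplaceRange(0, Count, items) on such a collection would swap the entire contents with a single notification, which covers the clearFirst scenario without a boolean parameter.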

There are still some open issues the design needs to answer as well:

robertmclaws commented 7 years ago
shmuelie commented 7 years ago

@robertmclaws could you make the updated speclet a new post? Makes following the chain easier and less confusing if you've updated it or not yet

robertmclaws commented 7 years ago

@SamuelEnglard I've had to update it about 7 times now... if I did a new post every time, I think it would make the thread worse. I added a Title header with the update date to make things clearer, and make it stand out more.

shmuelie commented 7 years ago

@robertmclaws that works, thanks!

Priya91 commented 7 years ago

@robertmclaws The updated speclet looks good, 2 suggestions:

  1. Please remove implementation of AddRange, have the API listed like other APIs.
  2. The event fired for RemoveAll should be Reset event, and not Remove event - RemoveAll removes items based on a predicate, meaning, the items removed may not be consecutive. The EventArgs for Remove sets the OldItems and OldStartingIndex for tracking the consecutive items. This doesn't apply to RemoveAll.
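
The Reset suggestion in item 2 could be sketched as follows. The class name PredicateObservableCollection is hypothetical; the sketch only shows one way a RemoveAll could raise a single Reset when the removed items may be scattered across the collection.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.ComponentModel;

// Hypothetical subclass, for illustration only.
public class PredicateObservableCollection<T> : ObservableCollection<T>
{
    public int RemoveAll(Predicate<T> match)
    {
        if (match == null) throw new ArgumentNullException(nameof(match));
        CheckReentrancy();

        int removed = 0;
        // Walk backwards so removals do not shift the indexes still to be visited.
        for (int i = Items.Count - 1; i >= 0; i--)
        {
            if (match(Items[i]))
            {
                Items.RemoveAt(i);
                removed++;
            }
        }

        if (removed > 0)
        {
            OnPropertyChanged(new PropertyChangedEventArgs(nameof(Count)));
            OnPropertyChanged(new PropertyChangedEventArgs("Item[]"));
            // The removed items may not be consecutive, so signal a Reset
            // rather than a Remove with a single starting index.
            OnCollectionChanged(new NotifyCollectionChangedEventArgs(
                NotifyCollectionChangedAction.Reset));
        }
        return removed;
    }
}
```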

robertmclaws commented 7 years ago

1) Will do tomorrow AM. 2) I'm not sure that what the event is doing there is the correct behavior, and I don't believe that Reset is the event developers would expect to be raised. I think "Reset" for most developers would mean the collection is back to the state it was in when it was first instantiated, which would almost always be empty. In this case, the full NotifyCollectionChangedEventArgs constructor can set the OldStartingIndex to -1.

If I saw the Removed event fired, and a starting index set, I personally wouldn't think that meant that the removed items were consecutive. But that's just me. Seems like something that could be solved through documentation.