dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[API Proposal]: Extend Immutable Collection with Update function #96810

Closed tthiery closed 10 months ago

tthiery commented 10 months ago

Background and motivation

When writing reducers on immutable data, we often update an item in a sub-collection. Take the following example:

var updatedLecture = oldLecture with {
    Students = oldLecture.Students.Select(s => s.Name == "Thomas" ? s with { Attendance = s.Attendance + 1 } : s).ToImmutableArray(),
};

This is (educated guess) not very performant (rebuilding the entire collection via a LINQ enumeration), and the alternative, the immutable-native Replace(oldValue, newValue) call, is much uglier to write, especially for deep nesting with intermediate names.

var updatedLecture = oldLecture with {
    Students = oldLecture.Students.Replace(oldLecture.Students.First(s => s.Name == "Thomas"), oldLecture.Students.First(s => s.Name == "Thomas") is var student ? student with { Attendance = student.Attendance + 1 } : student)
};

This API proposal asks to add an Update and an AddOrUpdate method (there are surely better names) to the immutable collections to simplify the above code. I believe that with expressions, the C# 12 collection expressions, and immutable arrays have huge potential for reducer-heavy systems like middleware or state managers, and for writing the simplest, most readable, and most productive code.

API Proposal

Using ImmutableArray as an example:

namespace System.Collections.Immutable;

public partial struct ImmutableArray<T>
{
    public ImmutableArray<T> Update(Func<T, bool> predicate, Func<T, T> update);
}

Similar additions would apply to the other collection types, and also to create-style operations (as in AddOrUpdate(predicate, item)).
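The proposal does not spell out what AddOrUpdate(predicate, item) would do. A minimal sketch of one possible reading, assuming it replaces the first element matching the predicate and appends the item when nothing matches. The class name ImmutableArrayAddOrUpdateSketch and the Student record are hypothetical and exist only for illustration; nothing here is part of System.Collections.Immutable.

```csharp
using System;
using System.Collections.Immutable;

var students = ImmutableArray.Create(new Student("Thomas", 1));

// No match for "Anna": the item is appended.
var added = students.AddOrUpdate(s => s.Name == "Anna", new Student("Anna", 0));

// "Thomas" matches: the first matching element is replaced.
var updated = added.AddOrUpdate(s => s.Name == "Thomas", new Student("Thomas", 2));

Console.WriteLine(string.Join(", ", updated));

// Hypothetical record used only for this example.
public record Student(string Name, int Attendance);

// Hypothetical helper; not an existing BCL API.
public static class ImmutableArrayAddOrUpdateSketch
{
    // Assumed semantics: replace the first element matching the predicate
    // with item, or append item when no element matches.
    public static ImmutableArray<T> AddOrUpdate<T>(
        this ImmutableArray<T> self, Func<T, bool> predicate, T item)
    {
        for (int i = 0; i < self.Length; i++)
        {
            if (predicate(self[i]))
            {
                return self.SetItem(i, item);
            }
        }

        return self.Add(item);
    }
}
```

Both branches return a new array and leave the input untouched, which matches how the existing SetItem and Add methods behave.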

API Usage

var updatedLecture = oldLecture with {
    Students = oldLecture.Students.Update(s => s.Name == "Thomas", item => item with { Attendance = item.Attendance + 1 }),
};

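This shows the following benefits:

- Less code to write
- Less awkward/strange/boilerplate code to write
- The closure of the update lambda adds a parameter on each level where we do this.
- The with statement keeps its lean syntax even when used with collections (the new collection syntax and spread operator are unfortunately not useful here, since they again would require first identifying a sub-range).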

Alternative Designs

Every codebase that uses with or immutable collections carries its own extension class or helper NuGet package.

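    public static ImmutableArray<T> Update<T>(this ImmutableArray<T> self, Func<T, bool> predicate, Func<T, T> updater)
    {
        // Search by index rather than FirstOrDefault + Replace: for value types,
        // FirstOrDefault returns default(T) (never null) when nothing matches,
        // and Replace would then throw because that value is not in the array.
        for (int i = 0; i < self.Length; i++)
        {
            if (predicate(self[i]))
            {
                return self.SetItem(i, updater(self[i]));
            }
        }

        return self;
    }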

Risks

No response

ghost commented 10 months ago

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details
Author: tthiery
Assignees: -
Labels: `api-suggestion`, `area-System.Collections`
Milestone: -

eiriktsarpalis commented 10 months ago

This is (educated guess) not very performant (rebuilding the entire collection via a LINQ enumeration)

Have you tried benchmarking the two approaches? LINQ performs a number of optimizations under the hood when the source enumerable is an ICollection, so the difference in performance might be surprising.
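A rough way to compare the two approaches from the issue is a Stopwatch loop like the sketch below. The collection size, the iteration count, and the Student record are assumptions made for illustration, and a Stopwatch loop only gives a coarse signal; a dedicated harness such as BenchmarkDotNet would give more trustworthy numbers.

```csharp
using System;
using System.Collections.Immutable;
using System.Diagnostics;
using System.Linq;

// Assumed workload: 10,000 students, updating one in the middle.
var students = Enumerable.Range(0, 10_000)
    .Select(i => new Student($"Student{i}", 0))
    .ToImmutableArray();
const string target = "Student5000";
const int iterations = 1_000;

ImmutableArray<Student> viaSelect = default;
ImmutableArray<Student> viaReplace = default;

// Approach 1: rebuild the whole array through LINQ Select.
var sw = Stopwatch.StartNew();
for (int n = 0; n < iterations; n++)
{
    viaSelect = students
        .Select(s => s.Name == target ? s with { Attendance = s.Attendance + 1 } : s)
        .ToImmutableArray();
}
sw.Stop();
Console.WriteLine($"Select + ToImmutableArray: {sw.ElapsedMilliseconds} ms");

// Approach 2: find the element, then call Replace.
sw.Restart();
for (int n = 0; n < iterations; n++)
{
    var old = students.First(s => s.Name == target);
    viaReplace = students.Replace(old, old with { Attendance = old.Attendance + 1 });
}
sw.Stop();
Console.WriteLine($"First + Replace: {sw.ElapsedMilliseconds} ms");

// Sanity check: both approaches should produce the same updated element.
Console.WriteLine(viaSelect[5000] == viaReplace[5000]);

// Hypothetical record used only for this example.
public record Student(string Name, int Attendance);
```

No timings are quoted here on purpose: the result depends on the runtime version, the element type, and where the matching element sits in the array.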

This shows the following benefits

This seems to be calibrated to the particular use case, i.e. running a select that only impacts the first element matching a predicate. It doesn't seem like a common enough scenario for us to consider adding a built-in method that does it -- a Select method is much more agile in that regard.

As a more general point, we avoid authoring LINQ-style APIs for collection types other than IEnumerable (e.g. array, list, immutable collections, IAsyncEnumerable). However, the NuGet ecosystem is full of such libraries that you could try evaluating.

tthiery commented 10 months ago

I hear you. My feedback on that: The purpose of immutable collections (to my understanding) is to support an immutable collection surface where editing creates well-engineered clones of the initial collection. The with keyword on records has similar characteristics and even made it into the language. Looking at the traditional CRUD cases, we have a lot of support for Create (Add, Insert, AddRange, ...), some for Delete (Remove, RemoveAll) and one for Update (Replace).

Regarding your concern that this is calibrated to one use case: immutable collections are all about controlling how they change. That makes them a natural fit for infrastructure that controls the manipulation of complex state like

I guess a double-digit percentage of immutable collection users face code like the examples above.

If you like, I can update the proposal to work more like RemoveAll, by renaming Update to ReplaceAll and changing the semantics a bit. This would also work beautifully for me.
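The thread does not define the changed semantics of ReplaceAll. A minimal sketch of one plausible reading, assuming it applies the updater to every element that matches the predicate, mirroring how RemoveAll removes every match. The class name ImmutableArrayReplaceAllSketch and the Student record are hypothetical, and ReplaceAll is not an existing method on ImmutableArray<T>.

```csharp
using System;
using System.Collections.Immutable;

var students = ImmutableArray.Create(
    new Student("Thomas", 1), new Student("Anna", 3), new Student("Thomas", 5));

// Every "Thomas" entry gets one more attendance; "Anna" is untouched.
var updated = students.ReplaceAll(
    s => s.Name == "Thomas",
    s => s with { Attendance = s.Attendance + 1 });

Console.WriteLine(string.Join(", ", updated));

// Hypothetical record used only for this example.
public record Student(string Name, int Attendance);

// Hypothetical helper; not an existing BCL API.
public static class ImmutableArrayReplaceAllSketch
{
    // Assumed semantics: apply updater to every element matching predicate.
    public static ImmutableArray<T> ReplaceAll<T>(
        this ImmutableArray<T> self, Func<T, bool> predicate, Func<T, T> updater)
    {
        // Create the builder lazily so that a call with no matches
        // returns the original array without copying it.
        ImmutableArray<T>.Builder? builder = null;
        for (int i = 0; i < self.Length; i++)
        {
            if (predicate(self[i]))
            {
                builder ??= self.ToBuilder();
                builder[i] = updater(self[i]);
            }
        }

        return builder is null ? self : builder.ToImmutable();
    }
}
```

Going through a builder means several matches are applied to one intermediate copy instead of producing a fresh array per replaced element.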

Anecdotally, I had stopped working with immutable collections because of the boilerplate involved, until C# 12 came out two months ago with a collection expression syntax that is compatible with immutable collections. They are now interesting again.

eiriktsarpalis commented 10 months ago

The purpose of immutable collections (to my understanding) is to support an immutable collection surface where editing creates well-engineered clones of the initial collection.

You can create clones of mutable or even read-only collections as well (e.g. via the LINQ APIs). However, the real purpose of immutable collections is to allow persistable updates, in other words cheap updates (constant-time or logarithmic, depending on the collection) that leave the original collection unmutated. ImmutableArray is an outlier in that regard, since it's the only collection in that namespace that doesn't support persistable updates: a full-blown copy needs to be created for every update that you make to it.

You could try using ImmutableList<T>, which is also a list-like data structure and does permit persistable updates, at the expense of performance (it uses a tree representation to support this). What I would recommend, though, is to use ImmutableArray<T>.Builder, which allows mutable updates on an intermediate instance before it is converted back into a fresh immutable instance. Most immutable collections come with builder types that you can use.
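Both suggestions can be sketched against the lecture example from the issue. This is an illustration under assumptions rather than code from the thread: the Student record is hypothetical, and only existing members of System.Collections.Immutable are used (ToBuilder, the builder indexer, ToImmutable, FindIndex, SetItem).

```csharp
using System;
using System.Collections.Immutable;

var students = ImmutableArray.Create(
    new Student("Thomas", 1), new Student("Anna", 3));

// Option 1: ImmutableArray<T>.Builder.
// Edit a mutable intermediate in place, then convert back once.
ImmutableArray<Student>.Builder builder = students.ToBuilder();
for (int i = 0; i < builder.Count; i++)
{
    if (builder[i].Name == "Thomas")
    {
        builder[i] = builder[i] with { Attendance = builder[i].Attendance + 1 };
    }
}
ImmutableArray<Student> viaBuilder = builder.ToImmutable();

// Option 2: ImmutableList<T>.
// SetItem returns a new list that shares most of its tree with the old one.
ImmutableList<Student> list = students.ToImmutableList();
int index = list.FindIndex(s => s.Name == "Thomas");
ImmutableList<Student> viaList = index >= 0
    ? list.SetItem(index, list[index] with { Attendance = list[index].Attendance + 1 })
    : list;

Console.WriteLine(students[0].Attendance);   // the original array is unchanged
Console.WriteLine(viaBuilder[0].Attendance);
Console.WriteLine(viaList[0].Attendance);

// Hypothetical record used only for this example.
public record Student(string Name, int Attendance);
```

The builder pays off when several edits are made before the result is needed, since the intermediate instance is mutated in place rather than producing a fresh array per edit.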