dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.45k stars 4.76k forks source link

[API Proposal]: Allow collections expression for other collections #108457

Open terrajobst opened 1 month ago

terrajobst commented 1 month ago

Background and motivation

We allowed collection expressions for IImmutableSet<T>, IImmutableQueue<T>, and IImmutableStack<T>.

It seems odd to not support it on these types:

We don't want to special case ISet<T> and IReadOnlySet<T> as the compiler already special cases all the other standard corlib interfaces; we should file a work item to have the compiler handle these as well.

API Proposal

It seems the design of CollectionBuilderAttribute requires a non-generic type; in order to avoid adding a bunch of types we could do the following:

namespace System.Collections.Generic;

public partial class CollectionExtensions
{
    [EditorBrowsable(EditorBrowsableState.Never)]
    public static LinkedList<T> CreateLinkedList<T>(params ReadOnlySpan<T> values);

    [EditorBrowsable(EditorBrowsableState.Never)]
    public static Stack<T> CreateStack<T>(params ReadOnlySpan<T> values);

    [EditorBrowsable(EditorBrowsableState.Never)]
    public static Queue<T> CreateQueue<T>(params ReadOnlySpan<T> values);
}

[CollectionBuilder(typeof(CollectionExtensions), "CreateLinkedList")]
public partial class LinkedList<T>;

[CollectionBuilder(typeof(CollectionExtensions), "CreateStack")]
public partial class Stack<T>;

[CollectionBuilder(typeof(CollectionExtensions), "CreateQueue")]
public partial class Queue<T>;

API Usage

LinkedList<int> values0 = [1, 2, 3];
Stack<int> values1 = [1, 2, 3];
Queue<int> values2 = [1, 2, 3];

Alternative Designs

No response

Risks

No response

stephentoub commented 1 month ago

LinkedList HashSet SortedSet

Those three work just fine today. They don't need special support as they provide IEnumerable<T>/Add implementations and thus already supported collection initializers which then implicitly meant they supported collection expressions.

ISet IReadOnlySet

The compiler today special-cases the core collection interfaces, e.g. IEnumerable<T>, IList<T>, IReadOnlyCollection<T>, IReadOnlyList<T>. I think we should keep the special-casing of these core interfaces together rather than having some on one plan and some on another. Additionally, if we implement these using a params ReadOnlySpan<T> builder, in some situations the compiler will be forced to allocate temporary storage, whereas it could avoid that if it has special knowledge. My preference is for these to be special-cased in the compiler as with the other interfaces.

Queue Set

These are reasonable. We wouldn't need to do anything special for them if they had Add methods... which they do, they're just called Enqueue and Push, respectively, and thus not recognized by the compiler.

eiriktsarpalis commented 1 month ago

LinkedList HashSet SortedSet

Those three work just fine today. They don't need special support as they provide IEnumerable<T>/Add implementations and thus already supported collection initializers which then implicitly meant they supported collection expressions.

Presumably though there is benefit in exposing span factories since they can be presized? Doesn't seem essential in any case.

Stack values5 = [1, 2, 3];

One thing worth calling out about stacks specifically is that this expression would construct a collection that when enumerated returns 3, 2, 1. That's probably obvious assuming the caller is aware that they're constructing a stack, but they might miss that detail in target-typed contexts, e.g. when passing a function parameter.

terrajobst commented 1 month ago

@stephentoub

LinkedList HashSet SortedSet

Those three work just fine today.

Odd; I thought I tested these before. At least LinkedList<T> doesn't seem to work though (tested on .NET 9.0):

LinkedList<int> values0 = [1, 2, 3];

error: 'LinkedList' does not contain a definition for 'Add' and no accessible extension method 'Add' accepting a first argument of type 'LinkedList' could be found (are you missing a using directive or an assembly reference?)

@stephentoub

The compiler today special-cases the core collection interfaces, e.g. IEnumerable<T>, IList<T>, IReadOnlyCollection<T>, IReadOnlyList<T>. I think we should keep the special-casing of these core interfaces together rather than having some on one plan and some on another.

I assume you're suggesting the compiler special cases ISet<T> and IReadOnlySet<T> then?

stephentoub commented 1 month ago

At least LinkedList doesn't seem to work though (tested on .NET 9.0):

You're right; it calls it AddLast.

I assume you're suggesting the compiler special cases ISet and IReadOnlySet then?

Yes.

stephentoub commented 1 month ago

Presumably though there is benefit in exposing span factories since they can be presized?

It can help on that front but hurt on others, e.g. if you write:

IEnumerable<int> source = ...;
HashSet<int> set = [1, 2, .. source];

would having the Create(span) builder force the compiler to allocate an int[] to store all the inputs in order to pass that off to Create?

terrajobst commented 1 month ago

Yeah, I defined as span to let the compiler use a stackallocated temp rather than forcing it to construct a heap allocated array. Not sure that's sufficient. Does it need scopped?

stephentoub commented 1 month ago

Yeah, I defined as span to let the compiler use a stackallocated temp rather than forcing it to construct a heap allocated array. Not sure that's sufficient. Does it need scopped?

It has to be a span right now; that's all the compiler/language recognizes as part of the builder pattern.

It doesn't need to be scoped. Scoped here is implicit as there's nowhere the span could be stored. If the return types here were ref structs, then scoped would be necessary to prevent the span from being stashed into the returned value.

bartonjs commented 1 month ago

Video

namespace System.Collections.Generic;

#if NECESSARY

public partial class CollectionExtensions
{
    [EditorBrowsable(EditorBrowsableState.Never)]
    public static LinkedList<T> CreateLinkedList<T>(params ReadOnlySpan<T> values);

    [EditorBrowsable(EditorBrowsableState.Never)]
    public static Stack<T> CreateStack<T>(params ReadOnlySpan<T> values);

    [EditorBrowsable(EditorBrowsableState.Never)]
    public static Queue<T> CreateQueue<T>(params ReadOnlySpan<T> values);
}

[CollectionBuilder(typeof(CollectionExtensions), "CreateLinkedList")]
public partial class LinkedList<T>;

[CollectionBuilder(typeof(CollectionExtensions), "CreateStack")]
public partial class Stack<T>;

[CollectionBuilder(typeof(CollectionExtensions), "CreateQueue")]
public partial class Queue<T>;

#else

public partial class Queue<T>
{
    public Queue<T>(ReadOnlySpan<T> items);
}

public partial class Stack<T>
{
    public Stack<T>(ReadOnlySpan<T> items);
}

public partial class LinkedList<T>
{
    public LinkedList<T>(ReadOnlySpan<T> items);
}

#endif
Sergio0694 commented 4 days ago

@terrajobst would it be possible to also support System.Collections.ObjectModel.ReadOnlyCollection<T>? It's quite a convenient type to use in MVVM scenarios to bind readonly collections to the UI, and especially on UWP/WinUI 3, it would allow using collection expressions in C# without breaking AOT support for CsWinRT (you can't use collection expressions when the target is a readonly interface type).

Unless I'm missing something, there's no other concrete type one could use in quite the same way here 🥲

The proposal would be like this:

public partial class CollectionExtensions
{
    [EditorBrowsable(EditorBrowsableState.Never)]
    public static ReadOnlyCollection<T> CreateReadOnlyCollection<T>(params ReadOnlySpan<T> values);
}

[CollectionBuilder(typeof(CollectionExtensions), "CreateReadOnlyCollection")]
public partial class ReadOnlyCollection<T>;

I'm happy to create a separate issue if needed, unless we can just easily tie this one to this proposal?

terrajobst commented 3 days ago

A separate proposal would makes because ReadOnlyCollection<T> is a different "breed" of collections. It iself doesn't hold any data, it wraps another collection that is meant to be modified and the consumer can observe them through ReadOnlyCollection<T>. By allowing a collection expression you're essentially making the wrapped collection invisible.

stephentoub commented 3 days ago

ROC is an example of where it'd be nice to have a builder pattern that supports ownership transfer of an array or list. Without that, the compiler is likely to need to allocate storage (unless the size is known at compile time), and then the implementation of the builder will need to allocate the array again. cc: @CyrusNajmabadi

CyrusNajmabadi commented 3 days ago

Sure. We'd be happy if this came over to csharplang as a "future improvements on collection expressions" feature. If you guys are asking, then that carries a lot of weight

Sergio0694 commented 3 days ago

I've created #110161 for now. With the proposed API taking a span, there should be no temporary allocations when initializing a ROC<T> instance from a collection expression for all cases where the compiler can see the exact side in advance (eg. for fixed list of items). Would it be reasonable to have that for starters, and then possibly expand this in the future with some "ownership transfer" feature, if we get that?