dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19.06k stars 4.04k forks source link

Collection expressions: use constant indices when populating the span for a List<T> #71183

Open cston opened 11 months ago

cston commented 11 months ago

Originally posted by @Sergio0694 in https://github.com/dotnet/roslyn/issues/70656#issuecomment-1847043026

Just noticed this (sharplab):

public static IList<int> M() => [1, 2, 8, 4];
List<int> list = new List<int>();
CollectionsMarshal.SetCount(list, 4);
Span<int> span = CollectionsMarshal.AsSpan(list);
int num = 0;
span[num] = 1;
num++;
span[num] = 2;
num++;
span[num] = 8;
num++;
span[num] = 4;
num++;
return list;

It seems there's a fair amount of extra IL just to keep track of that index that's incremented every time, which also makes it not a constant (which might be less useful for the JIT). Is there any specific reason why Roslyn is doing this rather than just emitting:

List<int> list = new List<int>();
CollectionsMarshal.SetCount(list, 4);
Span<int> span = CollectionsMarshal.AsSpan(list);
span[0] = 1;
span[1] = 2;
span[2] = 8;
span[3] = 4;
return list;

The latter might even allow the JIT to vectorize some of these assignments, in theory (also the IL seems smaller too?).

cston commented 11 months ago

There is a similar issue when populating an array or span from a collection expression with spread elements of known length. In those cases, a local is used for all elements, even though we could use constants for the elements before the first spread (for instance, for assigning x and y in the following - see sharplab.io).

static void CreateArray(int x, int y, int[] a, int z)
{
    int[] b = [x, y, ..a, z];
}