Compiler could generate better IL for collection expressions

Neme12 commented 5 months ago

Version Used: current main on sharplab (19 Jan 2024, it doesn't say the version or commit 😔)

Steps to Reproduce:

For this code:

List<int> M()
{
    return [1, 2, 3, 4];
}

the compiler currently emits the equivalent of this:

List<int> M()
{
    List<int> list = new List<int>();
    CollectionsMarshal.SetCount(list, 4);
    Span<int> span = CollectionsMarshal.AsSpan(list);
    int num = 0;
    span[num] = 1;
    num++;
    span[num] = 2;
    num++;
    span[num] = 3;
    num++;
    span[num] = 4;
    num++;
    return list;
}

(sharplab)

I think it would be better if it generated the equivalent of this instead:

List<int> M()
{
    List<int> list = new List<int>();
    CollectionsMarshal.SetCount(list, 4);
    Span<int> span = CollectionsMarshal.AsSpan(list);
    ref int num = ref MemoryMarshal.GetReference(span);
    num = 1;
    num = ref Unsafe.Add(ref num, 1);        
    num = 2;
    num = ref Unsafe.Add(ref num, 1);        
    num = 3;
    num = ref Unsafe.Add(ref num, 1);
    num = 4;
    num = ref Unsafe.Add(ref num, 1);
    return list;
}

or, as an alternative:

List<int> M()
{
    List<int> list = new List<int>();
    CollectionsMarshal.SetCount(list, 4);
    Span<int> span = CollectionsMarshal.AsSpan(list);
    ref int num = ref MemoryMarshal.GetReference(span);
    Unsafe.Add(ref num, 0) = 1;
    Unsafe.Add(ref num, 1) = 2;
    Unsafe.Add(ref num, 2) = 3;
    Unsafe.Add(ref num, 3) = 4;
    return list;
}

Why? 1) It is more efficient, because the JIT currently doesn't elide range checks in the current codegen. You can see this in the sharplab example.by switching to assembler. This could of course be fixed (and probably should be), but I would argue that the compiler should emit the simpler code anyway, to not provide any chance whatsoever for the JIT to fail to do that. 2) It results in a lot less IL. This is important in today's age when .NET runs in environments like the browser. You can see this on sharplab - 46 lines of IL vs 29 and 27, respectively (and the resulting x64 asm is 55, 47 and 45 lines).

I understand that the compiler doesn't usually do optimizations and leaves that to the JIT, but: 1) In this case, the compiler doesn't need to do any complicated (or even simple) analysis to generate the desired code. It would simply switch from one codegen strategy to another, both of which have equal complexity. 2) One could argue that even the current codegen strategy is an optimization - to generate better code than repeatedly calling list.Add. And the compiler uses the Unsafe.Add approach for collection literals for spans today already.

It would also be nice to get rid of the last, unused num++; or num = ref Unsafe.Add(ref num, 1);. The JIT elides this of course, but as I mentioned, generating less IL is better, and in this case that last increment is simply redundant.

RikkiGibson commented 5 months ago

it doesn't say the version or commit

Clicking the little ^ next to the date will reveals it as d8d7886808a68c720a246b09028f870bc62c30fe

RikkiGibson commented 5 months ago

It is more efficient, because the JIT currently doesn't elide range checks in the current codegen

So for the suggested codegen, is it part of the idea that by using Unsafe APIs on the underlying reference of the Span, there isn't even a range check to elide?

As far as which "variant" of the codegen to choose, a microbenchmark might help.

RikkiGibson commented 4 months ago

I would review a fix if accompanied by a microbenchmark of the current and proposed codegen.

dotnet / roslyn

Compiler could generate better IL for collection expressions #72568