dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19.1k stars 4.04k forks source link

Roslyn could generate better range slicing IL when consts are involved #43598

Open stephentoub opened 4 years ago

stephentoub commented 4 years ago

Version Used: https://github.com/dotnet/roslyn/commit/441c154b27eafc9ed04feda1786f0f07ce448f14

To preserve side-effecting behavior, avoid introducing duplicate operations, and maintain order of operations, Roslyn generates temporaries for the lower and upper bounds of a range, e.g.

using System;
public class C {
    public Span<int> M(Span<int> s) => s[Compute1()..Compute2()];
}

is compiled to the equivalent of:

public Span<int> M(Span<int> s)
{
    Span<int> span = s;
    int num = Compute1();
    int length = Compute2() - num;
    return span.Slice(num, length);
}

However, such temps aren't necessary when Roslyn can prove there are no such issues with the bounds. For example, for the following where both bounds are const:

public Span<int> M(Span<int> s) => s[1..3];

it currently generates the equivalent of:

public Span<int> M(Span<int> s)
{
    Span<int> span = s;
    int num = 1;
    int length = 3 - num;
    return span.Slice(num, length);
}

as:

        IL_0000: ldarg.1
        IL_0001: stloc.0
        IL_0002: ldc.i4.1
        IL_0003: stloc.1
        IL_0004: ldc.i4.3
        IL_0005: ldloc.1
        IL_0006: sub
        IL_0007: stloc.2
        IL_0008: ldloca.s 0
        IL_000a: ldloc.1
        IL_000b: ldloc.2
        IL_000c: call instance valuetype [System.Private.CoreLib]System.Span`1<!0> valuetype [System.Private.CoreLib]System.Span`1<int32>::Slice(int32, int32)
        IL_0011: ret

when it could instead just generate the equivalent of:

public Span<int> M(Span<int> s)
{
    return s.Slice(1, 2);
}

aka

        IL_0000: ldarga.s s
        IL_0002: ldc.i4.1
        IL_0003: ldc.i4.2
        IL_0004: call instance valuetype [System.Private.CoreLib]System.Span`1<!0> valuetype [System.Private.CoreLib]System.Span`1<int32>::Slice(int32, int32)
        IL_0009: ret

This pattern of both bounds being a const shows up when parsing known formats with known offsets.

The same holds when just the lower bound is a const, e.g.

public Span<int> M(Span<int> s) => s[0..Compute2()];

the compiler is currently generating:

    public Span<int> M(Span<int> s)
    {
        Span<int> span = s;
        int num = 0;
        int length = Compute2() - num;
        return span.Slice(num, length);
    }

when it could instead generate the equivalent of:

    public Span<int> M(Span<int> s)
    {
        return s.Slice(1, Compute2() - 1);
    }

This pattern shows up when trimming off the end of something.

It's also possible to do better when using a negative offset index. For example:

public Span<int> M(Span<int> s) => s[1..^1];

currently results in the equivalent of:

    public Span<int> M(Span<int> s)
    {
        Span<int> span = s;
        int length = span.Length;
        int num = 1;
        int length2 = length - 1 - num;
        return span.Slice(num, length2);
    }

but could instead produce:

    public Span<int> M(Span<int> s)
    {
        return s.Slice(1, s.Length - 2);
    }

This pattern shows up when trimming things off both ends, such as quotes off of a string.

jcouv commented 4 years ago

Tagging @agocke

jaredpar commented 3 years ago

Bringing over a case from #54659 that is compelling here:

using System;
public class C {
    public Span<char> M0(Span<Char> s, int pos) => s[..pos];
    public Span<char> M1(Span<Char> s, int pos) => s[0..pos];
    public Span<char> M2(Span<Char> s, int pos) => s.Slice(0, pos);
}

These should produce identical IL but do not because we end up spilling to temporaries even though there are no side effects here.

JimBobSquarePants commented 2 years ago

I just implemented range based slicing in dozens of places in the ImageSharp codebase as a result of the default analysis suggestion. It was a shock to discover that the codegen was less optimal. I would imagine many others in the wild would find this surprising also.

alrz commented 1 year ago

This is fixed for the most part already (likely as part of https://github.com/dotnet/roslyn/pull/57535)

But it still captures the receiver even if it's being used once if that at all matters. (sharplab.io)