dotnet / runtime


Devirtualization: in need of some clarifications #13450

Open hypeartist opened 5 years ago

hypeartist commented 5 years ago

I can't figure out which devirtualization cases are implemented as of .NET Core 3.0 RC. Here is a quick and dirty example of some use cases that are common (for me at least):

using System.Runtime.CompilerServices;

namespace ConsoleTestApp
{
    public static class Test2
    {
        //===== JitDasm target method ==============================
        public static int DoTest(int a, int b, byte c)
        {
            var r1 = DoTestImpl1<int, CalculatorStruct>(a, b, c);
            var r2 = DoTestImpl2<int, CalculatorClass>(a, b, c);
            var r3 = DoTestImpl3(new CalculatorStruct(), a, b, c);
            var r4 = DoTestImpl4<int, CalculatorBase>(new CalculatorClass(), a, b, c);
            var r5 = DoTestImpl5(a, b, c);
            var r6 = DoTestImpl6(a, b, c);

            return r1 + r2 + r3 + r4 + r5 + r6;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
        private static T DoTestImpl1<T, C>(T a, T b, byte c) 
            where T : unmanaged 
            where C : unmanaged, ICalculator
        {
            C calc = default;
            return calc.Calc(a, b, c);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
        private static T DoTestImpl2<T, C>(T a, T b, byte c)
            where T : unmanaged
            where C : CalculatorBase, new()
        {
            var calc = new C();
            return calc.Calc(a, b, c);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
        private static T DoTestImpl3<T, C>(C calc, T a, T b, byte c)
            where T : unmanaged
            where C : unmanaged, ICalculator
        {
            return calc.Calc(a, b, c);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
        private static T DoTestImpl4<T, C>(C calc, T a, T b, byte c)
            where T : unmanaged
            where C : CalculatorBase
        {
            return calc.Calc(a, b, c);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
        private static T DoTestImpl5<T>(T a, T b, byte c)
            where T : unmanaged
        {
            var calc = new CalculatorClass();
            return calc.Calc(a, b, c);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
        private static T DoTestImpl6<T>(T a, T b, byte c)
            where T : unmanaged
        {
            var calc = new CalculatorStruct();
            return calc.Calc(a, b, c);
        }
    }

    //===== Stuff used above ================================
    public interface ICalculator
    {
        T Calc<T>(T a, T b, byte c) where T : unmanaged;
    }

    public struct CalculatorStruct : ICalculator
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
        public T Calc<T>(T a, T b, byte c) where T : unmanaged => CalcMethods.Calc(a, b, c);
    }

    public abstract class CalculatorBase
    {
        public abstract T Calc<T>(T a, T b, byte c) where T : unmanaged;
    }

    public sealed class CalculatorClass : CalculatorBase
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
        public override T Calc<T>(T a, T b, byte c) => CalcMethods.Calc(a, b, c);
    }

    public static class CalcMethods
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
        public static T Calc<T>(T a, T b, byte c) where T : unmanaged
        {
            var t = Mul(Add(a, b), Cast<byte, T>(c));
            return Div(Mul(a, Cast<byte, T>(c)), Shr(t, 2));
        }
    }
}

The upshot is that the only method that gets inlined is DoTestImpl6<T>. The other five methods end up with:

*************** in fgTransformIndirectCalls(inlinee)
 -- no candidates to transform
Inlining [000048] failed, so bashing STMT00011 to NOP

INLINER: during 'fgInline' result 'failed this callee' reason 'generic virtual' for 'Test2:DoTest(int,int,ubyte):int' calling 'Test2:DoTestImplX(int,int,ubyte):int'

INLINER: Marking Test2:DoTestImplX(int,int,ubyte):int as NOINLINE because of generic virtual
INLINER: during 'fgInline' result 'failed this callee' reason 'generic virtual'

Is this expected behavior, or am I missing something?

full JitDump output

UPD2: And to make things a little more confusing, here is another test example (this time completely ready to run):

using System;
using System.Runtime.CompilerServices;

    class Test1
    {
        //===== JitDasm target method ==============================
        public static void TestMethodAll(PodPtr<byte> p, byte r, byte g, byte b, byte a, byte c)
        {
            TestMethod2Impl(new BlenderStruct<byte, Rgba8>(), p, r, g, b, a, c);
            TestMethod1Impl(new BlenderClassFromBase<byte, Rgba8>(), p, r, g, b, a, c);
            TestMethod2Impl(new BlenderClassFromInterface<byte, Rgba8>(), p, r, g, b, a, c);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
        private static void TestMethod1Impl<T>(BlenderBase<T> bl, PodPtr<T> p, T r, T g, T b, T a, byte c)
            where T : unmanaged
        {
            bl.Blend(p, r, g, b, a);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
        private static void TestMethod2Impl<T>(IBlender<T> bl, PodPtr<T> p, T r, T g, T b, T a, byte c)
            where T : unmanaged
        {
            bl.Blend(p, r, g, b, a);
        }

        //===== Stuff used above ================================
        public interface IBlender<T>
            where T : unmanaged
        {
            void Blend(PodPtr<T> p, T r, T g, T b, T a);
        }

        public abstract class BlenderBase<T>
            where T : unmanaged
        {
            public abstract void Blend(PodPtr<T> p, T r, T g, T b, T a);
        }

        public readonly struct BlenderStruct<T, C> : IBlender<T>
            where T : unmanaged
            where C : unmanaged, IColor<T>
        {
            [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
            public void Blend(PodPtr<T> p, T r, T g, T b, T a)
            {
                OrderBgra order = default;
                C color = default;

                ref var dr = ref p[order.R];
                ref var dg = ref p[order.G];
                ref var db = ref p[order.B];

                dr = color.Lerp(dr, r, a);
                dg = color.Lerp(dg, g, a);
                db = color.Lerp(db, b, a);
            }
        }

        public sealed class BlenderClassFromBase<T, C> : BlenderBase<T>
            where T : unmanaged
            where C : unmanaged, IColor<T>
        {
            [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
            public override void Blend(PodPtr<T> p, T r, T g, T b, T a)
            {
                OrderBgra order = default;
                C color = default;

                ref var dr = ref p[order.R];
                ref var dg = ref p[order.G];
                ref var db = ref p[order.B];

                dr = color.Lerp(dr, r, a);
                dg = color.Lerp(dg, g, a);
                db = color.Lerp(db, b, a);
            }
        }

        public sealed class BlenderClassFromInterface<T, C> : IBlender<T>
            where T : unmanaged
            where C : unmanaged, IColor<T>
        {
            [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
            public void Blend(PodPtr<T> p, T r, T g, T b, T a)
            {
                OrderBgra order = default;
                C color = default;

                ref var dr = ref p[order.R];
                ref var dg = ref p[order.G];
                ref var db = ref p[order.B];

                dr = color.Lerp(dr, r, a);
                dg = color.Lerp(dg, g, a);
                db = color.Lerp(db, b, a);
            }
        }

        public readonly struct OrderBgra
        {
            public int R
            {
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                get => 2;
            }

            public int G
            {
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                get => 1;
            }

            public int B
            {
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                get => 0;
            }

            public int A
            {
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                get => 3;
            }
        }

        public interface IColor<TValue>
            where TValue : unmanaged
        {
            TValue R { get; set; }
            TValue G { get; set; }
            TValue B { get; set; }
            TValue A { get; set; }
            bool IsTransparent { get; }
            bool IsOpaque { get; }
            TValue Invert(TValue x);
            TValue Multiply(TValue a, TValue b);
            TValue Demultiply(TValue a, TValue b);
            TValue MultCover(TValue a, byte b);
            byte ScaleCover(byte a, TValue b);
            TValue Lerp(TValue p, TValue q, TValue a);
            TValue Prelerp(TValue p, TValue q, TValue a);
        }

        public struct Rgba8 : IColor<byte>
        {
            private const int BaseShift = 8;
            private const int BaseScale = 1 << BaseShift;
            private const int BaseMask = BaseScale - 1;
            private const int BaseMsb = 1 << (BaseShift - 1);

            public byte R
            {
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                get;
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                set;
            }

            public byte G
            {
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                get;
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                set;
            }

            public byte B
            {
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                get;
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                set;
            }

            public byte A
            {
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                get;
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                set;
            }

            public bool IsTransparent
            {
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                get => A == 0;
            }

            public bool IsOpaque
            {
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                get => A == 255;
            }

            [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
            public byte Invert(byte x) => (byte) (BaseMask - x);

            [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
            public byte Multiply(byte a, byte b)
            {
                var t = a * b + BaseMsb;
                return (byte) (((t >> BaseShift) + t) >> BaseShift);
            }

            [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
            public byte Demultiply(byte a, byte b)
            {
                if (a * b == 0)
                {
                    return 0;
                }

                if (a >= b)
                {
                    return BaseMask;
                }

                return (byte) ((a * BaseMask + (b >> 1)) / b);
            }

            [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
            public byte MultCover(byte a, byte b) => Multiply(a, b);

            [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
            public byte ScaleCover(byte a, byte b) => Multiply(b, a);

            [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
            public byte Lerp(byte p, byte q, byte a)
            {
                var t = (q - p) * a + BaseMsb - (p > q ? 1 : 0);
                return (byte) (p + (((t >> BaseShift) + t) >> BaseShift));
            }

            [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
            public byte Prelerp(byte p, byte q, byte a) => (byte) (p + q - Multiply(p, a));
#if DEBUG
        public override string ToString() => $"R: {R}, G: {G}, B: {B}, A: {A}";
#endif
        }

        public unsafe struct PodPtr<T>
            where T : unmanaged
        {
            private T* _pointer;

            public ref T this[int pos]
            {
                [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
                get => ref _pointer[pos];
            }

            [MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
            public static PodPtr<T> FromRef(ref T t)
            {
                var tp = (T*) Unsafe.AsPointer(ref t);
                return *(PodPtr<T>*) &tp;
            }
#if DEBUG
        public override string ToString() => $"0x{((IntPtr)_pointer).ToString("x8")}";
#endif
        }
    }

This time these two calls are inlined successfully:

TestMethod1Impl(new BlenderClassFromBase<byte, Rgba8>(), p, r, g, b, a, c);
TestMethod2Impl(new BlenderClassFromInterface<byte, Rgba8>(), p, r, g, b, a, c);

However, this call:

TestMethod2Impl(new BlenderStruct<byte, Rgba8>(), p, r, g, b, a, c);

fails to get inlined, for this reason:

INLINER: during 'impMarkInlineCandidate' result 'failed this callee' reason 'cannot get method info' for 'Test1:TestMethod2Impl(ref,struct,ubyte,ubyte,ubyte,ubyte,ubyte)' calling 'BlenderStruct`2:Blend(struct,ubyte,ubyte,ubyte,ubyte):this'

INLINER: Marking BlenderStruct`2:Blend(struct,ubyte,ubyte,ubyte,ubyte):this as NOINLINE because of cannot get method info
INLINER: during 'impMarkInlineCandidate' result 'failed this callee' reason 'cannot get method info'
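
One thing that may matter for the struct case: a parameter typed as IBlender<T> means a struct argument is boxed at the call site, whereas a struct-constrained type parameter lets the struct be passed by value. Below is a minimal sketch of the two call shapes. The names AverageBlender, BlendCalls, ViaInterface and ViaConstraint are hypothetical, and the interface is cut down to a single two-argument method to keep the sample self-contained. Whether the second shape is actually inlined on a given runtime would still need to be confirmed in the JitDump.

```csharp
// Simplified stand-in for the IBlender<T> interface in the example above.
public interface IBlender<T> where T : unmanaged
{
    T Blend(T dst, T src);
}

// Hypothetical struct blender: averages the two channel values.
public readonly struct AverageBlender : IBlender<byte>
{
    public byte Blend(byte dst, byte src) => (byte)((dst + src) / 2);
}

public static class BlendCalls
{
    // Interface-typed parameter: passing a struct here boxes it,
    // and Blend is reached through interface dispatch.
    public static T ViaInterface<T>(IBlender<T> bl, T dst, T src)
        where T : unmanaged
        => bl.Blend(dst, src);

    // Struct-constrained type parameter: the struct is passed by value, without boxing,
    // and the exact type of bl is known for each instantiation.
    public static T ViaConstraint<T, TBlender>(TBlender bl, T dst, T src)
        where T : unmanaged
        where TBlender : struct, IBlender<T>
        => bl.Blend(dst, src);
}
```

Both methods return the same value for the same inputs; only the way the blender reaches the callee differs.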

Full JitDump output

category:cq theme:devirtualization skill-level:expert cost:medium impact:small

BruceForstall commented 5 years ago

@AndyAyersMS @dotnet/jit-contrib

AndyAyersMS commented 5 years ago

The jit currently won't inline any method that calls a generic virtual method (GVM). I'm not entirely sure why. This restriction may no longer be necessary, or perhaps can be relaxed for some cases.

In code that is performance sensitive, GVMs are usually avoided as calls to GVMs require special runtime support and have high runtime costs. But if everything is instantiated over value types like it is here, I think we can probably do better.
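
For readers less familiar with the term, here is a small illustration of the difference between an ordinary virtual method and a GVM. The Formatter and PlainFormatter types are made up for this sketch and are not part of the issue's code.

```csharp
public abstract class Formatter
{
    // Ordinary virtual method: the target depends only on the object's runtime type.
    public abstract string Name();

    // Generic virtual method (GVM): the method has its own type parameter, so the
    // target depends on both the object's runtime type and the type argument T.
    public abstract string Describe<T>(T value);
}

public sealed class PlainFormatter : Formatter
{
    public override string Name() => "plain";

    // The override of a GVM is itself a GVM.
    public override string Describe<T>(T value) => typeof(T).Name + ":" + value;
}
```

In the issue's first example, ICalculator.Calc<T> and CalculatorBase.Calc<T> have this second shape, which is why every DoTestImpl method that calls them through a type parameter or base class hits the 'generic virtual' inline failure.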

As a workaround, if you can re-express what you're doing without GVMs you should see devirtualization working fairly well for examples like this.
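
A rough sketch of what such a re-expression could look like for the calculator example: the type parameter moves from the method to the interface, so Calc is no longer a GVM. The names ICalculator<T>, IntCalculator, Workaround and Run are hypothetical, and the arithmetic is a stand-in for the original CalcMethods.Calc body. Whether a given call is actually inlined would still need to be checked in the JitDump for the runtime in question.

```csharp
using System.Runtime.CompilerServices;

// The type parameter lives on the interface, so Calc is an ordinary interface method.
public interface ICalculator<T> where T : unmanaged
{
    T Calc(T a, T b, byte c);
}

// Hypothetical implementation for int; the body is a placeholder computation.
public struct IntCalculator : ICalculator<int>
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Calc(int a, int b, byte c) => (a + b) * c;
}

public static class Workaround
{
    // C is constrained to struct, so each instantiation knows the exact type of calc
    // and the call to Calc has a single possible target.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static T Run<T, C>(T a, T b, byte c)
        where T : unmanaged
        where C : struct, ICalculator<T>
    {
        C calc = default;
        return calc.Calc(a, b, c);
    }
}
```

A call such as Workaround.Run<int, IntCalculator>(2, 3, 4) mirrors DoTestImpl1 from the issue. The trade-off is that one implementation type is needed per element type, or the implementation itself becomes generic over T.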

Let me think about this a bit more (it may be a while as I am working through some health issues). In the meantime if you can find more examples of these patterns in your code or elsewhere, please update this issue.

AndyAyersMS commented 5 years ago

Also if you have a more complete example you can point me at, perhaps we can create a benchmark or something similar.