Measure/document performance of interception

ndrwrbgs commented 6 years ago

For uses such as implementing AOP in C#, performance of the library is a major concern. Ideally, it should be possible to use the library in production code (even on a hot path), but regardless we should tell users what the performance characteristics of the library are (e.g. whether every intercepted call holds a thread for the duration of a Wait()).

JSkimming commented 6 years ago

It's a great idea.

I've not tried it before but maybe we could use BenchmarkDotNet. Scott Hanselman blogged about it a couple of years ago.

Good point on the need to highlight that there's some use of Wait(), which ties up a thread. The same goes for the solution for #28, which waits until proceed is called before returning.
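To illustrate what "ties up a thread" means, here is a minimal generic sketch. It is not the library's code: the class and method names (WaitDemo, Blocking, NonBlocking) are hypothetical and only contrast a call that parks its thread in Wait() with one that hands an incomplete task back to the caller.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitDemo
{
    // Sync-over-async: the calling thread is parked inside Wait()
    // until the task has finished, so it can do no other work meanwhile.
    public static Task Blocking(Func<Task> work)
    {
        Task task = work();
        task.Wait();
        return task;
    }

    // Non-blocking: control returns to the caller at the first await on
    // an incomplete task; the rest of the method runs as a continuation.
    public static async Task NonBlocking(Func<Task> work)
    {
        await work().ConfigureAwait(false);
    }

    public static void Main()
    {
        Task blocked = Blocking(() => Task.Delay(100));
        Console.WriteLine("Blocking returned a completed task: " + blocked.IsCompleted);

        Task pending = NonBlocking(() => Task.Delay(100));
        Console.WriteLine("NonBlocking returned a completed task: " + pending.IsCompleted);

        // Wait here only so the demo process does not exit early.
        pending.Wait();
    }
}
```

The blocking variant always returns a completed task because it does not return until the work is done. That is the cost worth documenting for any code path that relies on Wait().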

ndrwrbgs commented 6 years ago

I’m using BenchmarkDotNet for another project I’m working on, so I’d be happy to look into it here. I just opened this to track the work :)

ndrwrbgs commented 5 years ago

Results as of the latest official release. Worth noting that the Ratio column is not important here; the overhead should be a static cost (we don't do more work in interception when you do more work inside your methods :) )

BenchmarkDotNet=v0.11.5, OS=
Intel Core i7-6820HQ CPU 2.70GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
  [Host]   : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.3362.0
  ShortRun : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.3362.0

Job=ShortRun  IterationCount=3  LaunchCount=1
WarmupCount=3

Method	Mean	Error	StdDev	Ratio	RatioSD	Rank	Gen 0	Gen 1	Gen 2	Allocated
Unwrapped	1.741 ns	0.9615 ns	0.0527 ns	1.00	0.00	1	-	-	-	-
Wrapped	170.679 ns	91.1694 ns	4.9973 ns	98.10	4.30	2	0.0436	-	-	184 B

using System;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;
using Castle.DynamicProxy;

namespace ConsoleApp1
{ 
    [RankColumn]
    [MemoryDiagnoser]
    public class Foo
    {
        public interface IRet
        {
            int RetVal();
        }

        private sealed class StaticRet : IRet
        {
            public int RetVal()
            {
                return 1;
            }
        }

        private sealed class NoopAsyncInterceptor : AsyncInterceptorBase
        {
            protected override Task InterceptAsync(IInvocation invocation, Func<IInvocation, Task> proceed)
            {
                return proceed(invocation);
            }

            protected override Task<TResult> InterceptAsync<TResult>(IInvocation invocation, Func<IInvocation, Task<TResult>> proceed)
            {
                return proceed(invocation);
            }
        }

        private IRet unwrapped;
        private IRet wrapped;

        [GlobalSetup]
        public void Setup()
        {
            unwrapped = new StaticRet();
            wrapped = new ProxyGenerator()
                .CreateInterfaceProxyWithTargetInterface<IRet>(
                    unwrapped,
                    new NoopAsyncInterceptor());
        }

        [Benchmark(Baseline = true)]
        public int Unwrapped()
        {
            return unwrapped.RetVal();
        }

        [Benchmark]
        public int Wrapped()
        {
            return wrapped.RetVal();
        }
    }

    internal static class Program
    {
        private static void Main()
        {
            BenchmarkRunner.Run<Foo>(
                DefaultConfig.Instance.With(Job.ShortRun));
        }
    }
}
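To make the static-cost point concrete, the sketch below does the arithmetic on the two means from the ShortRun table above. The class and method names are hypothetical, and the 1 ms method body is an invented example. It assumes, as the comment does, that the interception overhead stays constant regardless of how much work the method does.

```csharp
using System;

public static class InterceptionOverhead
{
    // Absolute cost added per intercepted call, in nanoseconds.
    public static double PerCallNs(double wrappedMeanNs, double unwrappedMeanNs)
    {
        return wrappedMeanNs - unwrappedMeanNs;
    }

    // Wrapped/unwrapped ratio for a method body that takes bodyNs,
    // ASSUMING the interception overhead is constant.
    public static double ExpectedRatio(double bodyNs, double overheadNs)
    {
        return (bodyNs + overheadNs) / bodyNs;
    }

    public static void Main()
    {
        // Means of Wrapped and Unwrapped from the ShortRun table.
        double overhead = PerCallNs(170.679, 1.741);

        Console.WriteLine("Overhead per call (ns): " + overhead);
        Console.WriteLine("Ratio for the 1.741 ns body: " + ExpectedRatio(1.741, overhead));
        // Hypothetical method whose own body takes 1 ms (1e6 ns).
        Console.WriteLine("Ratio for a 1 ms body: " + ExpectedRatio(1e6, overhead));
    }
}
```

With these numbers the overhead is roughly 169 ns per call. For a hypothetical body that takes 1 ms the ratio would be about 1.0002, which is why the absolute figure, not the Ratio column, is the one to watch.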

ndrwrbgs commented 5 years ago

Removing ShortRun gives lower StdDev/Error, with similar results:

Method	Mean	Error	StdDev	Ratio	RatioSD	Rank	Gen 0	Gen 1	Gen 2	Allocated
Unwrapped	1.938 ns	0.0753 ns	0.1056 ns	1.00	0.00	1	-	-	-	-
Wrapped	178.192 ns	3.5583 ns	5.6439 ns	92.16	6.38	2	0.0436	-	-	184 B

Serg046 commented 4 years ago

Thanks for measuring, but you actually tested the Castle proxy itself. It would be more interesting to see the difference between IInterceptor and IAsyncInterceptor, as the latter adds more work and uses reflection (afaiu) in some places, especially for generic tasks.

Serg046 commented 4 years ago

For example, this is what I get on my machine without any special preparation (many apps are open, etc.):

Method	Mean	Error	StdDev	Ratio	RatioSD	Rank	Gen 0	Gen 1	Gen 2	Allocated
WrappedWithCastleInterceptor	43.00 ns	0.717 ns	0.598 ns	1.00	0.00	1	0.0153	-	-	64 B
WrappedWithAsyncInterceptor	163.37 ns	3.282 ns	6.245 ns	3.87	0.18	2	0.0439	-	-	184 B

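using System;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Running;
using Castle.DynamicProxy;

[RankColumn]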
[MemoryDiagnoser]
public class Foo
{
    public interface IRet
    {
        int RetVal();
    }

    private sealed class StaticRet : IRet
    {
        public int RetVal()
        {
            return 1;
        }
    }

    private sealed class NoopInterceptor : IInterceptor
    {
        public void Intercept(IInvocation invocation)
        {
            invocation.Proceed();
        }
    }

    private sealed class NoopAsyncInterceptor : AsyncInterceptorBase
    {
        protected override Task InterceptAsync(IInvocation invocation, Func<IInvocation, Task> proceed)
        {
            return proceed(invocation);
        }

        protected override Task<TResult> InterceptAsync<TResult>(IInvocation invocation, Func<IInvocation, Task<TResult>> proceed)
        {
            return proceed(invocation);
        }
    }

    private IRet _wrappedWithCastleInterceptor;
    private IRet _wrappedWithAsyncInterceptor;

    [GlobalSetup]
    public void Setup()
    {
        var unwrapped = new StaticRet();

        _wrappedWithCastleInterceptor = new ProxyGenerator()
            .CreateInterfaceProxyWithTargetInterface<IRet>(
                unwrapped,
                new NoopInterceptor());

        _wrappedWithAsyncInterceptor = new ProxyGenerator()
            .CreateInterfaceProxyWithTargetInterface<IRet>(
                unwrapped,
                new NoopAsyncInterceptor());
    }

    [Benchmark(Baseline = true)]
    public int WrappedWithCastleInterceptor()
    {
        return _wrappedWithCastleInterceptor.RetVal();
    }

    [Benchmark]
    public int WrappedWithAsyncInterceptor()
    {
        return _wrappedWithAsyncInterceptor.RetVal();
    }
}

internal static class Program
{
    private static void Main()
    {
        BenchmarkRunner.Run<Foo>(DefaultConfig.Instance);
    }
}

Serg046 commented 4 years ago

Yep, as expected, the gap is larger when the method returns a generic task:

Method	Mean	Error	StdDev	Ratio	RatioSD	Rank	Gen 0	Gen 1	Gen 2	Allocated
CastleInterceptor	48.21 ns	0.977 ns	1.493 ns	1.00	0.00	1	0.0229	-	-	96 B
AsyncInterceptor	277.94 ns	5.121 ns	4.540 ns	5.80	0.29	2	0.0381	-	-	160 B

public interface IRet
{
    Task<int> RetVal();
}

private sealed class StaticRet : IRet
{
    public Task<int> RetVal()
    {
        return Task.FromResult(1);
    }
}

Serg046 commented 4 years ago

I played a bit with possible ways of doing async interception and realized that AsyncInterceptor is pretty good.

private sealed class MyNoopAsyncInterceptor : IInterceptor
{
    public void Intercept(IInvocation invocation)
    {
        invocation.Proceed();

        if (invocation.ReturnValue is Task task) invocation.ReturnValue = NewTask(task);

        async Task NewTask(Task t)
        {
            await t;
        }
    }
}

Method	Mean	Error	StdDev	Rank	Gen 0	Gen 1	Gen 2	Allocated
CastleInterceptor	50.92 ns	1.024 ns	1.257 ns	1	0.0229	-	-	96 B
MyAsyncInterceptor	157.66 ns	3.067 ns	3.767 ns	2	0.0229	-	-	96 B
AsyncInterceptor	192.62 ns	4.175 ns	7.202 ns	3	0.0305	-	-	128 B

It seems the main difference is the additional await operator in the chain. It affects the values so much only because BenchmarkDotNet invokes the benchmark many times, so it is effectively for (var i = 0; i < 100; i++) WrappedWithAsyncInterceptor() rather than a single isolated WrappedWithAsyncInterceptor() call per run. But if you change MyAsyncInterceptor so that it doesn't await and just calls an additional function, the result will be close to CastleInterceptor:

private sealed class MyNoopAsyncInterceptor : IInterceptor
{
    public void Intercept(IInvocation invocation)
    {
        invocation.Proceed();

        if (invocation.ReturnValue is Task task) invocation.ReturnValue = NewTask(task);

        Task NewTask(Task t)
        {
            return t;
        }
    }
}
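The difference between the two variants above can be seen without Castle at all. The following is a small sketch with hypothetical names (AwaitWrapperDemo, WithAwait, PassThrough): an async wrapper goes through a compiler-generated state machine and hands back its own task, while the pass-through version returns the very same task instance it was given. A pending task from a TaskCompletionSource is used so the comparison does not depend on any caching of already-completed tasks.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitWrapperDemo
{
    // Adds one await to the chain: the compiler builds a state machine
    // and the caller receives a task owned by that state machine.
    public static async Task<int> WithAwait(Task<int> inner)
    {
        return await inner;
    }

    // No await: just an extra method call that returns the same task.
    public static Task<int> PassThrough(Task<int> inner)
    {
        return inner;
    }

    public static void Main()
    {
        var source = new TaskCompletionSource<int>();
        Task<int> original = source.Task;

        Task<int> passed = PassThrough(original);
        Task<int> awaited = WithAwait(original);

        Console.WriteLine("PassThrough returns the same task: " + ReferenceEquals(original, passed));
        Console.WriteLine("WithAwait returns the same task: " + ReferenceEquals(original, awaited));

        // Complete the inner task; the wrapper then completes with the same value.
        source.SetResult(42);
        Console.WriteLine("Awaited result: " + awaited.Result);
    }
}
```

This only shows the structural difference (an extra task and state machine in the chain). It does not by itself explain the timings in the table, which also depend on whether the awaited task is already complete.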

Finally, I want to thank you guys for the good lib.

JSkimming / Castle.Core.AsyncInterceptor

Measure/document performance of interception #31