JSkimming / Castle.Core.AsyncInterceptor

Library to simplify interception of asynchronous methods
Apache License 2.0

Measure/document performance of interception #31

Open ndrwrbgs opened 6 years ago

ndrwrbgs commented 6 years ago

For uses such as implementing AOP in C#, the performance of the library is a major concern. Ideally, it should be possible to use the library in production code (even on a hot path), but regardless, we should tell users what the performance characteristics of the library are (e.g. whether every intercepted call holds a thread while it blocks on Wait()).

JSkimming commented 6 years ago

It's a great idea.

I've not tried it before but maybe we could use BenchmarkDotNet. Scott Hanselman blogged about it a couple of years ago.

Good point on the need to highlight that there's some use of Wait(), which ties up a thread. The same goes for the solution for #28, which waits until proceed is called before returning.
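
To make the thread-holding concern concrete, here is a minimal sketch that uses only the standard .NET task APIs. It is not taken from the library; the names `WaitVersusAwait`, `BlockingCall` and `NonBlockingCall` are made up for illustration. It contrasts a call that blocks on Wait() with one that awaits.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of Castle.Core.AsyncInterceptor.
public static class WaitVersusAwait
{
    // Blocks the calling thread until the task has finished.
    public static int BlockingCall(Func<Task<int>> work)
    {
        Task<int> task = work();
        task.Wait();          // the calling thread is held here until completion
        return task.Result;   // the task is already complete, so this does not block again
    }

    // Returns to the caller at the first incomplete await, so no thread is held while waiting.
    public static async Task<int> NonBlockingCall(Func<Task<int>> work)
    {
        return await work().ConfigureAwait(false);
    }
}
```

If `work` awaits something slow (Task.Delay, I/O), `BlockingCall` keeps its thread occupied for the whole duration, whereas `NonBlockingCall` hands back an incomplete task straight away.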

ndrwrbgs commented 6 years ago

I’m using BenchmarkDotNet for another project I’m working on, so I’d be happy to look into this. I just opened this issue to track the work :)

ndrwrbgs commented 5 years ago

Results as of the latest official release follow. Note that the Ratio column is not meaningful here: the interception overhead should be a fixed per-call cost (we don't do more work in interception when you do more work inside your methods :) )

BenchmarkDotNet=v0.11.5, OS=
Intel Core i7-6820HQ CPU 2.70GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
  [Host]   : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.3362.0
  ShortRun : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.3362.0

Job=ShortRun  IterationCount=3  LaunchCount=1
WarmupCount=3
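
| Method | Mean | Error | StdDev | Ratio | RatioSD | Rank | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| Unwrapped | 1.741 ns | 0.9615 ns | 0.0527 ns | 1.00 | 0.00 | 1 | - | - | - | - |
| Wrapped | 170.679 ns | 91.1694 ns | 4.9973 ns | 98.10 | 4.30 | 2 | 0.0436 | - | - | 184 B |
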
using System;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;
using Castle.DynamicProxy;

namespace ConsoleApp1
{ 
    [RankColumn]
    [MemoryDiagnoser]
    public class Foo
    {
        public interface IRet
        {
            int RetVal();
        }

        private sealed class StaticRet : IRet
        {
            public int RetVal()
            {
                return 1;
            }
        }

        private sealed class NoopAsyncInterceptor : AsyncInterceptorBase
        {
            protected override Task InterceptAsync(IInvocation invocation, Func<IInvocation, Task> proceed)
            {
                return proceed(invocation);
            }

            protected override Task<TResult> InterceptAsync<TResult>(IInvocation invocation, Func<IInvocation, Task<TResult>> proceed)
            {
                return proceed(invocation);
            }
        }

        private IRet unwrapped;
        private IRet wrapped;

        [GlobalSetup]
        public void Setup()
        {
            unwrapped = new StaticRet();
            wrapped = new ProxyGenerator()
                .CreateInterfaceProxyWithTargetInterface<IRet>(
                    unwrapped,
                    new NoopAsyncInterceptor());
        }

        [Benchmark(Baseline = true)]
        public int Unwrapped()
        {
            return unwrapped.RetVal();
        }

        [Benchmark]
        public int Wrapped()
        {
            return wrapped.RetVal();
        }
    }

    internal static class Program
    {
        private static void Main()
        {
            BenchmarkRunner.Run<Foo>(
                DefaultConfig.Instance.With(Job.ShortRun));
        }
    }
}

ndrwrbgs commented 5 years ago

Removing ShortRun gives a lower StdDev/Error, with similar results:

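| Method | Mean | Error | StdDev | Ratio | RatioSD | Rank | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| Unwrapped | 1.938 ns | 0.0753 ns | 0.1056 ns | 1.00 | 0.00 | 1 | - | - | - | - |
| Wrapped | 178.192 ns | 3.5583 ns | 5.6439 ns | 92.16 | 6.38 | 2 | 0.0436 | - | - | 184 B |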
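
The "static cost" reading of these numbers can be worked through with a small sketch. The two means are copied from the table above; the workload sizes mentioned afterwards are hypothetical, and the class name `OverheadEstimate` is made up for illustration. It assumes, as suggested earlier in the thread, that the interception overhead does not grow with the work done inside the method.

```csharp
using System;

// Illustration only: the means come from the benchmark table above,
// everything else is a hypothetical helper.
public static class OverheadEstimate
{
    public const double UnwrappedNs = 1.938;
    public const double WrappedNs = 178.192;

    // Fixed cost added to each intercepted call, in nanoseconds.
    public static double OverheadNs => WrappedNs - UnwrappedNs;

    // Slowdown factor for a method whose own body takes workNs nanoseconds,
    // assuming the overhead stays constant.
    public static double Slowdown(double workNs) => (workNs + OverheadNs) / workNs;
}
```

`OverheadNs` comes out at roughly 176 ns per call. Under the constant-overhead assumption, `Slowdown(100.0)` is close to 2.8, while `Slowdown(1_000_000.0)` (a 1 ms method body) is barely above 1, which is why the Ratio of about 92 against a trivial method says little about real workloads.
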

Serg046 commented 4 years ago

Thanks for measuring, but you actually tested the Castle proxy itself. It would be more interesting to see the difference between IInterceptor and IAsyncInterceptor, since the latter adds more work and (as far as I understand) uses reflection in some places, especially for generic tasks.
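
As a rough illustration of why generic tasks tend to involve reflection: an interceptor only learns a method's return type at run time, so it has to inspect that type to tell synchronous methods, Task and Task<TResult> apart, and to find out what TResult is. The sketch below shows that general technique using standard reflection calls only. It is an assumption about the approach, not the library's source, and the names `ReturnKind` and `ReturnTypeInspector` are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical names for illustration; this is not the library's code.
public enum ReturnKind { Synchronous, AsyncAction, AsyncFunction }

public static class ReturnTypeInspector
{
    // Decides which kind of method an invocation targets, based on its return type.
    public static ReturnKind Classify(Type returnType)
    {
        if (returnType == typeof(Task))
            return ReturnKind.AsyncAction;

        if (returnType.IsGenericType &&
            returnType.GetGenericTypeDefinition() == typeof(Task<>))
            return ReturnKind.AsyncFunction;

        return ReturnKind.Synchronous;
    }

    // For Task<TResult>, TResult is only known at run time and must be read via reflection.
    // Returns null for anything that is not Task<TResult>.
    public static Type ResultTypeOf(Type returnType)
    {
        return Classify(returnType) == ReturnKind.AsyncFunction
            ? returnType.GetGenericArguments()[0]
            : null;
    }
}
```

Under this classification the `int RetVal()` method benchmarked above counts as synchronous, so a benchmark with a `Task<int>` return type is needed to exercise the generic path.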

Serg046 commented 4 years ago

For example, this is what I get on my machine without any special preparation (many apps are open, etc.):

| Method | Mean | Error | StdDev | Ratio | RatioSD | Rank | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| WrappedWithCastleInterceptor | 43.00 ns | 0.717 ns | 0.598 ns | 1.00 | 0.00 | 1 | 0.0153 | - | - | 64 B |
| WrappedWithAsyncInterceptor | 163.37 ns | 3.282 ns | 6.245 ns | 3.87 | 0.18 | 2 | 0.0439 | - | - | 184 B |

[RankColumn]
[MemoryDiagnoser]
public class Foo
{
    public interface IRet
    {
        int RetVal();
    }

    private sealed class StaticRet : IRet
    {
        public int RetVal()
        {
            return 1;
        }
    }

    private sealed class NoopInterceptor : IInterceptor
    {
        public void Intercept(IInvocation invocation)
        {
            invocation.Proceed();
        }
    }

    private sealed class NoopAsyncInterceptor : AsyncInterceptorBase
    {
        protected override Task InterceptAsync(IInvocation invocation, Func<IInvocation, Task> proceed)
        {
            return proceed(invocation);
        }

        protected override Task<TResult> InterceptAsync<TResult>(IInvocation invocation, Func<IInvocation, Task<TResult>> proceed)
        {
            return proceed(invocation);
        }
    }

    private IRet _wrappedWithCastleInterceptor;
    private IRet _wrappedWithAsyncInterceptor;

    [GlobalSetup]
    public void Setup()
    {
        var unwrapped = new StaticRet();

        _wrappedWithCastleInterceptor = new ProxyGenerator()
            .CreateInterfaceProxyWithTargetInterface<IRet>(
                unwrapped,
                new NoopInterceptor());

        _wrappedWithAsyncInterceptor = new ProxyGenerator()
            .CreateInterfaceProxyWithTargetInterface<IRet>(
                unwrapped,
                new NoopAsyncInterceptor());
    }

    [Benchmark(Baseline = true)]
    public int WrappedWithCastleInterceptor()
    {
        return _wrappedWithCastleInterceptor.RetVal();
    }

    [Benchmark]
    public int WrappedWithAsyncInterceptor()
    {
        return _wrappedWithAsyncInterceptor.RetVal();
    }
}

internal static class Program
{
    private static void Main()
    {
        BenchmarkRunner.Run<Foo>(DefaultConfig.Instance);
    }
}

Serg046 commented 4 years ago

Yep, as expected, with a method that returns Task<int>:

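| Method | Mean | Error | StdDev | Ratio | RatioSD | Rank | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| CastleInterceptor | 48.21 ns | 0.977 ns | 1.493 ns | 1.00 | 0.00 | 1 | 0.0229 | - | - | 96 B |
| AsyncInterceptor | 277.94 ns | 5.121 ns | 4.540 ns | 5.80 | 0.29 | 2 | 0.0381 | - | - | 160 B |
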
public interface IRet
{
    Task<int> RetVal();
}

private sealed class StaticRet : IRet
{
    public Task<int> RetVal()
    {
        return Task.FromResult(1);
    }
}

Serg046 commented 4 years ago

I experimented with a few possible ways of doing async interception and found that AsyncInterceptor performs pretty well.

private sealed class MyNoopAsyncInterceptor : IInterceptor
{
    public void Intercept(IInvocation invocation)
    {
        invocation.Proceed();

        if (invocation.ReturnValue is Task task) invocation.ReturnValue = NewTask(task);

        async Task NewTask(Task t)
        {
            await t;
        }
    }
}

| Method | Mean | Error | StdDev | Rank | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| CastleInterceptor | 50.92 ns | 1.024 ns | 1.257 ns | 1 | 0.0229 | - | - | 96 B |
| MyAsyncInterceptor | 157.66 ns | 3.067 ns | 3.767 ns | 2 | 0.0229 | - | - | 96 B |
| AsyncInterceptor | 192.62 ns | 4.175 ns | 7.202 ns | 3 | 0.0305 | - | - | 128 B |

It seems the main difference is the additional await operator in the chain. It affects the numbers this much only because BenchmarkDotNet invokes the benchmark many times, so each measurement is effectively for (var i = 0; i < 100; i++) WrappedWithAsyncInterceptor() rather than a single isolated WrappedWithAsyncInterceptor() call. But if you change MyAsyncInterceptor so that it doesn't await and just calls an additional function, the result is close to CastleInterceptor:

private sealed class MyNoopAsyncInterceptor : IInterceptor
{
    public void Intercept(IInvocation invocation)
    {
        invocation.Proceed();

        if (invocation.ReturnValue is Task task) invocation.ReturnValue = NewTask(task);

        Task NewTask(Task t)
        {
            return t;
        }
    }
}
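
The difference between the two MyNoopAsyncInterceptor variants can be isolated outside of any proxy. The following is a minimal sketch using only standard .NET types, with the hypothetical names `AwaitVersusPassThrough`, `Awaited` and `PassThrough`. It shows that the awaiting wrapper produces a new task through a compiler-generated state machine, while the non-awaiting one returns the original task untouched.

```csharp
using System;
using System.Threading.Tasks;

// Illustration only; the names are hypothetical.
public static class AwaitVersusPassThrough
{
    // async + await: the compiler generates a state machine, and the caller
    // receives a task produced by that state machine.
    public static async Task<int> Awaited(Task<int> inner) => await inner;

    // No await: the original task instance is handed back unchanged.
    public static Task<int> PassThrough(Task<int> inner) => inner;
}
```

With a task that has not completed yet (for example one taken from a TaskCompletionSource<int>), `PassThrough` returns the very same instance, while `Awaited` returns a different Task that completes once the inner one does.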

Finally, I want to thank you guys for the good lib.