dotnet / BenchmarkDotNet

Powerful .NET library for benchmarking
https://benchmarkdotnet.org
MIT License

Isolate InProcess benchmarks #2417

Open timcassell opened 1 year ago

timcassell commented 1 year ago

One major downside of in-process benchmarks is that, if they produce side-effects, they can affect the results of other benchmarks. See #2197

Currently, in-process benchmarks run on their own dedicated thread to get a semblance of isolation, but it doesn't isolate global state.

I think we can obtain full isolation of the benchmarks using AssemblyLoadContext in Core, and AppDomain in Framework. We target netstandard2.0, which only exposes AppDomain, but creating a new AppDomain isn't supported in Core. So we'll have to use reflection to reach AssemblyLoadContext, but I think it will be worth it. [Edit] Or it looks like we could use the System.Runtime.Loader NuGet package, or add a netcoreapp2.0 target.
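For illustration, here is a minimal sketch of what AssemblyLoadContext-based isolation could look like on modern .NET (Core 3.0 or later, where collectible contexts exist). `IsolatedRunner`, `RunIsolated` and `Counter` are hypothetical names made up for this sketch and are not part of BenchmarkDotNet. It also assumes the benchmark assembly exists on disk; `Assembly.Location` is empty for single-file apps.

```csharp
#nullable enable
using System;
using System.Reflection;
using System.Runtime.Loader;

// Stand-in for benchmark code that keeps state in a static field.
public static class Counter
{
    private static int s_count;
    public static int Increment() => ++s_count;
}

public static class IsolatedRunner
{
    // Hypothetical helper: loads a separate copy of the assembly into its own
    // collectible context, invokes a public static parameterless method on
    // that copy, then requests unloading of the context.
    public static object? RunIsolated(string assemblyPath, string typeName, string methodName)
    {
        var context = new AssemblyLoadContext("benchmark-isolation", isCollectible: true);
        try
        {
            // The path must be absolute. Types loaded here have their own
            // static fields, separate from the copy in the default context.
            Assembly assembly = context.LoadFromAssemblyPath(assemblyPath);
            Type type = assembly.GetType(typeName, throwOnError: true)!;
            MethodInfo method = type.GetMethod(methodName, BindingFlags.Public | BindingFlags.Static)
                ?? throw new MissingMethodException(typeName, methodName);
            return method.Invoke(null, null);
        }
        finally
        {
            // Unloading is cooperative: it completes only after nothing
            // references the context any more and the GC has run.
            context.Unload();
        }
    }
}
```

Calling `IsolatedRunner.RunIsolated(typeof(Counter).Assembly.Location, "Counter", "Increment")` repeatedly should start from a fresh `s_count` each time, whereas calling `Counter.Increment()` directly keeps counting up. Because the sketch does not override `Load`, dependencies fall back to the default context, so their statics are still shared.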

adamsitnik commented 1 year ago

> I think we can obtain full isolation of the benchmarks using AssemblyLoadContext in Core, and AppDomain in Framework

We could, but the complexity it would bring would be HUGE. And it would be super hard to get it working for all possible scenarios.

Process level isolation is the best thing out there, and we should keep promoting the out-proc toolchain.

cc @jkotas who is .NET Runtime Architect and can say more about isolating code with AppDomains

timcassell commented 1 year ago

> Process level isolation is the best thing out there, and we should keep promoting the out-proc toolchain.

I don't disagree, but afaik out-of-process toolchains can't be used on some platforms like Android.

jkotas commented 1 year ago

AssemblyLoadContext or AppDomains give you a semblance of isolation. They do not protect you from process global side-effects. Benchmark authors would still need to be careful about the process global side-effects, just like they need to be careful with in-proc benchmarks today.

timcassell commented 1 year ago

@jkotas If their static variables are reset for each benchmark, how could users leak global state?

jkotas commented 1 year ago

There is a ton of global state throughout the runtime and runtime libraries that is not isolated by AssemblyLoadContext or AppDomains.

One example from many: Benchmark calls GC.AddMemoryPressure without removing it. Benchmarks that will run after it are still going to be under additional GC pressure and their performance characteristics may be significantly altered.
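A minimal sketch of that example, using only `GC.AddMemoryPressure` and `GC.RemoveMemoryPressure`; the class name, method names and the 512 MB figure are made up for illustration:

```csharp
using System;

public static class MemoryPressureExample
{
    // Arbitrary amount chosen for illustration.
    private const long PressureBytes = 512L * 1024 * 1024;

    // Leaky: the pressure stays registered with the GC for the rest of the
    // process, so any benchmark that runs afterwards in the same process
    // inherits it. No load context or AppDomain boundary resets this.
    public static void LeakyBenchmark()
    {
        GC.AddMemoryPressure(PressureBytes);
        // ... measured work that uses unmanaged memory ...
    }

    // Balanced: every AddMemoryPressure is paired with RemoveMemoryPressure.
    public static void BalancedBenchmark()
    {
        GC.AddMemoryPressure(PressureBytes);
        try
        {
            // ... measured work that uses unmanaged memory ...
        }
        finally
        {
            GC.RemoveMemoryPressure(PressureBytes);
        }
    }
}
```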

timcassell commented 1 year ago

Interesting. Unity editor has used AppDomain isolation for many years to isolate user code, which seemed to work out quite well. I think, even with its limitations, it should still be good enough for us here. Are there any significant differences between AppDomain isolation and AssemblyLoadContext isolation?

jkotas commented 1 year ago

Unity editor uses AppDomains for assembly load context isolation to allow reloading the user code.

AppDomains do not provide strong isolation from side-effects with perf impact. They may isolate from some side-effects, but there are many holes.

timcassell commented 1 year ago

Thanks. Despite the many holes, I still think it's worth adding this. Some libraries use static object pooling with no explicit way to clear the pools, and assembly load context isolation should at least help with that.

jkotas commented 1 year ago

The perf tests for libraries like that can clear the pools using private reflection or using the new .NET 8 UnsafeAccessor feature.
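As a rough sketch of the private-reflection route: `PooledBuffer` below is a hypothetical stand-in for a library type with a private static pool, and the field name `s_pool` is an assumption. A real library's field name and type have to be looked up in its source and can change between versions.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Hypothetical library type: pools buffers in a private static field and
// offers no public way to clear the pool.
public sealed class PooledBuffer
{
    private static readonly ConcurrentBag<byte[]> s_pool = new();

    public static byte[] Rent() => s_pool.TryTake(out byte[]? buffer) ? buffer : new byte[1024];

    public static void Return(byte[] buffer) => s_pool.Add(buffer);
}

public static class PoolReset
{
    // Empties the private pool via reflection, e.g. from a cleanup hook that
    // runs between benchmarks. Returns how many pooled buffers were dropped.
    public static int ClearPool()
    {
        FieldInfo field = typeof(PooledBuffer).GetField("s_pool", BindingFlags.NonPublic | BindingFlags.Static)
            ?? throw new MissingFieldException(nameof(PooledBuffer), "s_pool");
        var pool = (ConcurrentBag<byte[]>)field.GetValue(null)!;
        int dropped = pool.Count;
        pool.Clear();
        return dropped;
    }
}
```

On .NET 8 the same field could instead be reached through an `[UnsafeAccessor]` method, which avoids the reflection lookup at run time.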

timcassell commented 1 year ago

That seems unusually cumbersome to someone just trying to evaluate libraries from their public APIs.

timcassell commented 9 months ago

I think DPGO (dynamic PGO) in .NET 8 makes this more relevant. Calling the same method with different arguments may result in different code-gen in benchmark scenarios, but unless the assembly is reset so that code is generated afresh, a benchmark could be running machine code that was not DPGO'd for itself.

adamsitnik commented 9 months ago

In the long term we should be investing in having a source-generator toolchain (https://github.com/dotnet/BenchmarkDotNet/issues/1770), that would simply work everywhere except for F# (it's not using Roslyn).

The idea is quite simple: we generate the boilerplate code in the same project as the benchmarks (so all MSBuild magic works out of the box), and the process just invokes itself to run the benchmarks (but with a different set of arguments). It would remove the need to have the SDK installed. The problem is that it's a huge investment: all the scenarios that are supported by reflection now (detecting all kinds of attributes etc.) would need to be supported by the source generator.
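The "process invoking itself" part could be sketched roughly as follows. This is only an illustration of the idea, not the design from the linked issue: `SelfLauncher` and the `--run-benchmark` switch are hypothetical, and the sketch assumes .NET 6 or later and an app launched through its own executable (when started as `dotnet app.dll`, `Environment.ProcessPath` points at `dotnet` instead).

```csharp
#nullable enable
using System;
using System.Diagnostics;

public static class SelfLauncher
{
    // Hypothetical switch; not an actual BenchmarkDotNet argument.
    public const string RunSwitch = "--run-benchmark";

    // Child side: returns the benchmark name if the process was started in
    // "run one benchmark" mode, otherwise null (meaning: act as the host).
    public static string? TryGetBenchmarkName(string[] args) =>
        args.Length == 2 && args[0] == RunSwitch ? args[1] : null;

    // Host side: describes how to re-launch the current executable so that it
    // runs exactly one benchmark.
    public static ProcessStartInfo CreateChildStartInfo(string benchmarkName)
    {
        string exePath = Environment.ProcessPath
            ?? throw new InvalidOperationException("Cannot determine the current executable path.");
        var startInfo = new ProcessStartInfo(exePath) { UseShellExecute = false };
        startInfo.ArgumentList.Add(RunSwitch);
        startInfo.ArgumentList.Add(benchmarkName);
        return startInfo;
    }

    // Host side: runs one benchmark in a fresh process and returns its exit code.
    public static int RunInChildProcess(string benchmarkName)
    {
        using Process child = Process.Start(CreateChildStartInfo(benchmarkName))
            ?? throw new InvalidOperationException("Failed to start the child process.");
        child.WaitForExit();
        return child.ExitCode;
    }
}
```

In `Main`, the generated code would first call `TryGetBenchmarkName(args)`: a non-null result means "run that benchmark and exit", and null means "act as the host and call `RunInChildProcess` for each benchmark".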

timcassell commented 9 months ago

Sure, investing in a source generator toolchain is good, but that's not really relevant to this issue. Even if the source generator supports in-process, the assembly still should be reset if possible.

adamsitnik commented 9 months ago

> Sure, investing in a source generator toolchain is good, but that's not really relevant to this issue. Even if the source generator supports in-process, the assembly still should be reset if possible.

Source generator toolchain would solve for good the problem in-process toolchain has been trying to solve. That is why I strongly believe that we should not be investing our time in in-process toolchain.

timcassell commented 9 months ago

> Source generator toolchain would solve for good the problem in-process toolchain has been trying to solve. That is why I strongly believe that we should not be investing our time in in-process toolchain.

It solves one problem, but I think some platforms like Android and iOS don't support launching the process again, so we still need in-process. That is why I think the source generator should also support in-process.

adamsitnik commented 9 months ago

> don't support launching the process again

Then we can make the source generator toolchain start Threads instead of processes. Otherwise we have to maintain few more toolchains and sync every code change with IL-emit part.

timcassell commented 9 months ago

> Then we can make the source generator toolchain start Threads instead of processes.

Yeah, we already do that with the existing in-process toolchains. But as I mentioned before, that doesn't provide as much isolation as .NET has available.

> Otherwise we have to maintain few more toolchains and sync every code change with IL-emit part.

Are you suggesting we remove the existing in-process toolchains? So in-process won't work with F#?