Callsites are not backpatched for large methods in R2R code

dotnet / runtime

Callsites are not backpatched for large methods in R2R code #32845

Open jkotas opened 4 years ago

jkotas commented 4 years ago

Repro:

Save https://gist.github.com/jkotas/2a79b5580127fe2f7ad79580b454eb4c as test.cs
csc /o+ /noconfig /nostdlib /r:System.Private.CoreLib.dll test.cs
crossgen /r:System.Private.CoreLib.dll test.exe
corerun test.exe

Result:

Tiered JITing kicks in for Test<T>() and the method gets JITed successfully.
The callsite in Main does not get updated. It keeps calling the R2R version of the method.

Expected:

The callsite in Main starts calling the tiered JITed version of the method.

jkotas commented 4 years ago

cc @kouvel

kouvel commented 4 years ago

Upon the tier 1 compilation, the JIT switches to MinOpts for the method, probably due to its size. In that case the entry point is not updated, to avoid a potential regression, as prejitted code may be better if different policies are used at prejitting time. The JIT traces currently do not reflect this accurately, and I have posted PR https://github.com/dotnet/runtime/pull/32928 to fix that. Currently the JitDisasm output shows an indication of this:

; Assembly listing for method My:Test()
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; Tier-1 compilation
; MinOpts code
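
For anyone scanning JitDisasm output for this situation, the header lines above can be checked mechanically. This is a minimal sketch in Python, assuming only the header format shown in the listing above. The function name `is_tier1_minopts` is made up for illustration and is not part of any .NET tooling.

```python
def is_tier1_minopts(listing: str) -> bool:
    """Return True if a JitDisasm header reports both Tier-1 and MinOpts.

    Only the leading ';' comment lines are inspected. The matched strings
    are the ones shown in the listing in this thread.
    """
    header = []
    for line in listing.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if not stripped.startswith(";"):
            break  # end of the header comment block
        header.append(stripped.lstrip("; ").strip())
    return "Tier-1 compilation" in header and "MinOpts code" in header


sample = """\
; Assembly listing for method My:Test()
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; Tier-1 compilation
; MinOpts code
"""
print(is_tier1_minopts(sample))  # True
```

A listing that reports both lines is a tier 1 request that was compiled with MinOpts, which is the case where the callsite keeps calling the R2R code.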

AndyAyersMS commented 4 years ago

With your fix, do you still see

OPTIONS: Tier-1/FullOpts compilation, switched to MinOpts

That is the important bit.

I guess this is one more avenue for possible R2R -> Tier1 "regressions" -- historically, the jit is willing to spend more time prejitting a method than it would spend jitting, so the optimization circuit-breakers are set lower for jitting (related: #31942).

Perhaps we should reconsider all this, and make sure the Tier1 method gets optimized if the R2R method is optimized?

kouvel commented 4 years ago

Yes, I see that line printed instead of "Tier 1", and the JitDisasm output no longer prints "Tier-1 compilation", only the part about MinOpts code.

> Perhaps we should reconsider all this, and make sure the Tier1 method gets optimized if the R2R method is optimized?

Currently, if the JIT switches to MinOpts during a tier 1 compilation, the VM won't update the entry point and will continue to use the prejitted (or tier 0 jitted) code to avoid a regression. It may be useful to force-optimize methods that were prejitted with optimizations, but I'm not sure about the tradeoffs between how often those breakers trigger, the perf improvement from optimizing, and the CPU time cost of raising the thresholds at run time.
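
The policy described in this comment can be modeled in a few lines. This is an illustrative Python sketch, not the VM's actual code. The `Method` class, the `publish_tier1_result` function, and the string labels for code versions are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Method:
    name: str
    entry_point: str  # label of the code version that callsites currently reach


def publish_tier1_result(method: Method, switched_to_minopts: bool) -> bool:
    """Model of the policy described above; returns True if the entry point changed."""
    if switched_to_minopts:
        # The tier 1 attempt produced unoptimized code, so keep the existing
        # (prejitted or tier 0) code as the active version.
        return False
    method.entry_point = "tier1"
    return True


big = Method("Test", entry_point="r2r")
publish_tier1_result(big, switched_to_minopts=True)
print(big.entry_point)  # r2r
```

In this model the large method from the repro stays on its R2R entry point, which matches the behavior reported at the top of the issue.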

AndyAyersMS commented 4 years ago

I'm not sure either, but Tier0 compilation of large methods often emits a huge amount of code, so the time spent jitting at Tier0 vs Tier1 may be closer than one might imagine. Worth some investigation, anyways.

At any rate, we're now doing a useless Tier0 compilation. Another option is to abandon the Tier1 jit attempt immediately if the jit decides to fall back to minopts.
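
The "abandon immediately" option could look roughly like the following Python model. Everything here is an assumption made for illustration: the size limit, the `Tier1Abandoned` exception, and the function names are hypothetical, and the real JIT decides on MinOpts using its own heuristics.

```python
class Tier1Abandoned(Exception):
    """Raised when a tier 1 request would only yield MinOpts code."""


# Hypothetical size limit; the real JIT uses its own heuristics and thresholds.
MINOPTS_IL_SIZE_LIMIT = 60_000


def compile_tier1(il_size: int, abandon_on_minopts: bool) -> str:
    """Return the kind of code produced, or raise if the attempt is abandoned."""
    would_switch = il_size > MINOPTS_IL_SIZE_LIMIT
    if would_switch and abandon_on_minopts:
        # Stop before generating any code: the result would not be published anyway.
        raise Tier1Abandoned(f"IL size {il_size} exceeds the optimization limit")
    return "minopts" if would_switch else "fullopts"


def try_promote(il_size: int) -> str:
    """Return which code version stays active after a tier 1 request."""
    try:
        kind = compile_tier1(il_size, abandon_on_minopts=True)
    except Tier1Abandoned:
        return "r2r"  # nothing was compiled; prejitted code stays active
    return "tier1" if kind == "fullopts" else "r2r"


print(try_promote(100_000))  # r2r
print(try_promote(100))      # tier1
```

The active code version is the same as under the current policy. The difference is that the abandoned attempt spends no time generating code that would be discarded.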

kouvel commented 4 years ago

I'll look into abandoning the JIT more closely later. It would be good to avoid the probably long JIT time and the large redundant code.