Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.
Issue Details
The memory barrier optimization done in `appendToCurIG` on ARM32/ARM64 looks very expensive throughput-wise:
https://github.com/dotnet/runtime/blob/e0a81708fd4f7f57ad85c3d4cffcc3752e5d4046/src/coreclr/jit/emit.cpp#L1518-L1532
We should see if we can switch it to happen on-demand when we are emitting the memory barrier instead.
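To illustrate the on-demand idea, here is a minimal toy sketch, not the actual JIT code. `ToyEmitter`, `Ins`, `append`, and `emitBarrier` are all hypothetical names made up for this example. It assumes the goal is to skip a barrier that would be redundant, and it only checks the immediately preceding instruction, which may be narrower than what the current code handles.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction kinds (an assumption for this sketch;
// the real emitter works with instrDesc and actual instruction opcodes).
enum class Ins { Barrier, Load, Store, Other };

struct ToyEmitter
{
    std::vector<Ins> code;

    // Hot path: appending an instruction does no barrier bookkeeping at all.
    void append(Ins ins)
    {
        code.push_back(ins);
    }

    // Cold path: the redundancy check runs only when a barrier is requested.
    // Returns true if a new barrier was emitted, false if it was elided.
    bool emitBarrier()
    {
        if (!code.empty() && code.back() == Ins::Barrier)
        {
            return false; // the previous instruction is already a barrier
        }
        append(Ins::Barrier);
        return true;
    }
};
```

The point of the sketch is only the cost shift: the per-instruction path stays free of checks, and the work is paid by the (much rarer) barrier emission.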