dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.27k stars 4.73k forks source link

JIT ARM32: LSRA does not kill R2R indirection cell reg on calls using it #70552

Open jakobbotsch opened 2 years ago

jakobbotsch commented 2 years ago

On ARM32 VSD calls and R2R calls pass an indirection cell in R4 which is a callee-saved register. This is unlike other platforms where a volatile register is used instead. Because of this there has to be special handling in getKillSetForCall, but currently we only have such handling for VSD calls: https://github.com/dotnet/runtime/blob/9dfbc5bd7c8c9f0cb8a5973b7bdbe5d67070feac/src/coreclr/jit/lsrabuild.cpp#L895-L900

We work around this issue in morph by locking in the register requirement on the node itself and marking the constant as no-CSE: https://github.com/dotnet/runtime/blob/9dfbc5bd7c8c9f0cb8a5973b7bdbe5d67070feac/src/coreclr/jit/morph.cpp#L2113-L2120

When I have tried fixing the problem and removing the morph workaround I have seen som large-ish crossgen ARM regressions, so some more investigation is needed.

category:correctness theme:register-allocator

ghost commented 2 years ago

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch See info in area-owners.md if you want to be subscribed.

Issue Details
On ARM32 VSD calls and R2R calls pass an indirection cell in R4 which is a callee-saved register. This is unlike other platforms where a volatile register is used instead. Because of this there has to be special handling in `getKillSetForCall`, but currently we only have such handling for VSD calls: https://github.com/dotnet/runtime/blob/9dfbc5bd7c8c9f0cb8a5973b7bdbe5d67070feac/src/coreclr/jit/lsrabuild.cpp#L895-L900 We work around this issue in morph by locking in the register requirement on the node itself and marking the constant as no-CSE: https://github.com/dotnet/runtime/blob/9dfbc5bd7c8c9f0cb8a5973b7bdbe5d67070feac/src/coreclr/jit/morph.cpp#L2113-L2120 When I have tried fixing the problem and removing the morph workaround I have seen som large-ish crossgen ARM regressions, so some more investigation is needed.
Author: jakobbotsch
Assignees: -
Labels: `area-CodeGen-coreclr`
Milestone: -
jakobbotsch commented 2 years ago

cc @kunalspathak

jakobbotsch commented 2 years ago

I was originally fixing this as part of #70491, but the diffs on ARM looked like:

libraries.crossgen2.Linux.arm.checked.mch:

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 33123356 (overridden on cmd)
Total bytes of diff: 33176880 (overridden on cmd)
Total bytes of delta: 53524 (0.16 % of base)
    diff is a regression.
    relative diff is a regression.

I reverted the fix in https://github.com/dotnet/runtime/pull/70491/commits/6ef03b4d112b8915a41acf54cdd10a1c7bd3f3b9.