dotnet / linker

389 stars 127 forks source link

[release/7.0] Fix branch removal in compiler generated code #3161

Closed vitek-karas closed 1 year ago

vitek-karas commented 1 year ago

7.0 port of #3088

Customer impact

In the worst case this case cause the trimmer to crash, preventing the app from being trimmed. In other cases this may lead to incorrect warnings or trimming behavior. The bug is order dependent so it's unpredictable (even though it's deterministic), while workaround may be possible by changing order of things, it would be very fragile. See reported issue: https://github.com/dotnet/linker/issues/3157

Risk

While the change affects lot of places in the trimmer, it's mostly mechanical (see below for the technical description of the change). The main idea is to prevent access to Cecil's instructions list directly and always go through a helper which makes sure that all necessary modifications to the instructions of the method were applied first. As such the risk is relatively low, since it doesn't change how most things work, it only changes order - making sure that body modifications happen before the body is inspected for marking.

Testing

This change added targeted tests which reproed the problem and they cover several variations of the problem. The change has been in main (8.0) for more than a month without any known issues. I validated that this change fixes the problem reported in #3157

Technical details about the change

Changes to processing of compiler generated methods lead to a state where we don't call constant prop and branch removal in all cases before we mark instructions of the method. This can lead to overmarking

This change fixes this by making sure that the branch removal executes on the method in all cases before we mark instructions of the method.

The change guarantees that all accesses to Body are after the constant prop/branch removal happened on the method.

This does have one possibly negative impact: the issue described in https://github.com/dotnet/linker/issues/2937 is now consistent and happens always.

Added tests.

Note that there's still a whole in analysis of compiler generated code around state machines, see https://github.com/dotnet/linker/issues/3087

Basically if there's a local function which is going to be removed due to branch removal and if the body of that method contains code which produces a warning due to generic parameter validation, such warning will always be generated even though it's "dead" code and even if it's suppressed via RUC or similar.

In such case the analysis can't figure out to which method the local function belongs (since the call site has been removed).

marek-safar commented 1 year ago

Needs tactics approval