dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.55k stars 4.54k forks source link

ILLink should error out when failing to resolve assemblies/types/members #103831

Open sbomer opened 2 weeks ago

sbomer commented 2 weeks ago

Currently ILLink has a relaxed default that makes a best-effort attempt to continue when it fails to resolve assemblies/types/members. This leads to bugs like https://github.com/dotnet/runtime/issues/93797. We should consider making the resolution behavior stricter.

Historically, the relaxed behavior (--skip-unresolved true) was useful in combination with assembly-level trimming:

When using aggressive trimming, I don't see any good reason for the relaxed behavior (--skip-unresolved true).

I propose we make the following changes:

  1. Make the relaxed mode (--skip-unresolved true) strict for everything but type forwarders.
  2. Change the default to be strict (--skip-unresolved false) when using TrimMode=link

I would make the change early in .NET 10 to give us time to respond to breaks. https://github.com/dotnet/runtime/pull/91007 has additional context describing Unity's use of these options.

@vitek-karas @MichalStrehovsky @dotnet/illink

dotnet-policy-service[bot] commented 2 weeks ago

Tagging subscribers to this area: @agocke, @sbomer, @vitek-karas See info in area-owners.md if you want to be subscribed.

sbomer commented 2 weeks ago

Last time we talked about this we noted:

The potential benefit of being stricter is that it could turn some runtime exceptions into build-time warnings/errors if you're trimming. But our philosophy behind trimming is that it should aim to warn about differences in behavior introduced by trimming, so producing these warnings isn't necessarily desirable for us.

While it would be nice to limit errors to the cases where trimming introduces behavior differences, I think it is not worth the complexity of correctly handling dangling references. Unity tried to do this and this was their experience:

Historically, in UnityLinker we hooked into the HandleUnresolvedMethod and HandleUnresolvedType calls and used that to record things that were missing so that UnityLinker could recreate these things. The history and motivation behind this predates me, but the impact was that problems with missing types and methods because runtime problems instead of build time problems. Which is more in line with the behavior you'd get if you had out of sync assemblies in a normal console app and ran it.

This scheme was unsustainable. It many cases you can't correctly recreate missing things. We won't be doing this going forward with CoreCLR bcl.

sbomer commented 2 weeks ago

@mrvoorhe last time we discussed this you mentioned:

I was thinking of special casing the two missing assemblies that the mscorlib shim references. I forget how, but I was ultimately able to get our linker tests green without having to resort to that.

Those two assemblies may not be enough. I may have to do what you suggest and treat missing references in forwarders specially. We'll see what happens.

I'm curious if you have any more insights, and whether you ended up treating forwarders specially.

MichalStrehovsky commented 1 week ago

Native AOT has similar code to be able to deal with missing assembly references. We regularly get bug reports when someone tries to compile something with missing references and it hits a code path we don't handle well. I even saw at least one NuGet package that was explicitly authored to have missing dependencies (the dependency was behind a lightup).

So this is purely a tradeoff between extending the funnel of "compatible-with-trimming" code and engineering improvement from not having to deal with missing references. In the native AOT compiler case, this is in a mostly leafy place (we basically just add try/catch in places that are recoverable), so I've never considered removing it. I don't know in details how pervasive it is in the illinker codebase so I can't give much of an opinion.

mrvoorhe commented 1 week ago

@mrvoorhe last time we discussed this you mentioned:

I was thinking of special casing the two missing assemblies that the mscorlib shim references. I forget how, but I was ultimately able to get our linker tests green without having to resort to that. Those two assemblies may not be enough. I may have to do what you suggest and treat missing references in forwarders specially. We'll see what happens.

I'm curious if you have any more insights, and whether you ended up treating forwarders specially.

@sbomer Missing forwarders in the bcl caused some pain. I didn't think about forgiving forwarder assembly resolution failures across the board like you've mentioned. What we have currently is this ugly list of assembly names not to error on if we fail to resolve them.

public static bool IsAssemblyKnownToBeReferencedByShimsAndNotIncludedInBCL(string assemblyName)
    {
        switch (assemblyName)
        {
            // These are .NET Framework assemblies that are referenced by the mscorlib.dll shim in .NET but don't exist in the BCL
            case "System.Security.Permissions":
            case "System.Threading.AccessControl":
            case "System.CodeDom":
            case "Microsoft.Win32.SystemEvents":
            case "System.Configuration.ConfigurationManager":
            case "System.Diagnostics.PerformanceCounter":
            case "System.Diagnostics.EventLog":
            case "System.IO.Ports":
            case "System.Windows.Extensions":
            case "System.Drawing.Common":

            // Referenced by System.Data
            case "System.Data.SqlClient":
            case "System.Data.Odbc":
            case "System.Data.OleDb":

            // Referenced by System.Security
            case "System.Security.Cryptography.Pkcs":
            case "System.Security.Cryptography.ProtectedData":
            case "System.Security.Cryptography.Xml":

            // Referenced by System.Runtime.Serialization
            case "System.Runtime.Serialization.Schema":

            // Referenced by System.ServiceModel.Web
            case "System.ServiceModel.Syndication":

            // Referenced by System.ServiceProcess
            case "System.ServiceProcess.ServiceController":

            // Referenced by WindowsBase
            case "System.IO.Packaging":

                return true;
        }

        return false;
    }
mrvoorhe commented 1 week ago
  1. Change the default to be strict (--skip-unresolved false) when using TrimMode=link

I'm OK with that change. That said, a couple thoughts.

I would setup some projects that reference and use the .NET Framework shim assemblies. I had to add that IsAssemblyKnownToBeReferencedByShimsAndNotIncludedInBCL method in order to get UnityLinker to process all of the assemblies in the BCL without crashing when --skip-unresolved false. Maybe the MSBuild + ILLink setup won't have the same problems. I assume those missing forwarders have nuget assemblies that will have been pulled down so that illink doesn't encounter so man missing forwarders?

We don't have UnityLinker w/ CoreCLR running in Unity yet. So I have no new data on my point of concern about how problematic --skip-unresolved false could be. We have UnityLinker processing coreclr assemblies in our linker tests and IL2CPP tests, but that's a far cry from Unity projects with all sorts of code gen and complex setups that could produce assemblies with missing types & methods.

  1. Make the relaxed mode (--skip-unresolved true) strict for everything but type forwarders.

What's the motivation for changing the behavior of --skip-unresolved true? In my experience (which is in the context of Unity, which means no nuget support, .NET Framework assemblies) the unresolved errors are generally trigger by either a) Invalid IL generated by some code generation tool. The code is just flat out wrong. Maybe a TypeReference points to something that doesn't exist. b) A user has a precompiled managed assembly they dropped in their project years ago and eventually they update a dependency and now that precompiled assembly points to something that no longer exists.

I don't think I've seen a case where a forwarder assembly was missing. Granted without nuget and with .NET Framework assemblies, the forwarder scene is less common.

My concern is, if the goal of leaving --skip-unresolved true in place is as a fallback for anyone who has problems with the new default behavior of --skip-unresolved false, then changing the behavior of --skip-unresolved true to still error on missing types/methods could greatly reduce the utility of --skip-unresolved true as a fallback.

On the other hand, if your goal of changing the behavior of --skip-unresolved true is to stomp out unresolved types/methods and not allow them past across the board, then the behavior change makes sense. Whether or not that choice would come back to cause a lot of pain, I don't know 😄

mrvoorhe commented 1 week ago

I don't want to bring back Unresolved Stubbing to UnityLinker.

And I don't really want to have to modify IL2CPP to handle unresolved types & methods. That's essentially the same game as unresolved stubbing. What can you band-aid together and deferring to a runtime exception? And what's impossible to infer enough context that you can generate C++ that will compile? I don't want to go down that road.

If Microsoft wants to push the ecosystem toward not tolerating unresolved types & methods, I'm all for it.