dotnet / linker

388 stars 127 forks source link

Trimming breaks IBC data #2911

Open sbomer opened 2 years ago

sbomer commented 2 years ago

IBC data appears to contain metadata tokens which are invalidated by trimming. This can cause the R2R compiler to fail as in https://github.com/dotnet/runtime/issues/60343. We should fix this in order for R2R to work with trimmed apps. @vitek-karas suggested removing IBC data from link assemblies.

@agocke FYI, I put this in the .NET 7 milestone.

vitek-karas commented 2 years ago

It's somewhat similar to linker stripping R2R native code. In theory it's possible to keep both R2R native code (and trim it) as well as read/write IBC data and trim it, but that's a LOT of work. For IBC, in the future, if it's beneficial we could try to read/write it to patch the tokens, but it still might be really tricky to do correctly.

sbomer commented 2 years ago

I looked into it a bit with @vitek-karas and cecil already walks the win32 resources section to fix up RVAs for entries in that section: https://github.com/jbevain/cecil/blob/49b1c52df5e8d2bf03768dc35ba6575e88277bc7/Mono.Cecil.PE/ImageWriter.cs#L827-L864

The main issue is that the cecil APIs aren't designed to handle general purpose PE manipulation - this logic and related classes (ImageReader/Writer, Image) are internal implementation details, not part of the public surface area.

So some options to fix this are:

  1. Expose the win32 resources as part of the cecil API surface
  2. Add a cecil feature which specifically removes IBC data
  3. Post-process the linked assembly in a separate pass

I am leaning towards option 3 because exposing PE resources in cecil in a substantial increase in scope, and it doesn't seem like a great API design to do something specific to IBC data since that isn't part of the IL spec. On the other hand, I expect other IL-rewriting tools like Fody could easily hit similar issues, so there are some advantages to doing it inside of cecil.

@vitek-karas pointed out that we will probably need to recompute the MVIDs after removing IBC data.

MichalStrehovsky commented 2 years ago

I'm surprised Cecil doesn't have means to manipulate Win32 resources - CCI has first class support for them within the object model, and System.Reflection.Metadata at least provides hooks on the reading and writing side that one can plug into.

Maybe adding a hook similar to the S.R.Metadata ones would be cheap enough?

The post processing might get tricky if sections shift because we would need to rebuild more than just the resource directory.

sbomer commented 2 years ago

Since this issue is fairly low-impact (we've only had one report of this, and only the compiler nuget packages ship with IBC data), the plan for 7.0 is to not change the linker, but add a flag to crossgen2 to let it skip IBC data as a workaround.

We will explore longer-term solutions for 8.0. It's worth considering a solution that also lets us write apphost win32 resources on all platforms. Writing down some points that came up during discussions for future reference: