jbevain / cecil

Cecil is a library to inspect, modify and create .NET programs and libraries.
MIT License

Reflection & System.Runtime [& self references?] #895

Open jonpryor opened 1 year ago

jonpryor commented 1 year ago

In the "spirit" of #524 and #646 and probably others…

Under .NET, is there a "correct" way to use DefaultReflectionImporter and have the output assembly reference System.Runtime and not System.Private.CoreLib?

Code of interest:

delegateDef.BaseType = module.ImportReference (typeof (MulticastDelegate));

Example: cecil-import-reflection.zip

Consider the cecil-import-reflection example:

% dotnet build
% ikdasm -assemblyref bin/Debug/net7.0/cecil-import-reflection.dll | grep Name=
    Name=System.Runtime
    Name=System.Runtime.Loader
    Name=Mono.Cecil
    Name=System.Console

Note that the default dotnet build output references System.Runtime. System.Private.CoreLib doesn't make an appearance.

Let's run the example, which uses Cecil to read an input assembly, add a new delegate type to the assembly, and write it out:

% dotnet run bin/Debug/net7.0/cecil-import-reflection.dll out.dll
% ikdasm -assemblyref out.dll | grep Name=
    Name=System.Runtime
    Name=System.Runtime.Loader
    Name=Mono.Cecil
    Name=System.Console
    Name=cecil-import-reflection
    Name=System.Private.CoreLib

Note that System.Private.CoreLib now exists as an assembly reference. (Also note that cecil-import-reflection is an assembly reference! See "Question 2", below.)

Question 1: Is there a way to not have System.Private.CoreLib added as an assembly reference? I tried playing around with a DefaultReflectionImporter subclass and had no luck with that, for reasons I wasn't able to understand.

What I did have luck with was:

  1. Write the assembly to a stream/disk.
  2. Read (1) via AssemblyDefinition.ReadAssembly()
  3. Modify AssemblyDefinition.MainModule.AssemblyReferences, plus the references returned by MainModule.GetMemberReferences() and MainModule.GetTypeReferences(), so that .Scope uses System.Runtime.

Something like:

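// Needs: using System; using System.IO; using System.Linq; using Mono.Cecil;
// Writes assemblyDef to memory, re-reads it, retargets System.Private.CoreLib scopes
// to System.Runtime, drops the self reference, then writes the result to path.
static void WriteKludge (AssemblyDefinition assemblyDef, string path, bool keepIntermediate)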
{
    var c       = new MemoryStream ();
    assemblyDef.Write (c);
    c.Position  = 0;

    if (keepIntermediate) {
        using var intermediate = File.Create (path + ".cecil");
        c.WriteTo (intermediate);
        c.Position  = 0;
    }

    var rp = new ReaderParameters {
        InMemory    = true,
        ReadSymbols = false,
        ReadWrite   = true,
    };
    var newAsm              = AssemblyDefinition.ReadAssembly (c, rp);
    var module              = newAsm.MainModule;
    var systemRuntimeRef    = module.AssemblyReferences.FirstOrDefault (r => r.Name == "System.Runtime");
    var privateCorelibRef   = module.AssemblyReferences.FirstOrDefault (r => r.Name == "System.Private.CoreLib");

    if (systemRuntimeRef == null && privateCorelibRef != null) {
        throw new NotSupportedException ("Don't support assemblies which only reference System.Private.CoreLib and not System.Runtime.");
    }

    var selfRef             = module.AssemblyReferences.FirstOrDefault (r => r.Name == newAsm.Name.Name);
    foreach (var member in module.GetMemberReferences ()) {
        if (member.DeclaringType.Scope == privateCorelibRef) {
            member.DeclaringType.Scope = systemRuntimeRef;
            continue;
        }
        if (member.DeclaringType.Scope == selfRef) {
            member.DeclaringType.Scope = null;
            continue;
        }
    }
    foreach (var type in module.GetTypeReferences ()) {
        if (type.Scope == privateCorelibRef) {
            type.Scope = systemRuntimeRef;
            continue;
        }
        if (type.Scope == selfRef) {
            type.Scope = null;
            continue;
        }
    }
    module.AssemblyReferences.Remove (privateCorelibRef);
    if (selfRef != null) {
        module.AssemblyReferences.Remove (selfRef);
    }
    newAsm.Write (path);
}

Is this what I should be doing?

(Aside: I found that the member references & assembly references would change before vs. after AssemblyDefinition.Write(), which implied to me that the collections are "incomplete" until everything is serialized at .Write().)

Question 2: The example also uses Reflection to load a type from the assembly and use that with Cecil. In this particular case, it's creating something equivalent to:

delegate void MyDelegate(MyType type);

where MyType is resolved from the assembly being modified.

The result of this is that out.dll now references cecil-import-reflection, i.e. "itself":

% dotnet run bin/Debug/net7.0/cecil-import-reflection.dll out.dll
% ikdasm -assemblyref out.dll | grep Name=
    Name=System.Runtime
    Name=System.Runtime.Loader
    Name=Mono.Cecil
    Name=System.Console
    Name=cecil-import-reflection
    Name=System.Private.CoreLib

Note Name=cecil-import-reflection!

This is mostly "just weird", and WriteKludge() checks for this situation and removes the "self reference".

ltrzesniewski commented 1 year ago

I'm not the Cecil author but I believe I can help here. I see people asking about this here quite often, so I suppose this should be added to the docs.

When you write something like module.ImportReference (typeof (MulticastDelegate)), you're telling Cecil to import the MulticastDelegate type from the runtime the code calling Cecil is running under.

In .NET (Core+), this means you'll get a reference to the runtime assembly defining MulticastDelegate, aka System.Private.CoreLib.

This happens to "work" in your case. But if you were editing a .NET Framework assembly from a .NET 7 app using Cecil, you'd still get a reference to System.Private.CoreLib, which wouldn't work under .NET Framework at all.

If you'd like to end up with a reference to System.Runtime, you'll need to use the reference assemblies instead of the implementation assemblies.

This means you can't use the ImportReference (Type) overload, because reflection will always hand back the runtime's implementation type. You need to use ImportReference (TypeReference) instead, with a TypeReference whose scope is the reference assembly.
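One way to get such a TypeReference, as a sketch only: open the reference assembly with Cecil, look the type up there, and import that definition into the module being edited. The helper name ImportFromReferenceAssembly and its referenceAssemblyPath parameter are made up for illustration. Where the reference assemblies live on disk depends on your SDK setup and is assumed here, not something Cecil provides.

```csharp
using System;
using Mono.Cecil;

static class ReferenceAssemblyImports
{
    // Hypothetical helper: imports a type into `target` by reading its definition
    // from a reference assembly on disk instead of going through reflection.
    public static TypeReference ImportFromReferenceAssembly(
        ModuleDefinition target, string referenceAssemblyPath, string fullTypeName)
    {
        // Assumption: referenceAssemblyPath points at the reference assembly that
        // declares the type, e.g. the System.Runtime.dll of the target framework.
        using var referenceAssembly = AssemblyDefinition.ReadAssembly(referenceAssemblyPath);

        // GetType only finds types declared in this module; it returns null otherwise.
        var definition = referenceAssembly.MainModule.GetType(fullTypeName)
            ?? throw new InvalidOperationException(
                $"{fullTypeName} is not declared in {referenceAssemblyPath}.");

        // Importing a definition from another module yields a TypeReference scoped
        // to that module's assembly name, and adds the assembly reference if needed.
        return target.ImportReference(definition);
    }
}
```

Called with the path of a System.Runtime.dll reference assembly and "System.MulticastDelegate", the returned reference should be scoped to System.Runtime, because Cecil derives the scope from the assembly the definition was read from.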

I believe using the reflection-based ImportReference overloads (the ones taking Type, MethodBase, or FieldInfo) is almost always a mistake, unless you're going to load the modified assembly immediately after you're done editing it. These overloads were OK when the .NET Framework was the only framework around, and the issue I described here simply couldn't occur.

So I indirectly answered your question: you can't use DefaultReflectionImporter since, as its name implies, it uses reflection, which will always get you a reference to the runtime type.

As for question 2, the root cause is the same. You're adding a reference to a type loaded at runtime to the edited assembly. Forget that the reflection overloads of ImportReference exist, and your life will get a lot easier. 🙂
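For the self-reference case, a minimal sketch of the non-reflection route: look the type up in the module being edited and use that TypeDefinition directly. The names SelfTypeLookup, GetOwnType, and AddParameterOfOwnType are hypothetical and only serve the illustration.

```csharp
using System;
using Mono.Cecil;

static class SelfTypeLookup
{
    // Returns the TypeDefinition declared in the module being edited.
    public static TypeDefinition GetOwnType(ModuleDefinition module, string fullTypeName)
    {
        return module.GetType(fullTypeName)
            ?? throw new InvalidOperationException(
                $"{fullTypeName} is not defined in {module.Name}.");
    }

    // Adds a parameter whose type is defined in the same module. A definition that
    // already belongs to the module needs no import, so no assembly reference is added.
    public static void AddParameterOfOwnType(
        ModuleDefinition module, MethodDefinition method, string parameterName, string fullTypeName)
    {
        var ownType = GetOwnType(module, fullTypeName);
        method.Parameters.Add(
            new ParameterDefinition(parameterName, ParameterAttributes.None, ownType));
    }
}
```

For the MyDelegate example, this would be used on the delegate's Invoke method with the full name of MyType, in place of module.ImportReference (typeof (MyType)).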

jonpryor commented 1 year ago

While it's good to know that "don't use .ImportReference(System.Reflection.*)!" Is The Way™, additional documentation and examples about how to make that work would be appreciated. How does one import System.Collections.Generic.List<System.Int32>? I don't see a Cecil equivalent to Type.MakeGenericType(Type[]), e.g. TypeReference.MakeGenericType(TypeReference[]) doesn't exist, so how does that work?

Also, using Reflection makes things nice and easy (or was nice and easy, under .NET Framework); is that something we want to lose?

Also also: are there any issues with my WriteKludge() method that would make it non-viable? It appears to work…

ltrzesniewski commented 1 year ago

> I don't see a Cecil equivalent to Type.MakeGenericType(Type[]), e.g. TypeReference.MakeGenericType(TypeReference[]) doesn't exist, so how does that work?

You instantiate new GenericInstanceType(TypeReference), then add the type arguments to its GenericArguments list.

Alternatively, the Mono.Cecil.Rocks assembly provides a MakeGenericInstanceType extension method which does this.
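Both options can be sketched as follows. The class and method names (GenericInstances, Close, CloseWithRocks) are hypothetical. The open generic type, e.g. a reference to System.Collections.Generic.List`1, is assumed to have been obtained already without reflection.

```csharp
using Mono.Cecil;
using Mono.Cecil.Rocks;

static class GenericInstances
{
    // Manual construction: wrap the open generic type, then append each type argument.
    public static GenericInstanceType Close(
        TypeReference openGenericType, params TypeReference[] typeArguments)
    {
        var instance = new GenericInstanceType(openGenericType);
        foreach (var argument in typeArguments)
            instance.GenericArguments.Add(argument);
        return instance;
    }

    // Same result through the Mono.Cecil.Rocks extension method.
    public static GenericInstanceType CloseWithRocks(
        TypeReference openGenericType, params TypeReference[] typeArguments)
    {
        return openGenericType.MakeGenericInstanceType(typeArguments);
    }
}
```

For List<System.Int32>, the call would look like GenericInstances.Close (openList, module.TypeSystem.Int32). As far as I can tell, the Rocks method throws an ArgumentException when the number of arguments differs from openGenericType.GenericParameters.Count, so a hand-built TypeReference needs its generic parameters added first.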

> Also, using Reflection makes things nice and easy (or was nice and easy, under .NET Framework); is that something we want to lose?

Well, modifying an assembly which targets a runtime A by executing code on a runtime B necessarily makes things a bit more complicated...

> Also also: are there any issues with my WriteKludge() method that would make it non-viable? It appears to work…

This code assumes that every type you reference from the System.Private.CoreLib implementation assembly is declared in the System.Runtime reference assembly. This is not necessarily true. For instance, System.Text.UTF8Encoding is declared in the System.Text.Encoding.Extensions reference assembly in .NET 7, not in System.Runtime, even though it's implemented in System.Private.CoreLib.