Washi1337 / AsmResolver

A library for creating, reading and editing PE files and .NET modules.
https://docs.washi.dev/asmresolver/
MIT License
848 stars 127 forks source link

Type comparison fails on two imported types with different scope but forward to the same type #546

Open Washi1337 opened 6 months ago

Washi1337 commented 6 months ago

AsmResolver Version

5.5.1

.NET Version

.NET 6.0

Operating System

Windows

Describe the Bug

When using the SignatureComparer class to compare an imported type reference A with another imported type reference B that both resolve to the same type definition T, but A points directly to T while B indirectly references T via an exported type, the signature comparison fails even though the same types are effectively referenced.

Resolving before comparing does work.

How To Reproduce

using AsmResolver.DotNet;
using AsmResolver.DotNet.Signatures;

var module = ModuleDefinition.FromFile(typeof(Program).Assembly.Location);

var reference1 = KnownCorLibs.SystemPrivateCoreLib_v6_0_0_0
    .CreateTypeReference("System", "Exception")
    .ImportWith(module.DefaultImporter);

var reference2 = KnownCorLibs.SystemRuntime_v6_0_0_0
    .CreateTypeReference("System", "Exception")
    .ImportWith(module.DefaultImporter);

Console.WriteLine(SignatureComparer.Default.Equals(reference1, reference2));
Console.WriteLine(SignatureComparer.Default.Equals(reference1.Resolve(), reference2.Resolve()));

Expected Behavior

True
True

Actual Behavior

False
True

Additional Context

This is likely because of the recursion protection that is done in SignatureComparer::SimpleTypeEquals. Because the parent modules of both references are the same, the signature comparer does not try to resolve the references.

https://github.com/Washi1337/AsmResolver/blob/78ce89adebcc7e5dbeb1d4a1a35afcd513fb87a1/src/AsmResolver.DotNet/Signatures/SignatureComparer.TypeDefOrRef.cs#L63-L70

nike4613 commented 6 months ago

I'd think this is actually explicitly expected behavior, no? Type-forwards or no, as far as the module owning the reference is concerned, they're different types from different assemblies. At any rate, I don't think the default comparer should consider them equivalent. An option to consider them equivalent may make sense (though I'd propose it would only make sense in the context of a workspace or similar).

Strictly speaking, assemblies aren't supposed to reference System.Private.CoreLib, which is where I suspect this comes up most (per the given example). It may be helpful to include helpers to remap S.P.CoreLib references to the correct reference assemblies.

The conceptual distinction I think I'm making here is "building" versus "analyzing". When building, one should only ever reference System.Runtime and friends, never System.Private.CoreLib, and should make distinctions between references to different assemblies. When analyzing, one doesn't care about differences, and instead you care only about definition equality. That can also only make sense in a workspace though; consider:

Under non-workspace rules, the two Span<int> references resolve to different types, but from the context of wanting to analyze B, they should be the same. In the current APIs, this must be done with a workspace, where we define B to be the entry-point, and thus when resolving corelib types, they resolve from System.Private.CoreLib, Version=8.0.0, making the two references to (correctly) resolve to the same type.

Washi1337 commented 6 months ago

Strictly speaking, assemblies aren't supposed to reference System.Private.CoreLib, which is where I suspect this comes up most (per the given example). It may be helpful to include helpers to remap S.P.CoreLib references to the correct reference assemblies.

While that is true, what binaries are supposed to do and what could happen are two separate problems. While forwarded types currently only really exist in corlib-like assemblies, that does not exclude the possibility of this happening in the future for other DLLs, or even for (maliciously crafted) input binaries. Special-casing on corlib members is therefore more of a band-aid solution rather than solving the core problem.

The conceptual distinction I think I'm making here is "building" versus "analyzing". When building, one should only ever reference System.Runtime and friends, never System.Private.CoreLib, and should make distinctions between references to different assemblies. When analyzing, one doesn't care about differences, and instead you care only about definition equality. That can also only make sense in a workspace though...

I agree, but I would also argue the main purpose of SignatureComparer lies in analysis. For instance, it is crucial for the comparer to consider forwarded types when finding and/or resolving member references. As far as I can tell, constructing new references has no direct relation with the comparer.

nike4613 commented 6 months ago

I suppose my concern boils down to my own mental model of the metadata: there, System.Exception, System.Private.CoreLib and System.Exception, System.Runtime are different signatures. I just feel that it should be somehow explicit that the comparison is being made based on the resolved definition (which may then involve loading the referenced assemblies, and be subject to the resolution rules) rather than on the reference itself.

Special-casing on corlib members is therefore more of a band-aid solution rather than solving the core problem.

I'm not recommending special casing them. Rather, I'm suggesting a tool which could (for instance) take a collection of reference assemblies, and be used to map references to the underlying definition (like the def in System.Private.CoreLib) instead to the relevant reference assembly. (Then, presumably, provide one for S.P.CoreLib.)

I also wouldn't expect to see both a System.Exception, System.Runtime reference and a System.Exception, System.Private.CoreLib reference unless it was constructed using some very strange (and likely incorrect) methods.


Perhaps the solution I'm imagining is that everything related to detailed analysis, where you might care about runtime behavior, you load up a workspace (perhaps with a name reminiscent of the runtime's own AssemblyLoadContext?), load your assemblies in that, and use it as your center for comparisons like what you're talking about here.

Washi1337 commented 6 months ago

I suppose my concern boils down to my own mental model of the metadata: there, System.Exception, System.Private.CoreLib and System.Exception, System.Runtime are different signatures. I just feel that it should be somehow explicit that the comparison is being made based on the resolved definition (which may then involve loading the referenced assemblies, and be subject to the resolution rules) rather than on the reference itself.

This is a fair concern. We could add a flag to the SignatureComparisonFlags enumeration that distinguishes between the two modes. The question then becomes, which one should SignatureComparer.Default take? From a DX standpoint, I can see both having merit. An alternative to having one default is to split up SignatureComparer.Default into two default comparers: one that does not resolve and one that does.

I also wouldn't expect to see both a System.Exception, System.Runtime reference and a System.Exception, System.Private.CoreLib reference unless it was constructed using some very strange (and likely incorrect) methods.

True, but these comparisons happen more often than you may think. Member resolution heavily depends on this, in particular, the reading and writing of custom attribute signatures. These may contain enum values for which the underlying integer types needs to be resolved. Without considering forwarded types we would not be able to deserialize and serialize accurately here.

Perhaps the solution I'm imagining is that everything related to detailed analysis, where you might care about runtime behavior, you load up a workspace (perhaps with a name reminiscent of the runtime's own AssemblyLoadContext?), load your assemblies in that, and use it as your center for comparisons like what you're talking about here.

This is the exact direction #471 and #537 are taking and I plan on it being part of 6.0 as well.