dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.19k stars 4.72k forks source link

Is this MethodImpl table behaving as intended? #11881

Closed lambdageek closed 4 years ago

lambdageek commented 5 years ago

This was brought up as an issue for Mono, and I was curious if the .NET Core behavior here is surprising to you, or if everything is working as intended.

Reproduction steps

  1. Save the following as override.il
    
    .assembly TestAssembly { }

.class public A extends [mscorlib]System.Object { .method virtual public instance void Foo() { ldstr "A::Foo" call void [mscorlib]System.Console::WriteLine(string) ret }

.method virtual public instance void Bar()
{
    ldstr "A::Bar"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}

.method public rtspecialname instance void .ctor()
{
    ldarg.0
    call instance void [mscorlib]System.Object::.ctor()
    ret
}

}

.class public B extends A { .method virtual public instance void Foo() { .override A::Bar ldstr "B::Foo" call void [mscorlib]System.Console::WriteLine(string) ret }

.method virtual public instance void Bar()
{
    .override A::Foo
    ldstr "B::Bar"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}

.method public rtspecialname instance void .ctor()
{
    ldarg.0
    call instance void A::.ctor()
    ret
}

}

.method static void Main() { .entrypoint .locals init ([0] class A a) newobj instance void B::.ctor() stloc.0 ldloc.0 callvirt instance void A::Foo() ldloc.0 callvirt instance void A::Bar() ret }

and `override.runtimeconfig.json`

{ "runtimeOptions": { "tfm": "netcoreapp2.1", "framework": { "name": "Microsoft.NETCore.App", "version": "2.1.6" } } }

2. `ilasm override.il`  to produce `override.exe`
3. `dotnet ./override.exe`

## Actual behavior ##

The following is the output:

B::Foo B::Foo



## Expected behavior ##

Not sure.  Maybe the actual behavior is correct?

## More details ##

ECMA-335 II.15.1.4 states:

>A MethodImpl, or method implementation, supplies the executable body for an existing virtual method.
It associates a Method (representing the body) with a MethodDecl or Method (representing the virtual
method). A MethodImpl is used to provide an implementation for an inherited virtual method or a
virtual method from an interface when the default mechanism (matching by name and signature) would
not provide the correct result.

So one interpretation of this is that `.override` is expected only if the normal name and signature overriding is not useful, and so it merely applies another mechanism to fill in a vtable slot.  Another interpretation is that `.override` may be used to sidestep the name and signature mechanism and so therefore it's used _instead_. 

With the former interpretation, the .NET Core (and .NET Framework >= 2.0) behavior makes sense although the result is unexpected - `B::Foo` appears in two different vtable slots: 
1. the MethodDef table has `A::Foo` first, then `A::Bar`, so presumably the vtable slots are assigned in that same order let's say slot 1 is for `A::Foo` and slot 2 is for `A::Bar`.
2. In B's vtable, slot 1 is empty so it gets `B::Foo` by name and signature matching.  And there's a .override for `B::Foo` so slot 2 also gets `B::Foo`.  Now slot 2 is no empty, so no action is taken.  So the final vtable for `B` has `B::Foo` in slots 1 and 2.

By contrast Mono takes the interpretation that we shouldn't even do name+signature matching if there's a .override.  With that interpretation:
1. the MethodDef table has `A::Foo` first, then `A::Bar`, so presumably the vtable slots are assigned in that same order let's say slot 1 is for `A::Foo` and slot 2 is for `A::Bar`.
2. in B's vtable, slot 1 is empty and there's an override for `A::Foo` so we get `B::Bar` in slot 1.  Slot 2 is empty and there's an override for `A::Bar`, so we get `B::Foo` in slot 2.  So the final vtable is `B::Bar` in slot 1 and `B::Foo` in slot 2.

Ultimately, Mono will follow the .NET Core / .NET Framework behavior.  But as I said at the outset - I'm curious if the .NET Core behavior is an intentional choice or an unexpected side-effect.

HT to @mletterle for bringing mono/mono#12582 to our attention.
marek-safar commented 5 years ago

/cc @MichalStrehovsky

MichalStrehovsky commented 5 years ago

I haven't debugged through it, but the first thing I noticed is there's a conflicting definition of overrides for the Foo and Bar methods. One definition using name matches, and one using overrides. If you try to make such conflict using other means (e.g. two conflicting .overrides), you would get a TypeLoadException. Maybe this should have been a TypeLoadException, but at this point there's probably obfuscators relying on this behavior (obfuscators pretty much rely on every virtual resolution corner case that exists).

It's clear there's a conflict because if you reorder Foo and Bar in A, you get different result.

WhatI think is happening is that we do a slot unification in B for both Foo and Bar (slot unification is described in II.10.3.4 Impact of overrides on derived classes of the spec). There's no more (logical) slot 2 in B, just one slot.

The slot unification is observable here (I removed the conflict to make it simpler to read):

.assembly TestAssembly { }

.class public A
    extends [mscorlib]System.Object
{
    .method virtual public instance void Foo()
    {
        ldstr "A::Foo"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }

    .method virtual public instance void Bar()
    {
        ldstr "A::Bar"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }

    .method public rtspecialname instance void .ctor()
    {
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ret
    }
}

.class public B
    extends A 
{
    .method virtual public instance void Bar()
    {
        .override A::Foo
        ldstr "B::Bar"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }

    .method public rtspecialname instance void .ctor()
    {
        ldarg.0
        call instance void A::.ctor()
        ret
    }
}

.class public C
    extends B 
{
    .method virtual public instance void Bar()
    {
        ldstr "C::Bar"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }

    .method public rtspecialname instance void .ctor()
    {
        ldarg.0
        call instance void B::.ctor()
        ret
    }
}

.method static void Main()
{
    .entrypoint
    .locals init ([0] class A a)
    newobj instance void C::.ctor()
    stloc.0
    ldloc.0
    callvirt instance void A::Foo()
    ldloc.0
    callvirt instance void A::Bar()
    ret
}

This will print:

C::Bar
C::Bar
mletterle commented 5 years ago

If you try to make such conflict using other means (e.g. two conflicting .overrides), you would get a TypeLoadException. Maybe this should have been a TypeLoadException

As an aside, this was a TypeLoadException in .NET 1.0/1.1, it appears the behavior changed in .NET 2.0, which makes me curious if the change was related to the introduction of generics support somehow.

MichalStrehovsky commented 5 years ago

We could make this throw again, but this would likely break an obfuscator or two. Not sure if there's anything else we can do about this.