Closed yowl closed 4 years ago
This is generic virtual method calls - similar to what was hit in CppCodegen mode here: #6147
The fix is more involved because it includes implementing shared generic code, but it will be probably pretty similar to the fix in CppCodegen: #6467. It's doable in a single pull request.
I've been meaning to write a doc on how shared generic code works. If you would like to work on this, I can sit down and write an overview doc (it might make the CppCodegen changes easier to parse).
Thanks, I'll update the title. I've a few other things to finish so will not look at it straight away. If no one else picks it up I'll have a go later, a few months probably.
This has become an itch that I have to scratch, so have started to look at the changes in #6467 and how they apply to wasm.
I'm enabling shared generics for wasm and have added a test copied from the main test suite, but before the compiler gets there, I'm hitting the assert at https://github.com/dotnet/corert/blob/24f99b7067f68c81b4a73b96bbaa2c24bba0c5d2/src/ILCompiler.Compiler/src/Compiler/DependencyAnalysis/GCStaticsNode.cs#L22 for the type {[S.P.CoreLib]System.Array`1+ArrayEnumerator<System.__Canon>}
. What is a GCStaticsNode
, why is it invalid for a subtype of a canonical type (if thats the right name), I had a look at the c++ implementation at https://github.com/dotnet/corert/blob/24f99b7067f68c81b4a73b96bbaa2c24bba0c5d2/src/ILCompiler.CppCodeGen/src/CppCodeGen/ILToCppImporter.cs#L3661-L3681 and it doesn't seem to do anything special to prevent hitting this assert as that conditional logic looks very similar to the wasm at https://github.com/dotnet/corert/blob/24f99b7067f68c81b4a73b96bbaa2c24bba0c5d2/src/ILCompiler.WebAssembly/src/CodeGen/ILToWebAssemblyImporter.cs#L3283-L3307
Thanks
Ignore the cpp comparison I see where that differs now. I'm still not clear though on what a GCStaticNode is, sounds like a static object that can be GC collected, which I can see would happen for a ThreadStatic when the Thread is finished, but are there other uses? And why doesn't each subtype of the canonical type get its own GCStaticNode, I suppose that is what I dont understand. Maybe I don't really need to in order to move the same logic from cpp to wasm.
Wasm currently doesn't use the GCStatics node, but maybe it should start doing that (might be a candidate for a separate smaller pull request).
The comment is referring to this code that I think we now run at startup in WASM:
GC statics (static fields for which HasGCStaticBase
reports true) are allocated on the GC heap and are accessed differently from normal statics.
What the InitializeStatics
method does is that it allocates a small object that holds data for all GC statics on the type. The compiler generates a fake EEType that has just enough information for the GC to work (it describes the offsets of GC pointers within the memory region that holds all GC static fields of the type). We ask the GC to allocate an instance of that type and store a GC handle to it in place.
All 3 categories of static fields (GC static, nonGC static, threadstatic) get laid out sequentially into the 3 bases.
class Foo
{
static int X;
static int Y;
static string Z;
}
To access field X, you would ask for the NonGC static base (because FieldDesc.HasGCStaticBase reports false for X) of class Foo and access offset 0. To access field Y, would would ask for the NonGC static base and access offset 4. WASM already does that.
To access field Z, you would ask for the GC static base, do an indirection (the GC static base symbol has a GC handle that needs to be dereferenced), and then access offset 0 (the field Z got laid out in a different base, so the offsets start at 0 again).
The reason why we disallow GC static bases for canonical types is because it doesn't make sense to have them.
There is never an instance of a canonical type allocated. Only code can be canonical.
The gist of shared generics is basically about finding a way to save space when generating generic code - try to find a way how we could share the same code for different instantions of the same generic type definition.
If I have a class with this method:
class Generic<T>
{
public void Frob(T x)
{
x.ToString();
}
}
And I would like to compile method Generic<object>.Frob()
and Generic<string>.Frob()
. The generated code for both of them would be this:
G_M37205_IG01:
4883EC28 sub rsp, 40
G_M37205_IG02:
488BCA mov rcx, rdx
488B02 mov rax, qword ptr [rdx]
FF5028 call gword ptr [rax+40]System.Object:ToString():ref:this
90 nop
G_M37205_IG03:
4883C428 add rsp, 40
C3 ret
It seems wasteful to generate this code twice, so we would like to find a way to share it. The easiest is to figure out the sharing scheme for all reference types. The reason why we restrict ourselves to reference types is basically that if T is substituted with a value type, a lot of thigs become different. E.g. here's Generic<double>.Frob()
G_M60901_IG01:
4883EC28 sub rsp, 40
F20F114C2438 movsd qword ptr [rsp+38H], xmm1
G_M60901_IG02:
488D4C2438 lea rcx, bword ptr [rsp+38H]
E800000000 call System.Double:ToString():ref:this
90 nop
G_M60901_IG03:
4883C428 add rsp, 40
C3 ret
As you can see, the immediate (huge) difference is that the calling convention is different - the argument was passed in the floating point xmm1 register instead of rdx.
But reference types are sufficiently the same - all of them are a single pointer, and they need to be reported to the GC.
We call code that can be shared by different instantiations "canonical code". The code is canonical because it's not tied to a specific T. The specific T can be subsituted at runtime and the code will do the right thing for it. In the type system, this is represented by a special type System.__Canon
that is a reference type and can stand in for any other reference type. We could use System.Object
, but then it would be difficult to figure out whether something is actually instantiated over object or represents this special thing that will only have an identity at runtime. Also, System.__Canon
can be made to meet all generic constraints. Object
wouldn't.
We could just use the same code for everything but the trouble is that whenever we use the T
, the code gets a tiny bit different:
class Generic<T>
{
static T s_field;
public void Frob(T x)
{
Generic<T>.s_field = x;
}
}
G_M37205_IG01:
57 push rdi
56 push rsi
4883EC28 sub rsp, 40
48894C2420 mov qword ptr [rsp+20H], rcx
488BF1 mov rsi, rcx
488BFA mov rdi, rdx
G_M37205_IG02:
// ***** Difference! - accessing static base of a particular instantiation *****
4800000000 lea eax, Static_Base_Of_Generic<object>
488D4808 lea rcx, bword ptr [rax+8]
488BD7 mov rdx, rdi
E800000000 call CORINFO_HELP_ASSIGN_REF
90 nop
G_M37205_IG03:
4883C428 add rsp, 40
5E pop rsi
5F pop rdi
C3 ret
As you can see, the static base we access will be different depending on whether we access Generic<object>
or Generic<string>
The way to fix this and make the method canonical again (without a specific identity at compile time) is by adding an indirection. Whenever the canonical code would like to do something with its T (be it a cast, static base access, generic method call, etc.), the codegen will perform an indirection instead of doing a direct access. The indirection will use the runtime identity of the type and substitute the T with the specific thing.
There are 3 ways to supply identity to canonical code:
this
, from which we can figure out the specific instance at runtime. (AcquiresInstMethodTableFromThis
)RequiresInstMethodDescArg
)RequiresInstMethodTableArg
)Once we have the identity (also called "generic context"), we can pass it to a helper to retrieve the right thing.
G_M37205_IG01:
57 push rdi
56 push rsi
4883EC28 sub rsp, 40
48894C2420 mov qword ptr [rsp+20H], rcx
488BF1 mov rsi, rcx
488BFA mov rdi, rdx
G_M37205_IG02:
// Loading `this` into RCX so that the helper can figure out what specific generic instantiation this is
488B0E mov rcx, qword ptr [rsi]
E800000000 call CORINFO_HELP_READYTORUN_GENERIC_STATIC_BASE
488D4808 lea rcx, bword ptr [rax+8]
488BD7 mov rdx, rdi
E800000000 call CORINFO_HELP_ASSIGN_REF
90 nop
G_M37205_IG03:
4883C428 add rsp, 40
5E pop rsi
5F pop rdi
C3 ret
If we didn't add this indirection, the canonical code would try to retrieve the static base of Generic<__Canon>
, but that's incorrect, because we need the static base of the concrete thing.
Static base of Generic<__Canon>
doesn't really exist.
Thanks for the clear and detailed responses 👍
Yes, indeed InitializeStatics
is run now in wasm, so I'll see if I can enable proper GCStatics instead of the globals that it does now.
I will need to extend https://github.com/dotnet/corert/blob/2b158f58b2c46f737370cf84d98c522216919c68/src/ILCompiler.Compiler/src/Compiler/DependencyAnalysis/ReadyToRunGenericHelperNode.cs#L16 or https://github.com/dotnet/corert/blob/2b158f58b2c46f737370cf84d98c522216919c68/src/ILCompiler.Compiler/src/Compiler/DependencyAnalysis/ReadyToRunGenericHelperNode.cs#L296 in order to write out the code to do the indirection, correct? I see that x64 does that.
The ReadyToRunGenericHelper node and it's descendants are a bit unfortunately tied to code generation backends where we can emit assembly helpers as bytes.
Eventually, we should probably factor them in a way that they can be reused. If you would like to try that, you can look into that (might be worth a separate pull request).
But I think the thing that would let you make progress the easiest it to just take the approach from CppCodegen - make a GetCodeForReadyToRunGenericHelper
equivalent, and use the HelperNode classes as they are (you would use them to track stuff in the dependency graph, but you wouldn't use their GetObjectData
to generate bytecode).
Would it be correct to store {[S.P.CoreLib]System.Collections.Generic.IEnumerable`1<System.Collections.Generic.KeyValuePair`2<System.__Canon,System.__Canon>>.GetEnumerator()}
in a WebAssemblyVTableSlotNode
? I'm wondering whether I need to delete the assert at https://github.com/dotnet/corert/blob/4a576fec44748fcb5b5b22486073aeab24c4ba7e/src/ILCompiler.WebAssembly/src/Compiler/DependencyAnalysis/WebAssemblyVTableSlotNode.cs#L20. The c++ code uses a VirtualMethodUseNode
for this which is why I'm asking.
I'm going to go ahead and remove it. If its wrong, hopefully the consequent failure will teach me something.
Yeah, the vtableslotnode was implemented by copying the interface dispatch cell node over and tweaking it.
It doesn't make sense to assert that for normal virtual methods. We shouldn't have copied that assert over - I missed that.
Thanks. I have a lot of questions on this, so sorry about that. Looking at the cpp implementation for ldftn
https://github.com/dotnet/corert/blob/1ead4eb2ce1538bd18e5390d00cc79bb4f2967fd/src/ILCompiler.CppCodeGen/src/CppCodeGen/ILToCppImporter.cs#L2289-L2316 there are 2 checks, one on the canonMethod
, and another on the method
. What scenario is the else handling at https://github.com/dotnet/corert/blob/1ead4eb2ce1538bd18e5390d00cc79bb4f2967fd/src/ILCompiler.CppCodeGen/src/CppCodeGen/ILToCppImporter.cs#L2311? Mabe where the method is shareable but the actual runtime method is not over a reference type so will not call into the shared code?
Actually think I'm wrong here, it happens compiling https://github.com/dotnet/corert/blob/1ead4eb2ce1538bd18e5390d00cc79bb4f2967fd/src/System.Private.CoreLib/shared/System/Runtime/CompilerServices/ConditionalWeakTable.cs#L223 and both the generic parameters have contraints of class
.
With the method
public interface IFace<T>
{
string IFaceGVMethod1<U>(T t, U u);
}
I'm hitting the other assert at https://github.com/dotnet/corert/blob/4a576fec44748fcb5b5b22486073aeab24c4ba7e/src/ILCompiler.WebAssembly/src/Compiler/DependencyAnalysis/WebAssemblyVTableSlotNode.cs#L21 . For this case the logic is saying that there is no need for a runtime lookup as method.HasInstantiation
is true
and method.IsSharedByGenericInstantiations
is false. So there's no runtime lookup and the previous code for creating a WebAssemblyVTableSlotNode
is called. Is that right? The implementation of the interface is AnotherDerivedClass
public class AnotherBaseClass<T>
{
public virtual string IFaceMethod1(T t) { return "AnotherBaseClass.IFaceMethod1"; }
public virtual string IFaceGVMethod1<U>(T t, U u) { return "AnotherBaseClass.IFaceGVMethod1"; }
}
public class AnotherDerivedClass<T> : AnotherBaseClass<T>, IFace<T>
{
}
which I'd have thought would be shared for the invocation new AnotherDerivedClass<string>()).IFaceGVMethod1<string>("string1", "string2");
being as the generic parameters are strings.
So I suppose there's 2 questions here. Why is this not shared? and assuming its not, is the Assert valid?
Thanks,
Generic virtual methods are not invoked through the vtable - they have a separate table and there's some complex code in the compiler to emit it and in the class library to parse it - in a codegen backend, you don't have to worry about that, fortunately.
Instead of using the WebAssemblyVTableSlotNode cell, you'll need to notify the generic virtual dependency tracking within the compiler that there's a generic virtual method call:
(This makes sure the table the compiler generates has the proper entries, plus we generate any potential overrides.)
To generate the actual call, you'll call a helper that does the complex parsing:
The helper will either give you a callable method pointer, or a fat function pointer. Fat function pointer is a small structure that has two things: function pointer, and the generic context. Generic context is the thing that needs to be passed as the first parameter to the called method (the called method has an extra parameter that is not visible in the normal signature).
The CppCodegen code I'm linking has the call to the helper to do the resolution, and also does the check whether the returned pointer is a fat function pointer.
I'm plodding through this and have a question on unboxing thunks. Comparing to the cpp backend, at https://github.com/dotnet/corert/blob/496b9b731dde78821f66345d6ee72896d0453d2c/src/ILCompiler.CppCodeGen/src/Compiler/DependencyAnalysis/CppCodegenNodeFactory.cs#L51 there is nothing much special going on, it just passes the method for the stub directly in to the node.
For Wasm, at https://github.com/dotnet/corert/blob/496b9b731dde78821f66345d6ee72896d0453d2c/src/ILCompiler.WebAssembly/src/Compiler/DependencyAnalysis/WebAssemblyCodegenNodeFactory.cs#L54 it makes a call to GetUnboxingThunk
and this is giving me a problem later on as I try to create a ReadyToRunHelperFromTypeLookup
for it. Why is Wasm different for these thunks? And the only use of GetUnboxingThunk
seems to be in the Wasm backend, could it be that its not required when we enable shared generics for Wasm?
Thanks,
After a bit more investigation I suppose this is partly due to the difference in GetStaticDependencies
between the unboxing nodes for cpp and Wasm, I'll try to bring them in line.
If you're seeing issues for generic structs, the trick is not to use the normal unboxing thunk for those:
They need to use a "special" unboxing thunk because we need to inject the generic context argument. This describes what they do:
This is the other place where the switcheroo needs to happen:
Can you tell me where I can find the code for set_EEType
: https://github.com/dotnet/corert/blob/005003d0da7dcbb7cad1b71b0553ab34f9802a05/src/Native/Runtime/portable.cpp#L74 ? I'm having some trouble understanding how to check that I'm getting the right generic context when it comes from the instance, i.e. this. I believe it should just dereference this
and that gives a pointer to the method table but I wanted to see how that was set up.
Yes, it getting to the EEType is just about dereferencing this
.
set_EEType
is defined in ObjectLayout.h.
Thanks, I've made some progress but have a question on calling shared class constructors that need a hidden context parameter, e.g. https://github.com/dotnet/corert/blob/master/src/Common/src/System/Collections/Generic/LowLevelList.cs
Looking at the cpp backend output for ClassConstructorRunner.Call
It has some code to detect for fat function pointers:
void System_Private_CoreLib::System::Runtime::CompilerServices::ClassConstructorRunner::Call(intptr_t pfn)
{
_bb0: {
intptr_t _1 = pfn;
intptr_t _2 = _1;
if (_2 & 2) {
_1 = *(intptr_t*)(_2 - 2);
typedef void(*__calli__11000001)(void*);
((__calli__11000001)_1)(**(void***)(_2 - 2 + sizeof(void*)));
} else {
typedef void(*__calli__11000001)();
((__calli__11000001)_1)();
};
return;
}
}
Which I expect I'm going to need, so I guess now is the time to handle fat function pointers on the implementation of calli
? And on that subject, they seem to be written incorrectly at present as shouldn't they be written setting the 2nd least significant bit (i.e. +2) to the function address which doesn't appear to be happening right now:
@"__fatpointer_S_P_Reflection_Core_System_Collections_Generic_LowLevelList_1<String>___cctor" = global [2 x i32*] [i32* bitcast (void (i8*, i32*)* @"S_P_Reflection_Core_System_Collections_Generic_LowLevelList_1<System___Canon>___cctor" to i32*), i32* bitcast ([1 x i32*]* @"__indirection__EEType_S_P_Reflection_Core_System_Collections_Generic_LowLevelList_1<String>_0" to i32*)]
Which I expect I'm going to need, so I guess now is the time to handle fat function pointers on the implementation of calli
Yes, class constructors are invoked by calli
and being able to handle fat pointers is important for that because the .cctor takes the hidden parameter as the first argument
And on that subject, they seem to be written incorrectly at present as shouldn't they be written setting the 2nd least significant bit (i.e. +2) to the function address which doesn't appear to be happening right now
That will probably be a bug somewhere in the Wasm object writer. Fat function pointer object nodes are odd:
The way this works is that we define the symbol for the ObjectNode at offset 0 of the generated data blob, but when we're generating a reference to the ObjectNode, it goes to Symbol+2.
This is similar to how EETypeNode does this, but it's inverted:
The reason why it's inverted is because the LLVM-based object writer is not able to define a symbol in the middle of a reloc and that's what we would need to do for the fat pointer case. You might be able to also just make the problem go away by doing int ISymbolDefinitionNode.Offset => Method.Context.Target.Architecture == TargetArchitecture.Wasm ? FatFunctionPointerConstants.Offset : 0;
(and do the same for ISymbolNode.Offset
).
This could be a problem as I believe emscripten (or maybe its Wasm) does function pointers as indicies into a table so they are not aligned anyway making it difficult to detect what is a fat function pointer and what is not. I might try a hack and set the MSB instead.
The low bit is nice because on non-abstract platforms we can guarantee it's never set. On abstract platforms like WASM, we might need to do something else. AFAIK none of the bits are under our control.
The reason why we have fat pointers is to be able to avoid runtime codegen. CoreCLR doesn't have fat pointers - it just generates what they call an "instantiating stub", which adds the first (hidden) argument with the generic context and moves all arguments (so that arg0 is arg1, arg1 is arg2, etc.). What is a "fat pointer" in CoreRT is just this small piece of code in CoreCLR and one calls it like a normal function pointer.
We don't have these instantiating stubs in CoreRT because emitting the instantiating stub requires a JIT when reflection is involved (e.g. if I have code for List<__Canon>
and someone does typeof(List).MakeGenericType(typeof(string))
at runtime, we can satisfy that request because we have all the code for this - we just build a couple data structures describing the type - if we had to make instantiating stubs, we wouldn't be able to satisfy that request because we might need to JIT up new instantiating stubs that pass the new hidden argument). Fat pointers solve that problem because fat pointers are just a data structure and we can make that.
If you create an instantiating stub instead of the fat pointer, it should unblock you for now. But we'll hit this reflection problem eventually. I can't think of a good solution right now, but calling conventions in WASM are less complex than on real hardware, so we might be able to come up with some other alternative to fat pointers.
It seems to get quite far using a high bit. It doesn't seem to be a restriction on the number of methods, you can only have 2^30 with a 32 bit address space (if I'm thinking right and assuming no space for anything other than the method table) so for abstract platforms maybe it would be ok to #if that? With this change it will run the test program, and get pass the runtime initialisation and some tests up to Method.Invoke with a reference parameter. So that's exercising at least some of the shared generic code. I have some instrumentation in TypeLoaderEnvironment
(ignore the bogus int/uint naming)
public bool CompareMethodSignatures(RuntimeSignature signature1, RuntimeSignature signature2)
{
if (signature1.IsNativeLayoutSignature && signature2.IsNativeLayoutSignature)
{
if(signature1.StructuralEquals(signature2))
return true;
X2.PrintLine("CompareMethodSignatures");
X2.PrintUint((int)signature1.NativeLayoutOffset);
X2.PrintUint((int)signature2.NativeLayoutOffset);
And when it fails this prints
CompareMethodSignatures
002000BB
000000BB
It could be coincidence that these are just 1 bit different, or is there some flag getting set perhaps? I'm just asking in case something jumps out, it's probably my bug somewhere. I notice that the call stack contains
_S_P_CoreLib_Internal_TypeSystem_LockFreeReaderHashtable_2_S_P_TypeLoader_Internal_TypeSystem_TypeSystemContext_RuntimeMethodKey__System___Canon___TryGetValue@http://localhost:6931/HelloWasm.wasm:wasm-function[13438]:0x98e8c4
_S_P_CoreLib_Internal_TypeSystem_LockFreeReaderHashtable_2_S_P_TypeLoader_Internal_TypeSystem_TypeSystemContext_RuntimeMethodKey__System___Canon___GetOrCreateValue@http://localhost:6931/HelloWasm.wasm:wasm-function[12773]:0x8f404d
_S_P_TypeLoader_Internal_TypeSystem_TypeSystemContext__ResolveRuntimeMethod@http://localhost:6931/HelloWasm.wasm:wasm-function[12116]:0x87eea5
_S_P_TypeLoader_Internal_TypeSystem_TypeSystemContext__ResolveGenericMethodInstantiation@http://localhost:6931/HelloWasm.wasm:wasm-function[12086]:0x87ac65
_S_P_TypeLoader_Internal_Runtime_TypeLoader_NativeLayoutInfoLoadContext__GetMethod@http://localhost:6931/HelloWasm.wasm:wasm-function[11201]:0x7ccc73
_S_P_TypeLoader_Internal_Runtime_TypeLoader_NativeLayoutInfoLoadContext__GetMethod_0@http://localhost:6931/HelloWasm.wasm:wasm-function[11223]:0x7cd49f
_S_P_TypeLoader_Internal_Runtime_TypeLoader_TemplateLocator__TryGetGenericMethodTemplate_Internal@http://localhost:6931/HelloWasm.wasm:wasm-function[14125]:0x9ec9f5
_S_P_TypeLoader_Internal_Runtime_TypeLoader_TemplateLocator__TryGetGenericMethodTemplate@http://localhost:6931/HelloWasm.wasm:wasm-function[13409]:0x98a789
_S_P_TypeLoader_Internal_Runtime_TypeLoader_TypeBuilder__ParseNativeLayoutInfo@http://localhost:6931/HelloWasm.wasm:wasm-function[12817]:0x8fb5df
_S_P_TypeLoader_Internal_Runtime_TypeLoader_TypeBuilder__PrepareMethod@http://localhost:6931/HelloWasm.wasm:wasm-function[12048]:0x871091
_S_P_TypeLoader_Internal_Runtime_TypeLoader_TypeBuilder__BuildMethod@http://localhost:6931/HelloWasm.wasm:wasm-function[16615]:0xb23dde
_S_P_TypeLoader_Internal_Runtime_TypeLoader_TypeBuilder__TryBuildGenericMethod_0@http://localhost:6931/HelloWasm.wasm:wasm-function[15953]:0xacc1ca
_S_P_TypeLoader_Internal_Runtime_TypeLoader_TypeBuilder__TryBuildGenericMethod@http://localhost:6931/HelloWasm.wasm:wasm-function[15810]:0xab5bd0
_S_P_TypeLoader_Internal_Runtime_TypeLoader_TypeLoaderEnvironment__TryGetGenericMethodDictionaryForComponents@http://localhost:6931/HelloWasm.wasm:wasm-function[14451]:0xa14670
_S_P_Reflection_Execution_Internal_Reflection_Execution_ExecutionEnvironmentImplementation__GetDynamicMethodInvokeMethodInfo@http://localhost:6931/HelloWasm.wasm:wasm-function[14221]:0x9fa863
_S_P_Reflection_Execution_Internal_Reflection_Execution_ExecutionEnvironmentImplementation__TryGetMethodInvokeInfo@http://localhost:6931/HelloWasm.wasm:wasm-function[13280]:0x96915e
So it does seem that this might be some code that is only now getting exercised by Wasm as its going through https://github.com/dotnet/corert/blob/946883b1fd62ab972ee83d484503e2fa9b4e2f7a/src/System.Private.TypeLoader/src/Internal/Runtime/TypeLoader/NativeLayoutInfoLoadContext.cs#L186-L191 for the method InvokeRetOII
, a public partial class DynamicInvokeMethodThunk : ILStubMethod
?
Nothing jumps out on me for the extra bit - I'm not aware of any, but I'm also not up to speed with this part of the codebase.
_S_P_TypeLoader_Internal_Runtime_TypeLoader_TypeLoaderEnvironment__TryGetGenericMethodDictionaryForComponents
will definitely be new. Reflection method invoke needs to basically do a calling convention conversion: MethodBase.Invoke
takes an object[]
of parameters, but the called method expects the parameters to be laid out on the stack as dictated by the calling convention. This conversion is done using the InvokeRetXXX
methods - there's a bunch of infrastructure around it, but basically the OII
part of the name says that it's going to invoke method that returns non-void (object), and takes two parameters by value ("in" parameters). This method is generic (so that it can be reused for different kinds of signatures), but we don't keep the generic dictionaries in the executable to save space. You're now at a point where we're trying to build the generic dictionary at runtime. This code path wasn't active before because even though InvokeRetOII was there, it wasn't a shared method body (no shared generics) and we didn't need to build a generic dictionary for it.
Current tests are running, but I've still a bit to do. Delegate.Invoke for example. This does not seem to have the necessary code as it looks like it needs to do a lookup ,cpp does this through a call to e.g.
int32_t _12 = (*::HelloWasm::Stack_1_StackDelegate_A__System___Canon_V_::__invoke__Invoke(_9))((::HelloWasm::Stack_1_StackDelegate_A__System___Canon_V_*)((::System_Private_CoreLib::System::MulticastDelegate*)_9)->m_firstParameter, (__Array_A_::System::__Canon_V_*)_11);
which Wasm is not doing. Wasm has support for callvirt through VTableSlot (if I read it right). My question is should callvirt
for Delegate Invoke be special cased like it is for cpp, or should it also go through VTableSlot nodes?
IL that I'm compiling:
.method public hidebysig instance bool CallDelegate(class Stack`1/StackDelegate<!T> d) cil managed
{
// Code size 51 (0x33)
.maxstack 2
.locals init (int32 V_0,
bool V_1)
IL_0000: nop
IL_0001: ldstr "CallDelegate"
IL_0006: call void Program::PrintLine(string)
IL_000b: nop
IL_000c: ldarg.0
IL_000d: ldfld !0[] class Stack`1<!T>::items
IL_0012: ldlen
IL_0013: conv.i4
IL_0014: stloc.0
IL_0015: ldloca.s V_0
IL_0017: call instance string [System.Runtime]System.Int32::ToString()
IL_001c: call void Program::PrintLine(string)
IL_0021: nop
IL_0022: ldarg.1
IL_0023: ldarg.0
IL_0024: ldfld !0[] class Stack`1<!T>::items
IL_0029: callvirt instance bool class Stack`1/StackDelegate<!T>::Invoke(!0[])
IL_002e: stloc.1
IL_002f: br.s IL_0031
IL_0031: ldloc.1
IL_0032: ret
} // end of method Stack`1::CallDelegate
Delegate call at 0029.
I think I can answer this myself. It needs a special case as it needs a function call rather than just a vtable lookup to get the function pointer for the invoke method.
I think that technically doing a normal call to Invoke
should work too (it's what happens when the Invoke method is reflection-invoked), but codegens special case delegate invocation because it's faster.
I see, that's what it was doing but I was having a problem whereby the delegate was static but it was getting an extra first parameter passed on the stack, probably some attempt to pass this
. Is that because I've messed up
in the delegate construction?
The target method of the delegate might require a runtime lookup (i.e. we need to look up the target of the delegate in a generic dictionary). To check for this, one would inspect the NeedsRuntimeLookup
property of the delegate creation info returned from GetDelegateCtor
.
I don't actually know whether there's test coverage to make sure the Invoke
method body works. It could also be a bug there (in the RyuJIT case the actual method body of the delegate Invoke
method can only be reached if one either reflection-calls the invoke method, or constructs a delegate that points to another delegate's Invoke
method).
Its most likely the generic dictionary lookup that I've got wrong as that's what I'm adding. There are a few other tests on instance and static delegates that use ..Invoke
, so that's probably ok. I have added code for NeedsRuntimeLookup
but must have something wrong there so I'll look at that closer. Thanks
Sorry if I'm being a bit thick here. Reading your reply on gitter "invoking the delegate is always just about invoking the function pointer (with a normal managed instance method calling convention) at m_functionPointer and passing m_firstParameter as this" and looking at
https://github.com/dotnet/corert/blob/635cf21aca11265ded9d78d216424bd609c052f5/src/System.Private.CoreLib/src/System/Delegate.cs#L279-L285
It seems that for a static delegate the first parameter will be the Delegate
object as that is what is set to m_firstParameter
, so in the generation of the code for the delegate target it needs to be aware that there is an additional parameter passed that would not be there in the case of a non-delegate static method. Is that right?
The m_functionPointer = GetThunk(OpenStaticThunk);
line accomplishes that. The GetThunk call will return a pointer to a method that the compiler generated that shuffles the arguments. The returned method is an instance method with the same arguments as the delegate Invoke
method. It loads all arguments except for this
, grabs the function pointer to call from m_extraFuctionPointerOrData
and performs an indirect call to it.
So when we create the delegate, we need to know which Initialize*
method to call to initialize it. But when we invoke the delegate, we just load m_firstParameter
, load all the arguments that Invoke
method has, and perform a calli to m_functionPointer
. The Initialize*
methods made sure the right thing will happen no matter if this is a delegate to an instance method, static method, closed/open, multicast (multiple delegate combined with the Combine
operator), etc.
The synthetic method for the open static case is generated here if you're interested in the IL:
The rest of the thunks that handle the other cases are in the same file.
(Should also point out that the Delegate.GetThunk
method is a bit magic - the compiler injects an override of this method into each delegate type.)
So when execution gets to the actual user function, there should be a stack frame which is the thunk? I'm not seeing that in the stacktrace, so that could be my problem. Edit: Seems to be my problem, GetThunk doesn't seem to be getting called. Thanks for the info, I now have something to chase.
@MichalStrehovsky Thanks for all the help, the first shared delegate test is now also working, finally. There's a lot of code paths with this, I wonder if it would be a good idea to use
instead of writing a bunch my own tests?
Awesome! Sorry I didn't respond earlier. I was sick in bed and there's only so much typing I'm willing to do on a phone.
We'll probably want to run all tests on WASM too, eventually. For now, maybe some level of ifdeffing would work? We already have some #if !CODEGEN_CPP
in this file.
Have started with ifdefing. These tests are much better than the ones in HelloWasm and now the path to perform the runtime lookups of generic virtual methods are being exercised for the first time, i.e.
https://github.com/dotnet/corert/blob/de9cdd9bd8c2944a629de559f30de399972639d3/tests/src/Simple/Generics/Generics.cs#L1300
This seems to be going down the right path but I'm not convinced I'm getting back the right function pointer as the resulting indirect call is failing. I've started to instrument the path and I can see that in GVMLookupForSlotWorker
at https://github.com/dotnet/corert/blob/de9cdd9bd8c2944a629de559f30de399972639d3/src/System.Private.CoreLib/src/Internal/Runtime/CompilerServices/GenericVirtualMethodSupport.cs#L34 I do have the right method name so looks like the RuntimeMethodHandle (I'm guessing that's where it comes from) is good. There's a bunch of abstract methods in TypeLoaderCallbacks
:
public abstract bool TryGetConstructedGenericTypeForComponents(RuntimeTypeHandle genericTypeDefinitionHandle, RuntimeTypeHandle[] genericTypeArgumentHandles, out RuntimeTypeHandle runtimeTypeHandle);
public abstract int GetThreadStaticsSizeForDynamicType(int index, out int numTlsCells);
public abstract IntPtr GenericLookupFromContextAndSignature(IntPtr context, IntPtr signature, out IntPtr auxResult);
public abstract bool GetRuntimeMethodHandleComponents(RuntimeMethodHandle runtimeMethodHandle, out RuntimeTypeHandle declaringTypeHandle, out MethodNameAndSignature nameAndSignature, out RuntimeTypeHandle[] genericMethodArgs);
public abstract bool CompareMethodSignatures(RuntimeSignature signature1, RuntimeSignature signature2);
public abstract IntPtr GetDelegateThunk(Delegate delegateObject, int thunkKind);
public abstract IntPtr TryGetDefaultConstructorForType(RuntimeTypeHandle runtimeTypeHandle);
public abstract bool TryGetGenericVirtualTargetForTypeAndSlot(RuntimeTypeHandle targetHandle, ref RuntimeTypeHandle declaringType, RuntimeTypeHandle[] genericArguments, ref string methodName, ref RuntimeSignature methodSignature, out IntPtr methodPointer, out IntPtr dictionaryPointer, out bool slotUpdated);
public abstract bool GetRuntimeFieldHandleComponents(RuntimeFieldHandle runtimeFieldHandle, out RuntimeTypeHandle declaringTypeHandle, out string fieldName);
public abstract bool TryGetPointerTypeForTargetType(RuntimeTypeHandle pointeeTypeHandle, out RuntimeTypeHandle pointerTypeHandle);
public abstract bool TryGetArrayTypeForElementType(RuntimeTypeHandle elementTypeHandle, bool isMdArray, int rank, out RuntimeTypeHandle arrayTypeHandle);
public abstract IntPtr UpdateFloatingDictionary(IntPtr context, IntPtr dictionaryPtr);
Where are these implemented? My idea is print out the function pointer in TryGetGenericVirtualTargetForTypeAndSlot
and compare it with the LLVM symbol for the actual function.
Found it, so will keep digging.
Will this work without the SUPPORTS_NATIVE_METADATA_TYPE_LOADING
macro? I seem to have that false at the moment.
Dont need that at least for the first test, will try GVM_RESOLUTION_TRACE to see what that says.
For the unboxing thunks at https://github.com/dotnet/corert/blob/05a20d39d85840454652eefc3d42b246d26eb9fe/src/ILCompiler.WebAssembly/src/Compiler/DependencyAnalysis/WebAssemblyCodegenNodeFactory.cs#L52-L54 should I be following the pattern: https://github.com/dotnet/corert/blob/05a20d39d85840454652eefc3d42b246d26eb9fe/src/ILCompiler.RyuJit/src/Compiler/DependencyAnalysis/RyuJitNodeFactory.cs#L57-L64
should I be following the pattern
Yes, with shared generics enabled, references to shared non-generic instance methods require the special stub that injects the instantiation argument. The rest continue using the "plain" unboxing stub that's there right now.
Having a problem with this in that the method with the instantiation argument is not getting compiled. As an example, for {[S.P.CoreLib]System.Collections.Generic.KeyValuePair``2<System.__Canon,System.__Canon>.ToString(native int)}
it appears as a deferred static dependency at https://github.com/dotnet/corert/blob/5575cc060201ad462809f48db25a56c35e4eaca0/src/ILCompiler.DependencyAnalysisFramework/src/DependencyAnalyzer.cs#L263-L267 but its not compiled as it has no IL. I guess this shouldn't happen, but instead I should get a compile call for the ToString()
where the hidden param will get added as it would for a non-value type shared generic?
Or maybe I should be handling this in the call from the Unbox
method to the method with the instantiation argument. I'd need a way to go from that method to the ToString()
.
When attempting to build an Uno Platform project, it fails on a method
in the ctor for
WebAssemblyVTableSlotNode
on the assertNot done any more investigation yet.