dotnet / runtimelab

This repo is for experimentation and exploring new ideas that may or may not make it into the main dotnet/runtime repo.
MIT License
1.42k stars 198 forks source link

[NativeAOT-LLVM] [Question] Should the generic ctx arg be a native int? #1875

Closed yowl closed 2 years ago

yowl commented 2 years ago

In the IL backend the generic context arg is passed as an i8*. THe IR for <Boxed>S_P_CoreLib_System_ValueTuple_2<System___Canon__System___Canon>__<unbox>S_P_CoreLib_System_ValueTuple_2__ToStrin looks like

------------ BB02 [000..012) (return), preds={BB01} succs={}
               [000020] ------------                 IL_OFFSET void   IL offset: 0x0
N003 (  3,  2) [000011] ------------        t11 =    LCL_VAR   ref    V00 this
                                                  /--*  t11    ref
N004 (  4,  3) [000012] ---X---N----              *  NULLCHECK byte
N005 (  3,  2) [000013] ------------        t13 =    LCL_VAR   ref    V00 this
N006 (  1,  1) [000014] ------------        t14 =    CNS_INT   int    4 field offset Fseq[Data]
                                                  /--*  t13    ref
                                                  +--*  t14    int
N007 (  5,  4) [000015] ------------        t15 = *  ADD       byref
N009 (  3,  2) [000004] ------------         t4 =    LCL_VAR   ref    V00 this          Zero Fseq[m_pEEType]
                                                  /--*  t4     ref
N010 (  6,  4) [000005] ---XG-------         t5 = *  IND       int
                                                  /--*  t15    byref  this in rcx
                                                  +--*  t5     int    arg1 in rdx
N011 ( 29, 16) [000006] --CXG-------         t6 = *  CALL      ref    System.ValueTuple`2.ToString

t5 is an int and I'm not sure its easy to identify as the generic context. So we end up with signature mismatches. Should we change the generic context in the IL backend to a native int i32 for Wasm32? I don't know what else we can do.

cc @SingleAccretion

SingleAccretion commented 2 years ago

Hmm, the context is indirected off, making it an integer would mean new inttoptrs (there are present in call arguments today, but could be eliminated if we retyped the IND to something like TYP_POINTER).

Am I right in understanding that the problem here is the fact it is not easy to identify the context in the signature?

yowl commented 2 years ago

Yes, so the lowered call is

                                                 /--*  t5     int
               [000047] ---XG-------        t47 = *  PUTARG_TYPE int
                                                  /--*  t34    int    arg-1
                                                  +--*  t42    int    arg-2
                                                  +--*  t47    int    arg1
N011 ( 29, 16) [000006] --CXG-------              *  CALL      void   System.ValueTuple`2.ToString

If we could identify the context, we could change the PUTARG_TYPE

SingleAccretion commented 2 years ago

So, I looked around (impImportCall and lvaInitTypeRef), and came to the conclusion that it would be ok to detect the context from its position, it comes after this and before user arguments (on non-x86).

CORINFO_SIG_INFO*       calleeSigInfo = callNode->callSig;
CORINFO_ARG_LIST_HANDLE sigArgs       = calleeSigInfo->args;
unsigned                argIx         = 0;

// We'll rely on the fact all arguments not in the signature come before those that are.
unsigned knownArgIx = 0;
unsigned thisArgIx  = UINT32_MAX;
unsigned ctxArgIx   = UINT32_MAX;

if (calleeSigInfo->hasThis()) // May need a (callThisArg != nullptr) check as well
{
    thisArgIx = knownArgIx++;
}
if (calleeSigInfo->hasTypeArg())
{
    ctxArgIx = knownArgIx++;
}

// Should be replaced with "if (assert does not hold) { /* Not yet implemented */ }".
// Though I don't think we'll have many other non-sig args, perhaps VSD...?
assert((knownArgIx + calleeSigInfo->numArgs) == argCount);
yowl commented 2 years ago

Neither the caller nor the callee here get CORINFO_CALLCONV_PARAMTYPE. Is it because they are unboxing ? I.e. in https://github.com/dotnet/runtimelab/blob/77d4b8cb192e34289555e4f9eff48e20e16335f2/src/coreclr/tools/Common/JitInterface/CorInfoImpl.cs#L648 It returns false, and then hasTypeArg() is false.

I guess I can try adding a call to TypeSystemContext.IsSpecialUnboxingThunkTargetMethod

SingleAccretion commented 2 years ago

Hmm, special intrinsics that do not fit into the model are indeed problematic.

Perhaps fixing the place that creates the mismatches would be a good solution:

// CompilerTypeSystemContext.BoxedTypes.cs

// Shared instance methods on generic valuetypes have a hidden parameter with the generic context.
// We add it to the signature so that we can refer to it from IL.
parameters[0] = Context.GetWellKnownType(WellKnownType.IntPtr);
yowl commented 2 years ago

Debugging a bit more carefully, bool hasHiddenParameter = !suppressHiddenArgument && method.RequiresInstArg(); does get set to true for the callee, where the problem is, but then is unset just below with this comment:

                // Some intrinsics will beg to differ about the hasHiddenParameter decision
#if !READYTORUN
                if (_compilation.TypeSystemContext.IsSpecialUnboxingThunkTargetMethod(method))
                    hasHiddenParameter = false;

I'm not sure what that means. @MichalStrehovsky could you elaborate on why this is not regarded as a hidden parameter?

This looks like the origin: https://github.com/dotnet/corert/pull/2694

MichalStrehovsky commented 2 years ago

The special unboxing thunks are for instance methods on generic valuetypes. For example:

object obj = default(Foo<string>);

// This is the problem handled by the special unboxing thunk
obj.ToString();

struct Foo<T>
{
    public virtual string ToString() => typeof(T).ToString();
}

Instance methods on valuetypes have their this typed as a ref to the first field of the valuetype (difference from reference types). They also expect a generic context as the hidden argument (reference types do not). But at the point when we're calling the ToString method above, we'll load a boxed valuetype and no generic parameter. So the thunk will extract the EEType/MethodTable pointer from the this parameter and pass it as the generic context, it will pass a pointer to the first field of the object as the actual this pointer, and pass the rest of parameters if there's any. The thunk doesn't have a hidden parameter because it gets it from the this - the this of the thunk is a reference type, not a value type (boxed valuetype is a reference type).

This looks like the origin: https://github.com/dotnet/corert/pull/2694

That's just a refactoring. It was introduced here: https://github.com/dotnet/corert/pull/2566

yowl commented 2 years ago

But it's the target, "[S.P.CoreLib]System.ValueTuple``2<System.__Canon,System.__Canon>.ToString(native int)" that gets it's hasHiddenParameter set to false. I.e the instance method on the valuetype, which as you say expects a generic context.

image

yowl commented 2 years ago

Maybe the key here is the phrase has an explicit generic context parameter in that PR

MichalStrehovsky commented 2 years ago

That's the fake target method that is described in the comments for the code added in the pull request.

Notice that signature has a native int parameter. The context is part of the IL signature - this is required so that the thunk can be written in IL.

The actual method that will be called doesn't have the parameter in IL and reports hasHiddenParameter == true instead. But the IL thunk wouldn't be able to address the context parameter of such method.

yowl commented 2 years ago

Thanks, Sorry I was a bit slow here. I now get the connection to @SingleAccretion comment above. Is the problem for LLVM then this code:

        case CorInfoType::CORINFO_TYPE_NATIVEINT:  // TODO: Wasm64 - what does NativeInt mean for Wasm64
            return Type::getInt32Ty(_llvmContext);

If native int were changed to a i8* would that break a whole load of s..tuff I wonder?

MichalStrehovsky commented 2 years ago

Yeah, I don't have a good sense for what types we use to represent the generic context. In terms of IL, unmanaged pointers and native int are all typed as native int on the IL evaluation stack so we tend to use them interchangeably. int32 and int64 are different and we wouldn't use them in the generated IL when we mean "pointer".

SingleAccretion commented 2 years ago

Another (more complex, but may be more desirable) way to solve the unboxing thunk problem would be to introduce an intrinsic that imports as the generic context and use that instead of the signature mangling.

If native int were changed to a i8* would that break a whole load of s..tuff I wonder?

Yeah, CORINFO_TYPE_NATIVEINT is a nint, aka an integer, not a pointer. It would not be fruitful to try representing it as one.

yowl commented 2 years ago

It would not be fruitful to try representing it as one.

Indeed I did try, and it was not fruitful. :-)

yowl commented 2 years ago

Thanks all, I've put a call in to IsSpecialUnboxingThunkTargetMethod to handle this for now. It's maybe not great from a compiler performance point of view.