Closed yowl closed 11 months ago
Depends on what the actual problem is.
If it's that we're creating unused alloca
s (which I would guess is what is happening), the place that is creating them should be checking lvRefCnt
, yes (it is what ordinary codegen does).
In createAllocasForLocalsWithAddrOp
it then goes on to hit the assert :
if (varDsc->lvIsParam)
{
LlvmArgInfo argInfo = getLlvmArgInfoForArgIx(lclNum);
assert(argInfo.IsLlvmArg())
It does not seem right that getLlvmArgInfoForArgIx
effectively fails if the shadow stack local is not referenced.
The next assert after this one is https://github.com/dotnet/runtimelab/blob/1960cfa58f3aa2ab46355947c19b7cfd33dbe60b/src/coreclr/jit/importer.cpp#L17454 Is this related to tail calls? Tail calls are at phase 3 for WebAssembly, so would be good not to do anything that makes enabling them more difficult.
That looks like some importer code being all too aware of ABI details to me, yikes...
LLVM should be taking the "no return buffer path" there. I think you can add TARGET_WASM
to the #elif defined(TARGET_X86)
(in general our "virtual" (before lower) ABI is very similar to x86).
^ Above not true, let me look again.
Edit: well, we could add the #ifdef
to x86, or any other cases where the struct is treated as by-value. But we should make sure that retTypeDesc.GetReturnRegCount()
is 1
.
It does not seem right that
getLlvmArgInfoForArgIx
effectively fails if the shadow stack local is not referenced.
True, it could contain:
if (lclNum == _shadowStackLclNum)
{
llvmArgInfo.m_argIx = 0;
llvmArgInfo.m_shadowStackOffset = 0;
return llvmArgInfo;
}
Which would stop it asserting, but it would then create an alloca. We never want an alloca for the shadow stack do we?
Yes. So the right fix is to add both the ref count check and the shadow stack code to getLlvmArgInfoForArgIx
.
That looks like some importer code being all too aware of ABI details to me, yikes...
~LLVM should be taking the "no return buffer path" there. I think you can add
TARGET_WASM
to the#elif defined(TARGET_X86)
(in general our "virtual" (before lower) ABI is very similar to x86).~^ Above not true, let me look again.
Edit: well, we could add the
#ifdef
to x86, or any other cases where the struct is treated as by-value. But we should make sure thatretTypeDesc.GetReturnRegCount()
is1
.
Wasm has multivalue return values, https://github.com/WebAssembly/multi-value, is that relevant here?
I don't think, no. We're constrained by .NET semantics here, and that only allows one return value.
Perhaps we could use it in the future for some intrinsic-like functions/types, but even then, we'd still need to represent it as one "tuple" in the front-end of the compiler.
(Does LLVM expose this multi-value functionality at all?)
(Does LLVM expose this multi-value functionality at all?)
I think so , but apparently we just need to return structs and clang does the rest https://twitter.com/tlively52/status/1229944658354032640
Edit: well, we could add the
#ifdef
to x86, or any other cases where the struct is treated as by-value. But we should make sure thatretTypeDesc.GetReturnRegCount()
is1
.
The struct on the method I'm looking at
Invoking compiler for the inlinee method System.Span`1:Slice(int):System.Span`1[System.Char]:this :
IL to import:
IL_0000 00 nop
IL_0001 03 ldarg.1
IL_0002 02 ldarg.0
IL_0003 7b 87 01 00 0a ldfld 0xA000187
IL_0008 fe 03 cgt.un
IL_000a 0a stloc.0
IL_000b 06 ldloc.0
IL_000c 2c 06 brfalse.s 6 (IL_0014)
IL_000e 28 a7 15 00 06 call 0x60015A7
IL_0013 00 nop
IL_0014 02 ldarg.0
IL_0015 7c 86 01 00 0a ldflda 0xA000186
IL_001a 28 81 01 00 0a call 0xA000181
IL_001f 03 ldarg.1
IL_0020 e0 conv.u
IL_0021 28 3b 01 00 2b call 0x2B00013B
IL_0026 02 ldarg.0
IL_0027 7b 87 01 00 0a ldfld 0xA000187
IL_002c 03 ldarg.1
IL_002d 59 sub
IL_002e 73 2e 01 00 0a newobj 0xA00012E
IL_0033 0b stloc.1
IL_0034 2b 00 br.s 0 (IL_0036)
IL_0036 07 ldloc.1
IL_0037 2a ret
Is a Span<Char>
. The problem I'm having is that because the type appears as TYP_UNKNOWN
here I have retTypeDesc.GetReturnRegCount() == 0
so it goes to the else
anyway. Maybe I should be investigate why its TYP_UNKNOWN
Yes, that does not seem right. It should be TYP_STRUCT
.
Well, that's not too complicated. InitializeStructReturnType
is basically a nop when FEATURE_MULTIREG_RET
is not defined.
Maybe I can turn on
We already set it to FEATURE_MULTIREG_RET
and for LLVM return SPK_ByValue
in a single "register" or getReturnTypeForStruct
is SPK_PrimitiveType
where for other targest it is SPK_ByValueAsHfa
or SPK_ByValue
.SPK_ByValue
so I'll go with that.
What is V04
in here?
------------ BB01 [000..010) (return), preds={} succs={}
N004 ( 3, 2) [000001] -------N---- t1 = LCL_VAR_ADDR byref V01 arg1
* long V01._ticks (offs=0x00) -> V04 tmp1
/--* t1 byref
N006 ( 9, 7) [000005] n----------- t5 = * OBJ struct<System.TimeSpan, 8> <l:$102, c:$103>
N007 ( 1, 1) [000002] ------------ t2 = CNS_INT int 62 $44
[000022] ------------ t22 = LCL_VAR int V07 rat0 u:1
[000023] ------------ t23 = CNS_INT int 12
/--* t22 int
+--* t23 int
[000024] ------------ t24 = * ADD int
/--* t24 int
[000025] ------------ t25 = * PUTARG_TYPE int
/--* t5 struct
[000026] ------------ t26 = * PUTARG_TYPE struct
/--* t2 int
[000027] ------------ t27 = * PUTARG_TYPE int
/--* t25 int arg-1
+--* t26 struct arg0
+--* t27 int arg1
N008 ( 24, 12) [000003] --CXG------- t3 = * CALL int System.Threading.Tasks.Task.ValidateTimeout $141
I think the method is the bool overload (S_P_CoreLib_System_Threading_Tasks_Task_1<Bool>__WaitAsync_1
):
public new Task<TResult> WaitAsync(TimeSpan timeout) =>
WaitAsync(ValidateTimeout(timeout, ExceptionArgument.timeout), default);
V04
has lvIsParam == true
and looks like this in the debugger
▶ | varDsc | 0x000001ff432194e0 [V4: TYP_LONG-"field V01._ticks (fldOffset=0x0)"] | LclVarDsc *
and looks like some kind of optimisation, pulling out the only field _ticks
in the TimeSpan
? The problem is that because it is a parameter we expect to find it as an LLVM arg but its not there. _sigInfo.numArgs == 2
, so it never gets a lvLlvmArgNum
and looks like some kind of optimisation
Yep, it is a promoted struct field.
Promotion "disassembles" structs into constituent fields and the parent local can end in one of two states:
1) Completely eliminated from the IR (independent promotion). 2) Not completely eliminated (dependent promotion).
From the backend perspective, we should treat independently promoted fields as ordinary locals, and dependently promoted ones as being modified with stores to their fields (the field LCL_VAR
s become "effective" LCL_FLD
s). I think we should undo dependent promotion in lowering and replace all of the fields with actual LCL_FLD
s, for simplicity.
The backend must also ensure that fields from arguments get "homed" into their locations properly. For independent promotion, I propose we lower them into the following:
;; Prolog block
var fieldOne = arg.FieldOne; (LCL_FLD) // Clear lvIsParam, now it is an ordinary local
var fieldTwo = arg.FieldTwo; (LCL_FLD) // Clear lvIsParam, now it is an ordinary local
< ... etc ... >
That entails supporting LCL_FLD
in codegen. For SSA locals that means extractvalue
I'd imagine, for isLlvmFrameLocal
- loads.
I think extractvalue
is what I did in the IL module, but it wont work for unions as the parameter is an element index. So for Register
we only have 2 fields: 2 i64
s . If you want the second byte, you can't do it from an extractvalue
. Its easier to bitcast
and gep
I think.
Edit. that wont work either though, I'll see what clang does for C
Ugh, this looks hard:
struct bytes
{
char b1;
char b2;
char b3;
};
union u
{
long l;
struct bytes someBytes;
};
void access(union u passedUnion)
{
long l2 = passedUnion.l;
char c = passedUnion.someBytes.b2;
}
int main()
{
union u unionVar;
unionVar.l = 123;
access(unionVar);
return 0;
}
LLVM:
%union.u = type { i32 }
%struct.bytes = type { i8, i8, i8 }
; Function Attrs: noinline nounwind optnone
define void @access(%union.u* byval(%union.u) align 4 %0) #0 {
%2 = alloca i32, align 4
%3 = alloca i8, align 1
%4 = bitcast %union.u* %0 to i32*
%5 = load i32, i32* %4, align 4
store i32 %5, i32* %2, align 4
%6 = bitcast %union.u* %0 to %struct.bytes*
%7 = getelementptr inbounds %struct.bytes, %struct.bytes* %6, i32 0, i32 1
%8 = load i8, i8* %7, align 1
store i8 %8, i8* %3, align 1
ret void
}
; Function Attrs: noinline nounwind optnone
define i32 @main() #0 {
%1 = alloca i32, align 4
%2 = alloca %union.u, align 4
store i32 0, i32* %1, align 4
%3 = bitcast %union.u* %2 to i32*
store i32 123, i32* %3, align 4
call void @access(%union.u* byval(%union.u) align 4 %2)
ret i32 0
}
In C it's easier as you know all the different unions up front, but for C# explicit layouts.... I wish I hadn't looked at this now :-)
Or maybe the original idea of bitcasts and geps would be ok. We dont do this parameter passing though %union.u* byval(%union.u)
Or maybe the original idea of bitcasts and geps would be ok
I would imagine it should work if we spill LCL_FLD
-accessed locals to alloca
s: get the address, bitcast to i8*
, add the offset, bitcast to <LCL_FLD type>*
, then load.
Not as sophisticated a scheme as what Clang does, but can always be tuned better later (say, don't spill if only proper fields are acessed).
That makes sense. Do you see this as lowering or codegen?
In codegen, like other isLlvmFrameLocal
s.
If I look at this IR
[000323] ------------ IL_OFFSET void IL offset: 0x0
N001 ( 1, 1) [000007] ------------ t7 = CNS_INT int 0 $41
/--* t7 int
N003 ( 5, 4) [000008] DA---------- * STORE_LCL_VAR struct<System.DateTimeOffset, 16> V00 tmp0
N001 ( 14, 2) [000002] H-CXG------- t2 = CALL help byref HELPER.CORINFO_HELP_READYTORUN_STATIC_BASE Zero Fseq[Zero, _ticks] $200
[000328] ------------ t328 = LCL_VAR int V35 rat0 u:1
[000327] ------------ t327 = CNS_INT int 0
/--* t328 int
+--* t327 int
[000329] ------------ t329 = * ADD int
/--* t329 int
+--* t2 byref
N003 ( 14, 3) [000312] -A-XG------- * STOREIND byref
[000331] ------------ t331 = LCL_VAR int V35 rat0 u:1
[000330] ------------ t330 = CNS_INT int 0
/--* t331 int
+--* t330 int
[000332] ------------ t332 = * ADD int
/--* t332 int
N004 ( 1, 1) [000313] ------------ t313 = * IND byref Zero Fseq[Zero, _ticks]
/--* t313 byref
N006 ( 18, 6) [000261] D--XG------- t261 = * IND long <l:$284, c:$283>
/--* t261 long
N008 ( 18, 6) [000262] DA-XG------- * STORE_LCL_VAR long V21 tmp21 d:1
N002 ( 3, 2) [000058] -------N---- t58 = LCL_VAR_ADDR byref V03 tmp3
* long V03._ticks (offs=0x00) -> V21 tmp21
/--* t58 byref
N004 ( 9, 7) [000061] n----------- t61 = * OBJ struct<System.TimeSpan, 8> <l:$c3, c:$c4>
And the promoted struct V03
I can see that for node 58:
N002 ( 3, 2) [000058] -------N---- t58 = LCL_VAR_ADDR byref V03 tmp3
* long V03._ticks (offs=0x00) -> V21 tmp21
It is marked as promoted and has some fields in its LclVarDsc
that point back to the parent
But in this case, I guess I have 2 questions, is this dependent or independent? I can see that in node 61 OBJ
it is using the struct, so this is dependent? However V21
is handled in llvm.cpp as a local and mapped into _localsMap
using it's SSA pair, there is no alloca for it.
Is the process therefore
LCL_FLD_ADDR
V21
to an alloca because it now fits the criteria in https://github.com/dotnet/runtimelab/issues/1859#issuecomment-1061709865I think that's what I understand by the comments above. Apologies if I've missed something obvious.
is this dependent or independent
Dependent. Independent promotion would have no references to the parent in the IR.
There is a method on Compiler
, lvaGetPromotionType
that distinguishes between the two by the present of lvDoNotEnregister
.
Convert node
[000058]
to a LCL_FLD_ADDR
I don't think that's necessary. Dependent promotion in our case would imply that we will only reference the parent, and [000058]
is already doing just that.
Convert
V21
to an alloca because it now fits the criteria in
Yes, the idea is to turn field locals referencing dependently promoted parents "back" into LCL_FLD
s, which then implies that the parent locals would live on the LLVM frame (for now, at least).
I'm afraid I'm still a bit stuck. With the above IR, for the LCL_VAR_ADDR
N002 ( 3, 2) [000058] -------N---- t58 = LCL_VAR_ADDR byref V03 tmp3
* long V03._ticks (offs=0x00) -> V21 tmp21
If I look at V03
, the struct, 0x000001df43f5c040 [V3: TYP_STRUCT-"Inlining Arg"]
and its promotion type I get PROMOTION_TYPE_INDEPENDENT
because lvDoNotEnregister == 0
For the field, V21
, 0x0000029762236710 [V21: TYP_LONG-"field V03._ticks (fldOffset=0x0)"]
it is PROMOTION_TYPE_NONE
I find the setting/nomenclature of lvParentLcl
a bit confusing because the "parent" of V03
, the struct, is 21
, the field?
Also as this IR wants the address of the field, what should I be putting back to a LCL_FLD
, as I want the address not the field value?
:-/
If I look at
V03
, the struct,0x000001df43f5c040 [V3: TYP_STRUCT-"Inlining Arg"]
and its promotion type I getPROMOTION_TYPE_INDEPENDENT
becauselvDoNotEnregister == 0
It looks like a bug. The frontend is supposed to mark all dependently promoted structs with lvDoNotEnregister
(SSA breaks otherwise), except a few special cases (CanBeReplacedWithItsField
structs and multi-reg cases).
Could you share the full dump?
I find the setting/nomenclature of
lvParentLcl
a bit confusing because the "parent" of V03 , the struct, is 21, the field?
That ought not to be the case, lvParentLcl
for V21
should be V03
(the struct), for V03
itself - BAD_VAR_NUM
(we don't support recursive promotion).
Also as this IR wants the address of the field, what should I be putting back to a
LCL_FLD
, as I want the address not the field value?
For the address node that refers to the parent, it should be left alone. Address nodes that refer to field locals should be morphed into LCL_FLD_ADDR
.
It looks like a bug.
Maybe because it is the result of inlining?
double checking I'm looking at the right thing:
Ok, I see what is going on. This is args morphing's fault more or less, it doesn't mark the parent local DNER because it doesn't try to retype it (due to the "LLVM ABI" that we have).
There are two ways to solve this problem:
1) Pessimistic but simple: do a lowering pre-pass for LLVM identifying all dependently promoted locals "manually". This would be pretty impactful CQ-wise.
2) Support decomposition of promoted locals with FIELD_LIST
s, similar to how x86 and other targets (Unix x64, ARM64, ARM) with stack-based args operate. Not sure how we'd map this to LLVM (build it up with with a series of insertvalue
s...?).
For reference, a FIELD_LIST
represents a collection of values (most often promoted fields, but they can be any values in principle):
FIELD_LIST struct
LCL_VAR V04 <promoted field at offset 0>
LCL_VAR V05 <promoted field at offset 4>
LCL_VAR V06 <promoted field at offset 8>
There is x86 code which does what we need in args morphing if we decide to go this route:
Sounds like the feeling is that option 2 is best. Are you saying we can just || defined (TARGET_WASM)
this block? I did try that, and the FIELD_LIST
looks like is it create fine, however at line ,
it evaluates true, reMorphing == false
, call->fgArgInfo->HasRegArgs() == true
and in EvalArgsToTemps
which asserts at
I'm not helping much at the moment, but happy to be a code monkey for this if that's ok.
Sounds like the feeling is that option 2 is best.
It is certainly better CQ-wise.
I did try that, and the FIELD_LIST looks like is it create fine, however, there are problems
We should be taking the x86 path here:
I. e. all of our args should be "on stack". It will take some tweaking of fgInitArgInfo
to steer it in the right direction. Wonder if just always returning false
from isRegParamType
under LLVM will be enough.
just always returning
false
fromisRegParamType
under LLVM will be enough
Looks interesting:
------------ BB01 [000..042) (return), preds={} succs={}
[000323] ------------ IL_OFFSET void IL offset: 0x0
N001 ( 1, 1) [000007] ------------ t7 = CNS_INT int 0 $41
/--* t7 int
N003 ( 5, 4) [000008] DA---------- * STORE_LCL_VAR struct<System.DateTimeOffset, 16> V00 tmp0 d:1
N001 ( 14, 2) [000002] H-CXG------- t2 = CALL help byref HELPER.CORINFO_HELP_READYTORUN_STATIC_BASE Zero Fseq[Zero, _ticks] $200
/--* t2 byref
N003 ( 14, 3) [000312] DA-XG------- * STORE_LCL_VAR byref V33 cse0 d:1
N004 ( 1, 1) [000313] ------------ t313 = LCL_VAR byref V33 cse0 u:1 Zero Fseq[Zero, _ticks] $200
/--* t313 byref
N006 ( 18, 6) [000261] ---XG------- t261 = * IND long <l:$284, c:$283>
/--* t261 long
N008 ( 18, 6) [000262] DA-XG------- * STORE_LCL_VAR long V21 tmp21 d:1
N001 ( 1, 1) [000058] -------N---- t58 = LCL_VAR long V21 tmp21 u:1 <l:$280, c:$2c0>
/--* t58 long
N002 ( 1, 1) [000266] -c---------- t266 = * FIELD_LIST struct $c1
/--* t266 struct arg0
N003 ( 18, 4) [000059] --CXG------- t59 = * CALL int System.DateTimeOffset.ValidateOffset $300
This case is easier perhaps as ValidateOffset
is
define i16 @S_P_CoreLib_System_DateTimeOffset__ValidateOffset(i8* %0, i64 %1)
i.e, there is no struct type, on the other hand if there was a call to this using a FIELD_LIST
define i8 @S_P_CoreLib_System_TimeZoneInfo__GetIsInvalidTime(i8* %0, i64 %1, %"[S.P.CoreLib]System.Globalization.DaylightTimeStruct" %2)
As you say we may have to alloca
the type from the CALL
, load
it, then insertvalue
. Doesn't seem super efficient, but should be doable as a starting point.
I've had a stab at this over at https://github.com/yowl/runtimelab/tree/decomp-struct. I have no PR yet. For System.ValueType:GetHashCode
I'm hitting an assert before getting to the LLVM compilation.
The dump is
E:\GitHub\runtimelab\src\tests\nativeaot\SmokeTests\HelloWasm\HelloWasm.cs(1477): Trim analysis warning IL2070: Program.NewMethod(Type,ClassForMetaTests): 'this' argument does not satisfy 'DynamicallyAccessedMemberTypes.PublicMethods' in call to 'System.Type.GetMethod(String)'. The parameter '#0' of method 'Program.NewMethod(Type,ClassForMetaTests)' does not have matching annotations. The source value must declare at least the same requirements as those declared on the target location it is assigned to.
E:\GitHub\runtimelab\src\tests\nativeaot\SmokeTests\HelloWasm\HelloWasm.cs(1483): Trim analysis warning IL2070: Program.NewMethod(Type,ClassForMetaTests): 'this' argument does not satisfy 'DynamicallyAccessedMemberTypes.PublicMethods' in call to 'System.Type.GetMethod(String)'. The parameter '#0' of method 'Program.NewMethod(Type,ClassForMetaTests)' does not have matching annotations. The source value must declare at least the same requirements as those declared on the target location it is assigned to.
E:\GitHub\runtimelab\src\libraries\System.Private.CoreLib\src\System\Resources\ManifestBasedResourceGroveler.cs(238): Trim analysis warning IL2026: System.Resources.ManifestBasedResourceGroveler.CreateResourceSet(Stream,Assembly): Using method 'System.Resources.ManifestBasedResourceGroveler.InternalGetResourceSetFromSerializedData(Stream,String,String,ResourceManagerMediator)' which has 'RequiresUnreferencedCodeAttribute' can break functionality when trimming application code. The CustomResourceTypesSupport feature switch has been enabled for this app which is being trimmed. Custom readers as well as custom objects on the resources file are not observable by the trimmer and so required assemblies, types and members may be removed.
****** START compiling System.ValueType:GetHashCode():int:this (MethodHash=fd70cd9a)
Generating code for Windows wasm
OPTIONS: compCodeOpt = SMALL_CODE
OPTIONS: compDbgCode = false
OPTIONS: compDbgInfo = false
OPTIONS: compDbgEnC = false
OPTIONS: compProcedureSplitting = false
OPTIONS: compProcedureSplittingEH = false
OPTIONS: Jit invoked for ngen
IL to import:
IL_0000 02 ldarg.0
IL_0001 28 62 02 00 06 call 0x6000262
IL_0006 0b stloc.1
IL_0007 12 01 ldloca.s 0x1
IL_0009 fe 16 36 00 00 02 constrained. 0x2000036
IL_000f 6f 6c 02 00 06 callvirt 0x600026C
IL_0014 0a stloc.0
IL_0015 06 ldloc.0
IL_0016 02 ldarg.0
IL_0017 28 bd 05 00 06 call 0x60005BD
IL_001c 61 xor
IL_001d 0a stloc.0
IL_001e 06 ldloc.0
IL_001f 2a ret
lvaSetClass: setting class for V00 to (4000000000420028) System.ValueType
'this' passed in register rcx
*************** In compInitDebuggingInfo() for System.ValueType:GetHashCode():int:this
*************** In fgFindBasicBlocks() for System.ValueType:GetHashCode():int:this
weight= 10 : state 3 [ ldarg.0 ]
weight= 79 : state 40 [ call ]
weight= 34 : state 12 [ stloc.1 ]
weight= 61 : state 19 [ ldloca.s ]
weight=161 : state 190 [ constrained -> callvirt ]
weight= 20 : state 199 [ stloc.0 -> ldloc.0 ]
weight= 10 : state 3 [ ldarg.0 ]
weight= 79 : state 40 [ call ]
weight= 35 : state 85 [ xor ]
weight= 20 : state 199 [ stloc.0 -> ldloc.0 ]
weight= 19 : state 42 [ ret ]
Jump targets:
none
New Basic Block BB01 [0000] created.
BB01 [000..020)
Inline candidate callsite is hot. Multiplier increased to 3.
Callsite has profile data: 1. Multiplier limited to 12.6.
calleeNativeSizeEstimate=528
callsiteNativeSizeEstimate=85
benefit multiplier=12.6
threshold=1071
Native estimate for function size is within threshold for inlining 52.8 <= 107.1 (multiplier = 12.6)
IL Code Size,Instr 32, 13, Basic Block count 1, Local Variable Num,Ref count 3, 8 for method System.ValueType:GetHashCode():int:this
OPTIONS: opts.MinOpts() == false
Basic block list for 'System.ValueType:GetHashCode():int:this'
-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd weight lp [IL range] [jump] [EH region] [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000] 1 1 [000..020) (return)
-----------------------------------------------------------------------------------------------------------------------------------------
*************** Starting PHASE Pre-import
*************** Finishing PHASE Pre-import
*************** Starting PHASE Profile incorporation
BBOPT not set
*************** Finishing PHASE Profile incorporation [no changes]
*************** Starting PHASE Importation
*************** In impImport() for System.ValueType:GetHashCode():int:this
impImportBlockPending for BB01
Importing BB01 (PC=000) of 'System.ValueType:GetHashCode():int:this'
[ 0] 0 (0x000) ldarg.0
[ 1] 1 (0x001) call 06000262
In Compiler::impImportCall: opcode is call, kind=0, callRetType is struct, structSize is 4
[000001] I-C-G------- * CALL struct System.Object.get_EETypePtr (exactContextHnd=0x4000000000420041)
[000000] ------------ this in rcx \--* LCL_VAR ref V00 this
[ 1] 6 (0x006) stloc.1
[000005] -AC--------- * ASG struct (copy)
[000003] D------N---- +--* LCL_VAR struct<System.EETypePtr, 4> V02 loc1
[000002] --C--------- \--* RET_EXPR struct(inl return expr [000001])
[ 0] 7 (0x007) ldloca.s 1
[ 1] 9 (0x009) constrained. (02000036) callvirt 0600026C
In Compiler::impImportCall: opcode is callvirt, kind=0, callRetType is int, structSize is 0
[000008] I-C-G------- * CALL int System.EETypePtr.GetHashCode (exactContextHnd=0x4000000000420031)
[000007] ------------ this in rcx \--* ADDR byref
[000006] -------N---- \--* LCL_VAR struct<System.EETypePtr, 4> V02 loc1
[ 1] 20 (0x014) stloc.0
[000011] -AC--------- * ASG int
[000010] D------N---- +--* LCL_VAR int V01 loc0
[000009] --C--------- \--* RET_EXPR int (inl return expr [000008])
[ 0] 21 (0x015) ldloc.0
[ 1] 22 (0x016) ldarg.0
[ 2] 23 (0x017) call 060005BD
In Compiler::impImportCall: opcode is call, kind=0, callRetType is int, structSize is 0
[000014] I-C-G------- * CALL int System.ValueType.GetHashCodeImpl (exactContextHnd=0x4000000000420029)
[000013] ------------ this in rcx \--* LCL_VAR ref V00 this
[ 2] 28 (0x01c) xor
[ 1] 29 (0x01d) stloc.0
[000018] -AC--------- * ASG int
[000017] D------N---- +--* LCL_VAR int V01 loc0
[000016] --C--------- \--* XOR int
[000012] ------------ +--* LCL_VAR int V01 loc0
[000015] --C--------- \--* RET_EXPR int (inl return expr [000014])
[ 0] 30 (0x01e) ldloc.0
[ 1] 31 (0x01f) ret
[000020] ------------ * RETURN int
[000019] ------------ \--* LCL_VAR int V01 loc0
*************** Finishing PHASE Importation
Trees after Importation
-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd weight lp [IL range] [jump] [EH region] [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000] 1 1 [000..020) (return) i
-----------------------------------------------------------------------------------------------------------------------------------------
------------ BB01 [000..020) (return), preds={} succs={}
***** BB01
[000001] I-C-G------- * CALL struct System.Object.get_EETypePtr (exactContextHnd=0x4000000000420041)
[000000] ------------ this in rcx \--* LCL_VAR ref V00 this
***** BB01
[000005] -AC--------- * ASG struct (copy)
[000003] D------N---- +--* LCL_VAR struct<System.EETypePtr, 4> V02 loc1
[000002] --C--------- \--* RET_EXPR struct(inl return expr [000001])
***** BB01
[000008] I-C-G------- * CALL int System.EETypePtr.GetHashCode (exactContextHnd=0x4000000000420031)
[000007] ------------ this in rcx \--* ADDR byref
[000006] -------N---- \--* LCL_VAR struct<System.EETypePtr, 4> V02 loc1
***** BB01
[000011] -AC--------- * ASG int
[000010] D------N---- +--* LCL_VAR int V01 loc0
[000009] --C--------- \--* RET_EXPR int (inl return expr [000008])
***** BB01
[000014] I-C-G------- * CALL int System.ValueType.GetHashCodeImpl (exactContextHnd=0x4000000000420029)
[000013] ------------ this in rcx \--* LCL_VAR ref V00 this
***** BB01
[000018] -AC--------- * ASG int
[000017] D------N---- +--* LCL_VAR int V01 loc0
[000016] --C--------- \--* XOR int
[000012] ------------ +--* LCL_VAR int V01 loc0
[000015] --C--------- \--* RET_EXPR int (inl return expr [000014])
***** BB01
[000020] ------------ * RETURN int
[000019] ------------ \--* LCL_VAR int V01 loc0
-------------------------------------------------------------------------------------------------------------------
*************** Starting PHASE Indirect call transform
-- no candidates to transform
*************** Finishing PHASE Indirect call transform [no changes]
*************** Starting PHASE Expand patchpoints
-- no patchpoints to transform
*************** Finishing PHASE Expand patchpoints [no changes]
*************** Starting PHASE Post-import
*************** Finishing PHASE Post-import
*************** Starting PHASE Morph - Init
New BlockSet epoch 1, # of blocks (including unused BB00): 2, bitset array size: 1 (short)
*************** In fgRemoveEmptyBlocks
*************** Finishing PHASE Morph - Init
*************** In fgDebugCheckBBlist
*************** Starting PHASE Morph - Inlining
Expanding INLINE_CANDIDATE in statement STMT00000 in BB01:
[000001] I-C-G------- * CALL struct System.Object.get_EETypePtr (exactContextHnd=0x4000000000420041)
[000000] ------------ this in rcx \--* LCL_VAR ref V00 this
thisArg: is a local var
[000000] ------------ * LCL_VAR ref V00 this
INLINER: inlineInfo.tokenLookupContextHandle for System.Object:get_EETypePtr():System.EETypePtr:this set to 0x4000000000420041:
Invoking compiler for the inlinee method System.Object:get_EETypePtr():System.EETypePtr:this :
IL to import:
IL_0000 02 ldarg.0
IL_0001 7b ac 00 00 04 ldfld 0x40000AC
IL_0006 73 8c 04 00 06 newobj 0x600048C
IL_000b 2a ret
INLINER impTokenLookupContextHandle for System.Object:get_EETypePtr():System.EETypePtr:this is 0x4000000000420041.
*************** In fgFindBasicBlocks() for System.Object:get_EETypePtr():System.EETypePtr:this
Jump targets:
none
New Basic Block BB02 [0001] created.
BB02 [000..00C)
Basic block list for 'System.Object:get_EETypePtr():System.EETypePtr:this'
-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd weight lp [IL range] [jump] [EH region] [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB02 [0001] 1 1 [000..00C) (return)
-----------------------------------------------------------------------------------------------------------------------------------------
*************** Inline @[000001] Starting PHASE Pre-import
*************** Inline @[000001] Finishing PHASE Pre-import
*************** Inline @[000001] Starting PHASE Profile incorporation
BBOPT not set
Computing inlinee profile scale:
... no callee profile data, will use non-pgo weight to scale
... call site not profiled, will use non-pgo weight to scale
call site count 100 callee entry count 100 scale 1
Scaling inlinee blocks
*************** Inline @[000001] Finishing PHASE Profile incorporation
Trees after Profile incorporation
-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd weight lp [IL range] [jump] [EH region] [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB02 [0001] 1 1 [000..00C) (return)
-----------------------------------------------------------------------------------------------------------------------------------------
------------ BB02 [000..00C) (return), preds={} succs={}
-------------------------------------------------------------------------------------------------------------------
*************** Inline @[000001] Starting PHASE Importation
*************** In impImport() for System.Object:get_EETypePtr():System.EETypePtr:this
impImportBlockPending for BB02
Importing BB02 (PC=000) of 'System.Object:get_EETypePtr():System.EETypePtr:this'
[ 0] 0 (0x000) ldarg.0
[ 1] 1 (0x001) ldfld 040000AC
[ 1] 6 (0x006) newobj
lvaGrabTemp returning 3 (V03 tmp0) called for NewObj constructor temp.
[000024] IA---------- * ASG struct (init)
[000022] D------N---- +--* LCL_VAR struct<System.EETypePtr, 4> V03 tmp0
[000023] ------------ \--* CNS_INT int 0
0600048C
In Compiler::impImportCall: opcode is newobj, kind=0, callRetType is void, structSize is 0
[000027] I-CXG------- * CALL void System.EETypePtr..ctor (exactContextHnd=0x4000000000420031)
[000026] ------------ this in rcx +--* ADDR byref
[000025] -------N---- | \--* LCL_VAR struct<System.EETypePtr, 4> V03 tmp0
[000021] ---XG------- arg1 \--* FIELD int m_pEEType
[000000] ------------ \--* LCL_VAR ref V00 this
[ 1] 11 (0x00b) ret
Inlinee Return expression (before normalization) =>
[000028] ------------ * LCL_VAR struct<System.EETypePtr, 4> V03 tmp0
E:\GitHub\runtimelab\src\coreclr\jit\importer.cpp:17464
Assertion failed 'iciCall->HasRetBufArg()' in 'System.ValueType:GetHashCode():int:this' during 'Morph - Inlining' (IL size 32)
I tried enabling the if above that is for X86
but that doesn't make much difference as retTypeDesc.GetReturnRegCount()
returns 0
I'm not really sure what our desired path is here. In Debug(although that can't matter I suppose), this method ends up as
define i32 @S_P_CoreLib_System_ValueType__GetHashCode(i8* %0)
but that doesn't make much difference as
retTypeDesc.GetReturnRegCount()
returns 0
Yeah... It's the unfortunate problem that the LLVM struct returns cannot be modeled by the compiler.
So it'd probably come down to adding another #ifdef TARGET_WASM
there with the duplicated code:
assert(!iciCall->HasRetBufArg());
if (fgNeedReturnSpillTemp())
{
if (!impInlineInfo->retExpr)
{
// The inlinee compiler has figured out the type of the temp already. Use it here.
impInlineInfo->retExpr =
gtNewLclvNode(lvaInlineeReturnSpillTemp, lvaTable[lvaInlineeReturnSpillTemp].lvType);
}
}
else
{
impInlineInfo->retExpr = op2;
}
(Or #ifdef
-ing away the return reg count check for WASM in one of existing clauses)
Thanks, that looks pretty tidy now:
define i32 @S_P_CoreLib_System_ValueType__GetHashCode(i8* %0) {
Prolog:
br label %1
1: ; preds = %Prolog
%2 = getelementptr i8, i8* %0, i32 0
%3 = bitcast i8* %2 to i8**
%4 = load i8*, i8** %3, align 4
%5 = bitcast i8* %4 to i32*
%6 = load i32, i32* %5, align 4
%7 = getelementptr i8, i8* %0, i32 4
%8 = inttoptr i32 %6 to i8*
%9 = bitcast i8* %7 to i8**
store i8* %8, i8** %9, align 4
%10 = getelementptr i8, i8* %0, i32 4
%11 = bitcast i8* %10 to i8**
%12 = load i8*, i8** %11, align 4
%13 = getelementptr i8, i8* %12, i32 16
%14 = bitcast i8* %13 to i32*
%15 = load i32, i32* %14, align 4
%16 = getelementptr i8, i8* %0, i32 0
%17 = bitcast i8* %16 to i8**
%18 = load i8*, i8** %17, align 4
%19 = getelementptr i8, i8* %0, i32 8
%20 = getelementptr i8, i8* %0, i32 8
%21 = bitcast i8* %20 to i8**
store i8* %18, i8** %21, align 4
%22 = call i32 @S_P_CoreLib_System_ValueType__GetHashCodeImpl(i8* %19)
%xor = xor i32 %22, %15
ret i32 %xor
}
I want to make a note of some functions that might be worth a sanity check once the compilation is working to I'll do that here and fill out the LLVM when its done.
S_P_CoreLib_System_Collections_Generic_GenericEqualityComparer_1<S_P_CoreLib_System_DateTime>__Equals
define i8 @"S_P_CoreLib_System_Collections_Generic_GenericEqualityComparer_1<S_P_CoreLib_System_DateTime>__Equals"(i8* %0, %"[S.P.CoreLib]System.DateTime" %1, %"[S.P.CoreLib]System.DateTime" %2) {
Prolog:
%3 = alloca %"[S.P.CoreLib]System.DateTime", align 8
store %"[S.P.CoreLib]System.DateTime" %1, %"[S.P.CoreLib]System.DateTime"* %3, align 1
%4 = alloca %"[S.P.CoreLib]System.DateTime", align 8
store %"[S.P.CoreLib]System.DateTime" %2, %"[S.P.CoreLib]System.DateTime"* %4, align 1
br label %5
5: ; preds = %Prolog
%6 = bitcast %"[S.P.CoreLib]System.DateTime"* %4 to i8*
%7 = getelementptr i8, i8* %6, i16 0
%8 = bitcast i8* %7 to i64*
%9 = load i64, i64* %8, align 8
%10 = bitcast %"[S.P.CoreLib]System.DateTime"* %3 to i8*
%11 = getelementptr i8, i8* %10, i16 0
%12 = bitcast i8* %11 to i64*
%13 = load i64, i64* %12, align 8
%and = and i64 %13, 4611686018427387903
%and1 = and i64 %9, 4611686018427387903
%14 = icmp eq i64 %and, %and1
%15 = zext i1 %14 to i8
ret i8 %15
Looks ok visually at least.
Looking at what is wrong with Release config builds and why Optimisation was removed in #1801, this is the first problem.
We have
But when compiling with optimisation, you can get methods with LIR:
In this case the shadow stack arg is not tracked as there are no references,
lvInSsa
is then false, andisLlvmFrameLocal
returns true. Should this method includevarDsc->lvTracked
orlvRefCnt() != 0
orlclNum != _shadowStackLclNum
?cc @SingleAccretion