Open BruceForstall opened 4 years ago
@dotnet/jit-contrib I'm currently working on this, but would like to get some early feedback.
For the first iteration of this, I am continuing to use the GC info represented in the JIT data structures, rather than decoding the GC info after it's generated.
In addition, I'm modifying the GC dump to dump the info after the instruction that it pertains to, since it reflects the state after the instruction has been executed (I have always found it confusing that it printed it before). I'm also deleting some information that was redundant, and consolidating multiple updates into a single line.
Here's a snippet of a disassembly (with COMPlus_JitDiffableDasm
and a new COMPlus_JitDisasmWithGC=1
:
mov rcx, gword ptr [rcx]
; gcrRegs +[rcx]
call System.Console:WriteLine(System.String,System.Object,System.Object)
; gcrRegs -[rax rcx rdx rbp r8]
;; bbWeight=0.50 PerfScore 4.25
G_M17768_IG10: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov ecx, edi
call TreeNode:bottomUpTree(int):TreeNode
; gcrRegs +[rax]
Here's the 'before' snippet of the corresponding jitdump for those same instructions:
gcrReg +[rcx]
IN0021: 000093 mov rcx, gword ptr [rcx]
New gcrReg live regs=00000000 {}
gcrReg -[rax]
gcrReg -[rcx]
gcrReg -[rdx]
gcrReg -[rbp]
gcrReg -[r8]
[D1FFAB1E] ptr arg pop 0
IN0022: 000096 call System.Console:WriteLine(System.String,System.Object,System.Object)
;; bbWeight=0.50 PerfScore 4.25
G_M17768_IG10: ; func=00, offs=00009BH, size=0019H, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
IN0023: 00009B mov ecx, edi
New gcrReg live regs=00000001 {rax}
gcrReg +[rax]
[D1FFAB1E] ptr arg pop 0
IN0024: 00009D call TreeNode:bottomUpTree(int):TreeNode
Here's the same snippet with my changes:
IN0021: 000093 mov rcx, gword ptr [rcx]
; gcrRegs +[rcx]
IN0022: 000096 call System.Console:WriteLine(System.String,System.Object,System.Object)
; gcrRegs -[rax rcx rdx rbp r8]
;; bbWeight=0.50 PerfScore 4.25
G_M17768_IG10: ; func=00, offs=00009BH, size=0019H, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
IN0023: 00009B mov ecx, edi
IN0024: 00009D call TreeNode:bottomUpTree(int):TreeNode
; gcrRegs +[rax]
I like it!
I'm modifying the GC dump to dump the info after the instruction that it pertains to, since it reflects the state after the instruction has been executed (I have always found it confusing that it printed it before)
I've also often been confused by this. Does the new behavior match !u -gcinfo
in SOS? Will all output of GC info in JIT dump be consistent here?
Does the new behavior match !u -gcinfo in SOS?
I'm not sure - I'm not much of a user of SOS, but I'll check.
Will all output of GC info in JIT dump be consistent here?
Yes, that's the plan. The output methods are shared, at least for now, and I would presume we'd keep the same approach when/if we actually decode the GC info.
It would be also nice to match where R2RDump puts it.
It would be also nice to match where R2RDump puts it.
I'm not aware of R2RDump - can you point me to it?
It's in the source tree as src\coreclr\src\Tools\r2rdump
AFAICT, R2RDump simply dumps the GC info after the method (and doesn't seem to actually report ranges, just slots) I don't propose to implement interleaving of GC info for R2RDump, so I'm not sure in what way I would match R2RDump.
R2RDump actually interleaves the stuff with disassembly. Here is an example that I've just created from S.P.C.:
void System.Runtime.CompilerServices.NullableAttribute..ctor(byte)
Id: 2
StartAddress: 0x0047EE10
Size: 114 bytes
UnwindRVA: 0x00B74D6C
Version: 1
Flags: 0x03 EHANDLER UHANDLER
SizeOfProlog: 0x0005
CountOfUnwindCodes: 2
FrameRegister: None
FrameOffset: 0x0
PersonalityRVA: 0x3A749
UnwindCode[0]: CodeOffset 0x0005 FrameOffset 0x0000 NextOffset 0x0 Op 48
Debug Info
Bounds:
Native Offset: 0x0, Prolog, Source Types: StackEmpty
Native Offset: 0x1B, IL Offset: 0x0000, Source Types: StackEmpty
Native Offset: 0x1F, IL Offset: 0x0001, Source Types: CallInstruction
Native Offset: 0x25, IL Offset: 0x0006, Source Types: StackEmpty
Native Offset: 0x26, IL Offset: 0x0007, Source Types: StackEmpty
Native Offset: 0x3D, IL Offset: 0x000e, Source Types: SourceTypeInvalid
Native Offset: 0x6B, IL Offset: 0x0017, Source Types: StackEmpty
Native Offset: 0x6C, Epilog, Source Types: StackEmpty
Variable Locations:
Variable Number: 1
Start Offset: 0x0
End Offset: 0x1B
Loc Type: VLT_REG
Register: RDX
Variable Number: 0
Start Offset: 0x0
End Offset: 0x1B
Loc Type: VLT_REG
Register: RCX
Variable Number: 1
Start Offset: 0x1B
End Offset: 0x6C
Loc Type: VLT_STK
Base Register: RBP
Stack Offset: 24
Variable Number: 0
Start Offset: 0x1B
End Offset: 0x6C
Loc Type: VLT_STK
Base Register: RBP
Stack Offset: 16
47ee10: 55 push rbp
UWOP_PUSH_NONVOL RBP(5) 0
47ee11: 48 83 ec 30 sub rsp, 48
UWOP_ALLOC_SMALL 48 0
47ee15: 48 8d 6c 24 30 lea rbp, [rsp + 48]
47ee1a: 33 c0 xor eax, eax
47ee1c: 48 89 45 f8 mov qword ptr [rbp - 8], rax
47ee20: 48 89 45 f0 mov qword ptr [rbp - 16], rax
47ee24: 48 89 4d 10 mov qword ptr [rbp + 16], rcx
47ee28: 89 55 18 mov dword ptr [rbp + 24], edx
47ee2b: 48 8b 4d 10 mov rcx, qword ptr [rbp + 16]
RCX is live
47ee2f: ff 15 bb f3 b8 ff call qword ptr [0xe1f0] // void System.Attribute..ctor() (METHOD_ENTRY_DEF_TOKEN)
RCX is dead
47ee35: 90 nop
47ee36: 48 8b 4d 10 mov rcx, qword ptr [rbp + 16]
RCX is live
47ee3a: 48 89 4d f8 mov qword ptr [rbp - 8], rcx
47ee3e: b9 01 00 00 00 mov ecx, 1
RCX is dead
47ee43: ff 15 47 23 b8 ff call qword ptr [0x1190] // byte[] (NEW_ARRAY)
RAX is live
47ee49: 48 89 45 f0 mov qword ptr [rbp - 16], rax
47ee4d: 48 8b 55 f0 mov rdx, qword ptr [rbp - 16]
RDX is live
47ee51: 33 c9 xor ecx, ecx
47ee53: 3b 4a 08 cmp ecx, dword ptr [rdx + 8]
47ee56: 72 05 jb 0x47ee5d
47ee58: e8 08 b9 bb ff call 0x3a765 // JMP [0x1028]: RNG_CHK_FAIL (HELPER)
RAX is dead
47ee5d: 8b c1 mov eax, ecx
47ee5f: 48 8d 54 02 10 lea rdx, [rdx + rax + 16]
RDX is live
RDX is dead
47ee64: 8b 4d 18 mov ecx, dword ptr [rbp + 24]
47ee67: 88 0a mov byte ptr [rdx], cl
47ee69: 48 8b 55 f8 mov rdx, qword ptr [rbp - 8]
RDX is dead
RDX is live
47ee6d: 48 8d 4a 08 lea rcx, [rdx + 8]
RCX is live
47ee71: 48 8b 55 f0 mov rdx, qword ptr [rbp - 16]
47ee75: ff 15 b5 21 b8 ff call qword ptr [0x1030] // WRITE_BARRIER (HELPER)
RCX is dead
RDX is dead
47ee7b: 90 nop
47ee7c: 48 8d 65 00 lea rsp, [rbp]
47ee80: 5d pop rbp
47ee81: c3 ret
OK, it seems that R2RDump also does the right thing and reports the deltas after the instruction.
However, it individually reports each register or stack location (my changes will report them collectively). Also, the JIT dump reports variables on the stack by their JIT name (e.g. ; GC ptr vars +{V27}
).
I don't see any value in unifying these.
The interleaving of GC info was implemented in #42552, but work remains to incorporate its use in jitutils.
We should also consider whether additional information should be in the assembly dump with COMPlus_JitDisasmWithGC
so that any diff in the GC info is readily visible in an assembly diff.
Currently, when using
COMPlus_JitGCDump
/COMPlus_NgenGCDump
, GC info is displayed in a block at the end of compilation.Create options that allow the information to be interleaved into the asm dumped with
COMPlus_JitDisasm
/COMPlus_NgenDisasm
. These options would be in addition to the current ones, which are still useful.Interleaved output allows us to always add GC info into our asm diffs process. Currently, adding GC info into asm diffs requires enabling adding (back) instruction offsets into the asm output, because GC info is based on asm offsets. But adding these means that as soon as there is a single new instruction, or change in encoding size of an instruction, then all subsequent instructions in the function show up as a diff. This is really noisy. Thus, we rarely if ever pay attention to GC info diffs, which we should.
Note that the JIT dump does emit interleaved GC info as part of the "Instructions as they come out of the scheduler" phase. It would be better if the interleaved GC info dump suggested here come from decoding the fully generated GC info, the way the SOS
!u -gcinfo
command does it. Perhaps the code for!u -gcinfo
could be borrowed for this (see https://github.com/dotnet/diagnostics/pull/403).Changes should be made to jit-diff and possibly jit-analyze (https://github.com/dotnet/jitutils), and to superpmi.py, to use these new options.
category:implementation theme:asmdiffs skill-level:beginner cost:small impact:small