Open RKSimon opened 6 years ago
Would it be possible to adjust this to a more standard at&t format?
Would you mind creating a bug for this and assigning it to me? It will require assembling and disassembling the snippet (the current output is not asm but a dump of MCInsts).
Please could you also indicate the initial register values?
I'm working on providing some valid/interesting initial values. Right now they are not set, which is an issue for divides, for instance, since the values quickly degenerate to denormals.
For this I need to encode the operand semantics in the TD files. I'll sum up my thoughts in a document.
If the values are randomized with each iteration, this should be indicated as well.
It is, I'm also working on providing a seed to make the results reproducible.
Finally - it's a really bad idea to mix VEX (VPEXTRDrr) and non-VEX (PINSRDrr) like that, especially on Intel, we should use VPINSRDrr in that case.
Are you referring to this? https://stackoverflow.com/questions/41303780/why-is-this-sse-code-6-times-slower-without-vzeroupper-on-skylake
This is an interesting challenge because we know nothing about encoding when we select individual instructions. I'd need to come up with a generic scheme to discard such configurations.
Or we could keep them, and they would show up as different clusters in the analysis tool? I like the idea of sampling and seeing the patterns emerge.
I saw that there is a snippet now instead of an asm excerpt - thank you!
llvm-exegesis -benchmark-mode=latency -opcode-name=VPEXTRDrr
VPEXTRDrr R11D, XMM7, 1
PINSRDrr XMM7, XMM7, R11D, 1
Would it be possible to adjust this to a more standard at&t format?
VPEXTRD $1, %XMM7, %R11D
PINSRD $1, %R11D, %XMM7
Note the dst/src0 merging in PINSRD!
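To make the requested transformation concrete, here is a rough Python sketch of the rewrite from the MCInst-style dump to the AT&T-style form shown above. It is purely illustrative: `mcinst_to_att` and its `tied_dst` flag are hypothetical names, not part of llvm-exegesis, and stripping a trailing `rr` from the opcode name is a simplifying assumption that only covers the two opcodes in this report. A real fix would go through assembling and disassembling the snippet rather than string manipulation.

```python
import re


def mcinst_to_att(line, tied_dst=False):
    """Rewrite one 'OPCODE dst, src, ...' dump line into AT&T operand order.

    Hypothetical helper for illustration only. `tied_dst` marks instructions
    whose first source is tied to the destination (as in PINSRD), so the
    duplicated leading register is printed only once.
    """
    opcode, _, rest = line.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",") if op.strip()]

    # Merge dst/src0 when they are the same tied register.
    if tied_dst and len(operands) >= 2 and operands[0] == operands[1]:
        operands = operands[1:]

    # Assumption: the mnemonic is the opcode name minus a trailing "rr".
    mnemonic = re.sub(r"rr$", "", opcode)

    def fmt(op):
        # Immediates get '$', everything else is treated as a register.
        is_imm = re.fullmatch(r"-?(0x[0-9A-Fa-f]+|\d+)", op) is not None
        return ("$" if is_imm else "%") + op

    # AT&T order is sources first, destination last.
    return mnemonic + " " + ", ".join(fmt(op) for op in reversed(operands))


if __name__ == "__main__":
    print(mcinst_to_att("VPEXTRDrr R11D, XMM7, 1"))
    print(mcinst_to_att("PINSRDrr XMM7, XMM7, R11D, 1", tied_dst=True))
```

Whether an instruction has a tied operand cannot be inferred from the dump text alone, which is why the sketch takes it as an explicit flag.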
Please could you also indicate the initial register values? If the values are randomized with each iteration, this should be indicated as well.
Something like:
XMM7 = 0xFFFFFFFF,0xFFFFFFFF,0xFFFFFFFF,0xFFFFFFFF
R11D = 0xFFFFFFFF
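As an illustration of how randomized but reproducible initial values could be reported in this format, here is a minimal Python sketch. The function names and the idea of passing a single integer seed are assumptions made for the example; nothing here reflects how llvm-exegesis actually chooses or prints register values.

```python
import random


def initial_register_values(seed):
    """Draw 32-bit values for the registers used in the snippet.

    Hypothetical sketch: a dedicated random.Random instance seeded with
    `seed` makes the drawn values repeatable from run to run.
    """
    rng = random.Random(seed)
    return {
        "XMM7": [rng.getrandbits(32) for _ in range(4)],  # four 32-bit lanes
        "R11D": rng.getrandbits(32),
    }


def format_register_values(values):
    """Render the values one register per line, lanes comma-separated."""
    lines = []
    for name, value in values.items():
        lanes = value if isinstance(value, list) else [value]
        lines.append(name + " = " + ",".join(f"0x{lane:08X}" for lane in lanes))
    return lines


if __name__ == "__main__":
    seed = 1234  # arbitrary example seed
    print(f"seed = {seed}")
    for line in format_register_values(initial_register_values(seed)):
        print(line)
```

Printing the seed next to the values would let a reader regenerate the same inputs, which covers both the "initial values" and the "randomized per iteration" requests.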
Finally - it's a really bad idea to mix VEX (VPEXTRDrr) and non-VEX (PINSRDrr) like that, especially on Intel, we should use VPINSRDrr in that case.
I agree that would be useful, in the meantime you can use https://onlinedisassembler.com/odaweb/ and copy/paste the bytes.
FYI: I'm currently reworking the code completely to add support for instructions that can't be serial by themselves (i.e. that need another instruction to make the execution sequential).
assigned to @gchatelet
Extended Description
llvm-exegesis gives us the 'asm excerpt' for the opcode benchmark that is run under the JIT, but it'd be very useful if this were available as disassembly as well, either by default or just under a verbose mode [Bug #37049].
Maybe 2 iterations of the asm should be enough to help see what is being run?