llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[llvm-exegesis] Output the jit asm snippet #36525

Open RKSimon opened 6 years ago

RKSimon commented 6 years ago
Bugzilla Link 37177
Version trunk
OS Windows NT
CC @adibiagio,@legrosbuffle,@topperc,@filcab,@gchatelet,@gregbedwell,@rotateright

Extended Description

llvm-exegesis gives us the 'asm excerpt' for the opcode benchmark that is run under jit, but it'd be very useful if this was available as disassembly as well - either by default or just under a verbose mode [Bug #37049].

Maybe 2 iterations of the asm should be enough to help see what is being run?

gchatelet commented 6 years ago

> Would it be possible to adjust this to a more standard at&t format?

Would you mind creating a bug for this and assigning it to me? It will require assembling and disassembling the snippet (the current output is not asm but a dump of MCInsts).

> Please could you also indicate the initial register values?

I'm working on providing some valid/interesting initial values. Right now they are not set, which is an issue for divide, for instance, since it quickly degenerates to denormals.

For this I need to encode the operand semantics in the TD files. I'll sum up my thoughts in a document.

> If the values are randomized with each iteration, this should be indicated as well.

It is, I'm also working on providing a seed to make the results reproducible.
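
To illustrate why a seed makes the randomized initial values reproducible, here is a minimal Python sketch. It is not llvm-exegesis code: the function names, the choice of registers, and the 32-bit lane width are assumptions made for illustration, and the printed layout simply follows the `REG = 0x...` form RKSimon suggested.

```python
import random


def initial_values(seed: int) -> dict:
    """Hypothetical helper: draw initial register values from a seeded generator."""
    rng = random.Random(seed)  # private generator, so the seed alone determines the values
    return {
        "XMM7": [rng.getrandbits(32) for _ in range(4)],  # four 32-bit lanes
        "R11D": rng.getrandbits(32),
    }


def describe(values: dict) -> str:
    """Render values as 'REG = 0x...,0x...' lines, one register per line."""
    lines = []
    for reg, val in values.items():
        lanes = val if isinstance(val, list) else [val]
        lines.append(f"{reg} = " + ",".join(f"0x{lane:08X}" for lane in lanes))
    return "\n".join(lines)


if __name__ == "__main__":
    # Running this twice with the same seed prints identical values.
    print(describe(initial_values(seed=42)))
```

Reporting the seed next to the benchmark result would let someone regenerate exactly the same initial values later.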

> Finally - it's a really bad idea to mix VEX (VPEXTRDrr) and non-VEX (PINSRDrr) like that, especially on Intel, we should use VPINSRDrr in that case.

Are you referring to this? https://stackoverflow.com/questions/41303780/why-is-this-sse-code-6-times-slower-without-vzeroupper-on-skylake

This is an interesting challenge because we know nothing about encoding when we select individual instructions. I'd need to come up with a generic scheme to discard such configurations.

Or we could keep them and they would show up as different clusters in the analysis tool? I like the idea of sampling and seeing the patterns emerge.
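
As a rough sketch of the "discard such configurations" option, the Python below filters out snippets that mix VEX-encoded and legacy-SSE opcodes. Everything here is hypothetical: the `ENCODING` table is hand-written for the three opcodes named in this thread, whereas a real scheme would have to derive the encoding from the instruction definitions, which is exactly the information said to be unavailable at selection time.

```python
def mixes_vex_and_legacy_sse(opcodes, encoding_of):
    """Return True if the snippet contains both VEX and legacy-SSE opcodes."""
    # Opcodes missing from the table are treated as 'other' and never trigger the check.
    kinds = {encoding_of.get(name, "other") for name in opcodes}
    return "vex" in kinds and "legacy_sse" in kinds


# Hand-written classification, for illustration only.
ENCODING = {
    "VPEXTRDrr": "vex",
    "VPINSRDrr": "vex",
    "PINSRDrr": "legacy_sse",
}

snippets = [
    ["VPEXTRDrr", "PINSRDrr"],   # the mixed pair reported in this issue
    ["VPEXTRDrr", "VPINSRDrr"],  # the all-VEX alternative
]

# Keep only the snippets that do not mix the two encodings.
kept = [s for s in snippets if not mixes_vex_and_legacy_sse(s, ENCODING)]
```

The alternative floated above, keeping the mixed snippets and letting them form their own cluster, would use the same predicate as a label rather than as a filter.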

RKSimon commented 6 years ago

I saw that there is a snippet now instead of an asm excerpt - thank you!

llvm-exegesis -benchmark-mode=latency -opcode-name=VPEXTRDrr

VPEXTRDrr R11D, XMM7, 1
PINSRDrr XMM7, XMM7, R11D, 1

Would it be possible to adjust this to a more standard at&t format?

VPEXTRD $1, %XMM7, %R11D
PINSRD $1, %R11D, %XMM7

Note the dst/src0 merging in PINSRD!
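
The rewrite requested here can be sketched as a toy string transformation in Python. This is only an illustration of the operand reordering and the dst/src0 merge, not how llvm-exegesis would do it; the function name is hypothetical, only the `rr` opcode suffix is handled, and treating two identical leading operands as a tied dst/src0 pair is an assumption that would misfire on instructions that legitimately name the same register twice.

```python
import re


def mcinst_to_att(line: str) -> str:
    """Toy rewrite of one 'OPCODE op0, op1, ...' dump line into AT&T-like text."""
    opcode, _, rest = line.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",") if op.strip()]

    # Drop the operand-form suffix from the opcode name (only 'rr' is handled).
    mnemonic = opcode[:-2] if opcode.endswith("rr") else opcode

    # Assumption: the same register twice at the front is a tied dst/src0 pair.
    if len(operands) >= 2 and operands[0] == operands[1]:
        operands = operands[1:]

    def fmt(op: str) -> str:
        # Immediates get '$', everything else is assumed to be a register and gets '%'.
        is_imm = re.fullmatch(r"-?(0x[0-9A-Fa-f]+|\d+)", op) is not None
        return ("$" if is_imm else "%") + op

    # AT&T syntax lists sources first and the destination last, so reverse the order.
    return mnemonic + " " + ", ".join(fmt(op) for op in reversed(operands))


if __name__ == "__main__":
    for dump in ("VPEXTRDrr R11D, XMM7, 1", "PINSRDrr XMM7, XMM7, R11D, 1"):
        print(mcinst_to_att(dump))
```

A robust version would go through the assembler and disassembler rather than rewrite strings, since only the instruction definitions know which operands are really tied.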

Please could you also indicate the initial register values? If the values are randomized with each iteration, this should be indicated as well.

Something like:

XMM7 = 0xFFFFFFFF,0xFFFFFFFF,0xFFFFFFFF,0xFFFFFFFF
R11D = 0xFFFFFFFF

Finally - it's a really bad idea to mix VEX (VPEXTRDrr) and non-VEX (PINSRDrr) like that, especially on Intel, we should use VPINSRDrr in that case.

gchatelet commented 6 years ago

I agree that would be useful. In the meantime you can use https://onlinedisassembler.com/odaweb/ and copy/paste the bytes.

FYI: I'm currently reworking the code completely to add support for instructions that can't be serial by themselves (i.e. that need another instruction to make the execution sequential).

RKSimon commented 6 years ago

assigned to @gchatelet