Open Quuxplusone opened 6 years ago
This is a good idea.
From a design perspective I think it's better to have a script that runs llvm-mca and llvm-exegesis separately, so that the two tools don't depend on each other.
For llvm-exegesis the inputs would be:
- the snippet
- the state of the registers and FPU on entry
I'm unsure how precise we can be on the number of cycles/uOps:
- for a single execution the numbers will be noisy (they include setup time and counter on/off overhead)
- for many executions they will be more precise, but we need to subtract the setup and loop instructions from the measurement.
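The overhead-removal idea above can be sketched as follows. This is a minimal illustration with made-up numbers, not llvm-exegesis internals: if the counter reading for n loop iterations is modeled as cycles(n) = setup + n * kernel, then measuring at two different iteration counts cancels the constant setup/counter overhead.

```python
# Sketch: cancel constant setup/counter overhead by measuring the same
# snippet at two different iteration counts. Numbers are synthetic; a real
# harness would read a hardware cycle counter around the loop.

def per_iteration_cycles(n1: int, c1: float, n2: int, c2: float) -> float:
    """Estimate steady-state cycles per iteration from two measurements.

    Model: cycles(n) = setup + n * kernel, so the constant 'setup' term
    (counter on/off, register initialization) cancels in the difference.
    """
    return (c2 - c1) / (n2 - n1)

# Synthetic example: setup = 500 cycles, kernel = 3 cycles/iteration.
setup, kernel = 500, 3
c1 = setup + 1_000 * kernel    # "measured" at 1,000 iterations
c2 = setup + 100_000 * kernel  # "measured" at 100,000 iterations

print(per_iteration_cycles(1_000, c1, 100_000, c2))  # 3.0
```

This still assumes the loop instructions themselves are either negligible or folded into the per-iteration cost, which is one of the open questions above.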
(In reply to Guillaume Chatelet from comment #1)
> This is a good idea.
>
> From a design perspective I think it's better to have a script that runs
> llvm-mca and llvm-exegesis separately so that the two tools don't depend on
> each other.
>
I think that makes sense. Internally we're using a script that compares llvm-mca's estimated IPC against the IPC measured with Linux perf, on both 'real-world' and semi-randomized instruction sequences, using a simple asm harness that runs each snippet in a loop. We're getting some nice results from that; we should potentially consider upstreaming it to llvm/utils or similar. Right now it's quite btver2-specific, though.
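As a rough sketch of what such a comparison could look like (the parsing helper and the snippet of llvm-mca output below are assumptions for illustration, not the internal script): pull the IPC value out of llvm-mca's summary and compare it against the IPC computed from perf-style instruction and cycle counts.

```python
import re

def mca_ipc(summary: str) -> float:
    """Extract the IPC value from an llvm-mca summary block (assumed format)."""
    m = re.search(r"^IPC:\s+([0-9.]+)", summary, re.MULTILINE)
    if m is None:
        raise ValueError("no IPC line found")
    return float(m.group(1))

def measured_ipc(instructions: int, cycles: int) -> float:
    """IPC from perf-style counters: instructions retired / cycles."""
    return instructions / cycles

def relative_error(predicted: float, measured: float) -> float:
    return abs(predicted - measured) / measured

# Canned text resembling llvm-mca's summary; all values are made up.
summary = """\
Iterations:        100
Instructions:      300
Total Cycles:      154
IPC:               1.95
"""

pred = mca_ipc(summary)                    # 1.95
meas = measured_ipc(2_925_000, 1_500_000)  # 1.95
print(f"relative error: {relative_error(pred, meas):.1%}")
```

A real driver would invoke llvm-mca and the perf harness as subprocesses on the same snippet and report the error per sequence.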
We could certainly consider expanding that to incorporate llvm-exegesis results.
For a given code snippet, it would be very useful for llvm-exegesis to compare the actual performance vs llvm-mca's prediction.
This could be driven by code snippets taken from real code, or possibly generated by a fuzzer.