llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[MCA] ADC executed when carry not generated #62507

Open 837951602 opened 1 year ago

837951602 commented 1 year ago
$ llvm-mca kkkk -mcpu=skylake -timeline --timeline-max-iterations=2 --timeline-max-cycles=999
...
Timeline view:
Index     01234567

[0,0]     DeER . .   addq   %r8, %r9
[0,1]     D=eER. .   adcq   %r10, %r11
[0,2]     D==eER .   adcq   %r12, %r13
[0,3]     D===eER.   adcq   %r14, %r15
[0,4]     DeE---R.   incq   %rax
[0,5]     D=eE--R.   adcq   %rbx, %rcx
[1,0]     .DeE--R.   addq   %r8, %r9
[1,1]     .D=eE-R.   adcq   %r10, %r11
[1,2]     .D==eER.   adcq   %r12, %r13
[1,3]     .D===eER   adcq   %r14, %r15
[1,4]     .DeE---R   incq   %rax
[1,5]     .D===eER   adcq   %rbx, %rcx

$ llvm-mca --version
Ubuntu LLVM version 17.0.0
  Optimized build.
  Default target: x86_64-pc-linux-gnu

[0,5] starts executing on cycle 2, but it relies on the carry produced by [0,3], which does not start executing until cycle 4.

Problem found while discussing https://stackoverflow.com/questions/76151320/

topperc commented 1 year ago

We only model a single EFLAGS register. INC is modeled as writing EFLAGS, but not reading it. Because it preserves C, it should technically read EFLAGS too. CodeGen never relies on INC leaving the C flag unchanged, so it would not generate the code seen here.

From a microarchitecture perspective, Skylake renames C separately from OSPAZ. This allows the INC to execute early, since it doesn't need to read the C flag in order to preserve it. Some older microarchitectures don't do this.
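To make the difference concrete, here is a minimal Python sketch of the three ways the flags could be tracked for the sequence in the report. It is a toy model, not how llvm-mca is implemented: it looks only at flag dependencies within one iteration, assumes a latency of 1 cycle for every instruction, and ignores dispatch width and ports. The names `Insn`, `merge` and `start_cycles` are made up for this illustration.

```python
from typing import NamedTuple

class Insn(NamedTuple):
    name: str
    reads: frozenset   # flag groups the instruction reads
    writes: frozenset  # flag groups the instruction writes

ALL = frozenset({"C", "OSPAZ"})

# The six instructions from the timeline, described per flag group.
PROGRAM = [
    Insn("addq", frozenset(), ALL),
    Insn("adcq", frozenset({"C"}), ALL),
    Insn("adcq", frozenset({"C"}), ALL),
    Insn("adcq", frozenset({"C"}), ALL),
    Insn("incq", frozenset(), frozenset({"OSPAZ"})),  # preserves C
    Insn("adcq", frozenset({"C"}), ALL),
]

def merge(insn, partial_write_reads):
    """Collapse the flag groups into a single EFLAGS register.

    If partial_write_reads is True, an instruction that writes only some
    flags must also read EFLAGS so it can preserve the rest.
    """
    reads = {"EFLAGS"} if insn.reads else set()
    if partial_write_reads and insn.writes and insn.writes != ALL:
        reads = {"EFLAGS"}
    writes = {"EFLAGS"} if insn.writes else set()
    return Insn(insn.name, frozenset(reads), frozenset(writes))

def start_cycles(program, latency=1):
    """Earliest start cycle per instruction, limited only by flag dependencies."""
    ready = {}   # resource -> cycle at which its latest value is available
    starts = []
    for insn in program:
        start = max((ready.get(r, 0) for r in insn.reads), default=0)
        starts.append(start)
        for w in insn.writes:
            ready[w] = start + latency
    return starts

# Single EFLAGS, INC writes but does not read (the modeling described above).
single_write_only = start_cycles([merge(i, False) for i in PROGRAM])
# Single EFLAGS, INC also reads (no separate renaming of C).
single_read_write = start_cycles([merge(i, True) for i in PROGRAM])
# C renamed separately from OSPAZ.
split = start_cycles(PROGRAM)

for label, cycles in [("single, write-only INC", single_write_only),
                      ("single, read+write INC", single_read_write),
                      ("split C / OSPAZ", split)]:
    print(f"{label:24s} {cycles}")
```

In this toy model only the split tracking lets the INC start early while the final ADC still waits for the carry chain. With a single write-only EFLAGS the final ADC starts right after the INC, which is the behaviour reported in the timeline.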

@adibiagio @RKSimon is there some way we can model this dependency in llvm-mca without affecting the EFLAGS register behavior in CodeGen?

RKSimon commented 1 year ago

> @adibiagio @RKSimon is there some way we can model this dependency in llvm-mca without affecting the EFLAGS register behavior in CodeGen?

Not easily and something like this done just for MCA would be very brittle - I'd love to see EFLAGS remodeled so we can (optionally) update instructions to show which individual flags they read/write/clear/set/undef but that will take some time.

topperc commented 1 year ago

> > @adibiagio @RKSimon is there some way we can model this dependency in llvm-mca without affecting the EFLAGS register behavior in CodeGen?
>
> Not easily and something like this done just for MCA would be very brittle - I'd love to see EFLAGS remodeled so we can (optionally) update instructions to show which individual flags they read/write/clear/set/undef but that will take some time.

Are there many instructions that leave a flag unmodified instead of undefined?

One weird case I can remember is that shifts don't update any flags when the shift count is 0, and the overflow flag is only defined for shifts by 1.
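As a rough illustration of that shift behaviour, the sketch below maps a shift count to the effect on each flag. The two rules from the comment above (count 0 leaves all flags alone, O is only defined for a count of 1) are encoded directly. Treating A as undefined for non-zero counts is my reading of the Intel SDM, and the sketch assumes the count has already been masked and is smaller than the operand width. The function name `shift_flag_effects` is hypothetical.

```python
def shift_flag_effects(count: int) -> dict:
    """Per-flag effect of SHL/SHR/SAR for an already-masked shift count."""
    flags = ("C", "O", "S", "Z", "P", "A")
    if count == 0:
        # A shift by 0 leaves every flag untouched.
        return {f: "unchanged" for f in flags}
    effects = {f: "written" for f in ("C", "S", "Z", "P")}
    # O is only defined for 1-bit shifts.
    effects["O"] = "written" if count == 1 else "undefined"
    # Assumption from the Intel SDM: A is undefined for non-zero counts.
    effects["A"] = "undefined"
    return effects

for n in (0, 1, 3):
    print(n, shift_flag_effects(n))
```

For a shift by CL the count is only known at run time, so a static description of the instruction cannot pick one of these rows in advance.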

RKSimon commented 1 year ago

Besides INC, the ones I know of are RCL/RCR and the ADX instructions. Another part of the problem is that Intel and AMD haven't always matched on UNDEF vs passthrough behaviour.

837951602 commented 1 year ago

STC (modifies C, preserves OSPAZ) and BT (modifies C, preserves Z, leaves OSPA undefined) break the dependency on the carry flag only. On my znver1, they do break it.

.section .text
.globl main
main:

    stc                     # sets C, preserves OSPAZ
    adcq %r8, %r9           # reads C
    adcq %r10, %r11
    adcq %r12, %r13

    jmp main                # loop back to the stc
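The per-flag effects mentioned in this thread could be written down as a small table, in the spirit of the "read/write/clear/set/undef" remodeling suggested earlier. This is only a sketch of the idea in Python, not an LLVM data structure: `Effect`, `FLAG_EFFECTS` and `passes_through` are hypothetical names, and the entries cover only the instructions discussed here.

```python
from enum import Enum

class Effect(Enum):
    WRITE = "write"        # flag gets a new defined value
    PRESERVE = "preserve"  # flag passes through unchanged
    UNDEF = "undef"        # flag value is architecturally undefined

FLAGS = "OSPAZC"

def effects(write="", undef=""):
    """Build a per-flag table; anything not listed is preserved."""
    return {
        f: Effect.WRITE if f in write
        else Effect.UNDEF if f in undef
        else Effect.PRESERVE
        for f in FLAGS
    }

# Entries taken from the descriptions in this thread.
FLAG_EFFECTS = {
    "inc": effects(write="OSPAZ"),          # preserves C
    "stc": effects(write="C"),              # preserves OSPAZ
    "bt":  effects(write="C", undef="OSPA"),  # preserves Z
    "adc": effects(write="OSPAZC"),
}

def passes_through(mnemonic: str, flag: str) -> bool:
    """True if a later reader of `flag` still depends on an earlier producer."""
    return FLAG_EFFECTS[mnemonic][flag] is Effect.PRESERVE

print(passes_through("inc", "C"))  # ADC after INC still needs the old carry
print(passes_through("stc", "C"))  # ADC after STC does not
```

Whether an undefined flag breaks a dependency in hardware is left open here, since that can differ between vendors.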