Open amonakov opened 1 year ago
Integer pcmpeq* with source=dest sets destination to all-ones without dependency on source (but still occupies an execution unit). For example, the following loop runs at one cycle per iteration on Skylake, while uiCA predicts two:
    loop:
        vpcmpeqd xmm0, xmm0, xmm0
        vpor xmm0, xmm0, xmm0
        dec ecx
        jnz loop
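
To illustrate where the one-versus-two cycle gap could come from, here is a rough latency-only sketch in Python. It is not uiCA's model and ignores ports, the front end, and macro-fusion; it only tracks the loop-carried dependency chains. The function name `cycles_per_iteration`, the tuple layout, and the 1-cycle latencies are assumptions made for this illustration.

```python
# Latency-only sketch: each instruction is (dest, sources, latency).
# Assumption: every instruction here has a 1-cycle latency.

def cycles_per_iteration(loop, iterations=1000):
    """Estimate cycles/iteration from loop-carried dependency chains only."""
    ready = {}  # register -> cycle at which its latest value is available
    for _ in range(iterations):
        for dest, sources, latency in loop:
            # An instruction can start once all of its sources are ready.
            start = max((ready.get(s, 0) for s in sources), default=0)
            ready[dest] = start + latency
    return max(ready.values()) / iterations

# vpcmpeqd modeled as reading xmm0 (a true dependency on the source).
with_dependency = [
    ("xmm0", ["xmm0"], 1),  # vpcmpeqd xmm0, xmm0, xmm0
    ("xmm0", ["xmm0"], 1),  # vpor xmm0, xmm0, xmm0
    ("ecx", ["ecx"], 1),    # dec ecx (jnz omitted from the model)
]

# vpcmpeqd modeled as dependency-breaking: it has no register sources.
dependency_broken = [
    ("xmm0", [], 1),        # vpcmpeqd xmm0, xmm0, xmm0
    ("xmm0", ["xmm0"], 1),  # vpor xmm0, xmm0, xmm0
    ("ecx", ["ecx"], 1),    # dec ecx (jnz omitted from the model)
]

if __name__ == "__main__":
    print(cycles_per_iteration(with_dependency))
    print(cycles_per_iteration(dependency_broken))
```

Under these assumptions, the first variant is bound by the two-instruction xmm0 chain, while the second is bound only by the `dec ecx` chain, which matches the two figures quoted above.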