Open Quuxplusone opened 11 years ago
Attached dotps.cpp
(2475 bytes, application/octet-stream): Simple test case
What does gcc do?
(In reply to comment #1)
> What does gcc do?
It seems that gcc 4.7.2 (Ubuntu, 64-bit) also emits dpps with a memory
operand:
g++ -O3 -std=c++11 -march=native dotps.cpp
movaps (%rdx),%xmm0
dpps $0xff,0x4(%rdi,%rcx,4),%xmm0
I wonder whether this slowdown is limited to dpps or also affects other SSE
instructions. I also wonder whether it happens on AMD processors. Unfortunately,
I do not have access to any AMD processors to test on.
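For anyone who wants to try this on other hardware, here is a minimal sketch of the two forms being compared. It is not the attached dotps.cpp; the function names are made up for illustration, and it assumes GCC or Clang on x86 (for `__attribute__((target))`, the `"+x"` inline-asm constraint, and `__builtin_cpu_supports`) plus 16-byte-aligned inputs whose length is a multiple of 4. In the first variant the compiler is free to fold the aligned load into dpps as a memory operand; in the second, an empty asm statement is used to keep the loaded value in an XMM register so that dpps gets a register operand. Whether folding actually happens depends on the compiler and flags, so check the disassembly.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define DOT_HAVE_X86 1
#endif

// Scalar reference: sum of a[i] * b[i] over n floats.
float dot_scalar(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}

#ifdef DOT_HAVE_X86
// Variant 1 (hypothetical name): the load of b may be folded into dpps
// as a memory operand. Requires 16-byte-aligned a and b, n % 4 == 0.
__attribute__((target("sse4.1")))
float dot_dpps_foldable(const float* a, const float* b, std::size_t n) {
    assert(n % 4 == 0);
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_load_ps(a + i);
        __m128 vb = _mm_load_ps(b + i);
        // Mask 0xff: multiply all four lanes, write the sum to every lane.
        sum += _mm_cvtss_f32(_mm_dp_ps(va, vb, 0xff));
    }
    return sum;
}

// Variant 2 (hypothetical name): same computation, but the empty asm
// makes the compiler materialize vb in an XMM register before dpps,
// so the load cannot be folded into the instruction.
__attribute__((target("sse4.1")))
float dot_dpps_register(const float* a, const float* b, std::size_t n) {
    assert(n % 4 == 0);
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_load_ps(a + i);
        __m128 vb = _mm_load_ps(b + i);
        asm volatile("" : "+x"(vb));
        sum += _mm_cvtss_f32(_mm_dp_ps(va, vb, 0xff));
    }
    return sum;
}
#endif

// Runtime check so the SSE4.1 variants are only called where supported.
bool cpu_has_sse41() {
#ifdef DOT_HAVE_X86
    return __builtin_cpu_supports("sse4.1") != 0;
#else
    return false;
#endif
}
```

Timing both variants over a large aligned buffer, and confirming with objdump which one ended up with the memory operand, should show whether the gap reproduces on a given CPU.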
Which Core i7, Penryn or Sandy Bridge?
AFAIK AMD's K8-based microarchitectures might have a stronger address generator than Intel's. Not sure about Bulldozer.
(In reply to comment #3)
> Which Core i7, Penryn or Sandy Bridge?
>
> AFAIK AMD's K8-based microarchitectures might have a stronger address
> generator than Intel's. Not sure about Bulldozer.
This is a Sandy Bridge.