Closed by eriksjolund 7 years ago
Hi, thanks for the bug report.
VPBLENDD operates only on immediate masks that are encoded into the instruction itself, much like the way VPSHUFD or VSHUFPS perform shuffling. It's not possible to perform blending with that instruction when the mask is held in a register.
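To illustrate the difference, here is a rough scalar model of the two blend styles. It is only a sketch, not libsimdpp code and not real intrinsics: the names Vec256, blend_imm32 and blendv_bytes are made up for this example. The immediate-style blend takes its mask as a compile-time template argument, mirroring an imm8 operand, while the variable blend reads the mask from a runtime value, mirroring a mask register.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 256-bit register holding 8 dwords.
using Vec256 = std::array<std::uint32_t, 8>;

// Models an immediate-mask dword blend (VPBLENDD style).
// The mask is a compile-time constant: bit i selects b[i], otherwise a[i].
template <std::uint8_t Imm>
Vec256 blend_imm32(const Vec256& a, const Vec256& b) {
    Vec256 r{};
    for (int i = 0; i < 8; ++i) {
        r[i] = ((Imm >> i) & 1) ? b[i] : a[i];
    }
    return r;
}

// Models a variable-mask byte blend (PBLENDVB style).
// The mask is a runtime value: the top bit of each mask byte selects
// the byte from b, otherwise the byte from a.
Vec256 blendv_bytes(const Vec256& a, const Vec256& b, const Vec256& mask) {
    std::array<std::uint8_t, 32> ab{}, bb{}, mb{}, rb{};
    std::memcpy(ab.data(), a.data(), 32);
    std::memcpy(bb.data(), b.data(), 32);
    std::memcpy(mb.data(), mask.data(), 32);
    for (int i = 0; i < 32; ++i) {
        rb[i] = (mb[i] & 0x80) ? bb[i] : ab[i];
    }
    Vec256 r{};
    std::memcpy(r.data(), rb.data(), 32);
    return r;
}
```

In this model, blend_imm32<0x0F>(a, b) can only be written when the mask is known at compile time, whereas blendv_bytes(a, b, mask) accepts a mask computed at run time, such as the result of a comparison.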
I believe the instruction VPBLENDD is faster than the instruction PBLENDVB. The first instruction only handles dwords but the second instruction handles bytes.
For that reason I thought that a simdpp::blend() that makes use of a mask_int32<8> would be compiled into a VPBLENDD instead of a PBLENDVB. My test program gets compiled into PBLENDVB:
Do you know why PBLENDVB is being used and not VPBLENDD?