rust-lang / stdarch

Rust's standard library vendor-specific APIs and run-time feature detection
https://doc.rust-lang.org/stable/core/arch/
Apache License 2.0

_mm512_set4_epi64 reverses order of arguments #1555

Closed tslnc04 closed 3 months ago

tslnc04 commented 3 months ago

The implementation in core::arch for _mm512_set4_epi64 is

pub unsafe fn _mm512_set4_epi64(d: i64, c: i64, b: i64, a: i64) -> __m512i {
    let r = i64x8::new(d, c, b, a, d, c, b, a);
    transmute(r)
}

so the first argument provided (d) becomes the first, lowest-order lane (bits 63:0). However, the Intel Intrinsics Guide defines it as

__m512i _mm512_set4_epi64 (__int64 d, __int64 c, __int64 b, __int64 a)
dst[63:0] := a
dst[127:64] := b
dst[191:128] := c
dst[255:192] := d
dst[319:256] := a
dst[383:320] := b
dst[447:384] := c
dst[511:448] := d
dst[MAX:512] := 0

which means that the last argument provided (a) becomes the first, lowest-order lane, the reverse of what the Rust implementation does.
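To make the two lane orders concrete, here is a minimal sketch that models the 512-bit vector as a plain `[i64; 8]`, with index 0 standing for bits 63:0. The function names `set4_epi64_expected` and `set4_epi64_reported` are hypothetical helpers for illustration, not part of `core::arch`, and no AVX-512 hardware is needed to run it.

```rust
/// Lane layout from the Intel pseudocode quoted above:
/// lane 0 (bits 63:0) holds `a`, lane 3 holds `d`, then the pattern repeats.
/// Hypothetical model function, not a real intrinsic.
fn set4_epi64_expected(d: i64, c: i64, b: i64, a: i64) -> [i64; 8] {
    [a, b, c, d, a, b, c, d]
}

/// Lane layout produced by the implementation quoted in this issue,
/// which passes the arguments to `i64x8::new` in declaration order.
/// Hypothetical model function, not a real intrinsic.
fn set4_epi64_reported(d: i64, c: i64, b: i64, a: i64) -> [i64; 8] {
    [d, c, b, a, d, c, b, a]
}

fn main() {
    // Call both models with the same arguments: d = 4, c = 3, b = 2, a = 1.
    let expected = set4_epi64_expected(4, 3, 2, 1);
    let reported = set4_epi64_reported(4, 3, 2, 1);

    println!("expected lanes (Intel): {:?}", expected);
    println!("reported lanes (Rust):  {:?}", reported);

    // Each group of four lanes is reversed relative to the Intel definition.
    assert_ne!(expected, reported);
}
```

With distinct argument values the two layouts never agree, which is why the difference shows up as soon as the result is stored to memory or compared lane by lane.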

The implementation of _mm512_set_epi64 is correct, though, so in Rust _mm512_set4_epi64 and _mm512_set_epi64 disagree on argument order in a way that they do not in C. I've created this gist to show this difference between C and Rust.
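The relationship that holds in C can be sketched the same way: `_mm512_set_epi64` takes its arguments from the highest lane (`e7`) down to the lowest (`e0`), so a `set4` that forwards `(d, c, b, a, d, c, b, a)` to it lands `a` in lane 0, as the Intel definition requires. The sketch below again models the vector as `[i64; 8]` with index 0 as bits 63:0; `set_epi64_model` and `set4_epi64_via_set` are hypothetical names used only for illustration, and this is one possible way to express the consistent behavior, not necessarily how stdarch resolved the issue.

```rust
/// Model of `_mm512_set_epi64`: arguments run from the highest lane (e7)
/// to the lowest (e0), so e0 ends up at index 0 (bits 63:0).
/// Hypothetical model function, not the real intrinsic.
#[allow(clippy::too_many_arguments)]
fn set_epi64_model(
    e7: i64, e6: i64, e5: i64, e4: i64,
    e3: i64, e2: i64, e1: i64, e0: i64,
) -> [i64; 8] {
    [e0, e1, e2, e3, e4, e5, e6, e7]
}

/// A `set4` model defined in terms of `set_epi64_model`, repeating the
/// four arguments twice in the same high-to-low order.
fn set4_epi64_via_set(d: i64, c: i64, b: i64, a: i64) -> [i64; 8] {
    set_epi64_model(d, c, b, a, d, c, b, a)
}

fn main() {
    let lanes = set4_epi64_via_set(4, 3, 2, 1);
    println!("lanes: {:?}", lanes);

    // The last argument (a = 1) occupies the lowest lane.
    assert_eq!(lanes[0], 1);
    // The first argument (d = 4) occupies lanes 3 and 7.
    assert_eq!(lanes[3], 4);
    assert_eq!(lanes[7], 4);
}
```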