VcDevel / Vc

SIMD Vector Classes for C++
BSD 3-Clause "New" or "Revised" License

question: How to get the underlying vector from a `Vc::simd`? #186

Open twesterhout opened 6 years ago

twesterhout commented 6 years ago

about branch: master

Background

My question is about "extending" Vc, i.e. wrapping some intrinsics in Vc functions. Say I have some C functions:

extern "C" {
#if defined(Vc_HAVE_FULL_SSE_ABI)
__m128  __svml_atan2f4(__m128, __m128);
__m128d __svml_atan22(__m128d, __m128d);
#endif // Vc_HAVE_FULL_SSE_ABI

#if defined(Vc_HAVE_FULL_AVX_ABI)
__m256  __svml_atan2f8(__m256, __m256);
__m256d __svml_atan24(__m256d, __m256d);
#endif // Vc_HAVE_FULL_AVX_ABI

#if defined(Vc_HAVE_FULL_AVX512_ABI)
__m512  __svml_atan2f16(__m512, __m512);
__m512d __svml_atan28(__m512d, __m512d);
#endif // Vc_HAVE_FULL_AVX512_ABI
} // extern "C"

I have written something like this to wrap them:

namespace detail {
#define ATAN2_FN(suffix, type)                                       \
    auto _atan2(type const x, type const y) noexcept->type           \
    {                                                                \
        return __svml_atan2##suffix(x, y);                           \
    }                                                                \
    static_assert(true, "") /* consumes the semicolon at the use site */

#if defined(Vc_HAVE_FULL_SSE_ABI)
    ATAN2_FN(f4, __m128);
    ATAN2_FN(2, __m128d);
#endif // Vc_HAVE_FULL_SSE_ABI

#if defined(Vc_HAVE_FULL_AVX_ABI)
    ATAN2_FN(f8, __m256);
    ATAN2_FN(4, __m256d);
#endif // Vc_HAVE_FULL_AVX_ABI

#if defined(Vc_HAVE_FULL_AVX512_ABI)
    ATAN2_FN(f16, __m512);
    ATAN2_FN(8, __m512d);
#endif // Vc_HAVE_FULL_AVX512_ABI
#undef ATAN2_FN
} // namespace detail

template <class T, class Abi>
auto _atan2(Vc::simd<T, Abi> const x,
    Vc::simd<T, Abi> const y) noexcept -> Vc::simd<T, Abi>
{
    return Vc::simd<T, Abi>{
        detail::_atan2(Vc::detail::data(x), Vc::detail::data(y))};
}

Question 1

I'm concerned about the use of Vc::detail::data. Seeing as it's in the detail namespace, I'm probably not supposed to use it. But then what's the correct way to get the underlying vector that I can feed to an intrinsic?

Question 2

A related situation: consider an intrinsic that expects an __mmask8 or an __mmask16. How can I convert a Vc::simd_mask to one of these?

Thanks in advance!

mattkretz commented 6 years ago

Q1

Yes, don't use anything from the detail namespace unless you are prepared for breakage when Vc changes. What you want is static_cast<__m128>(simd<float, sse>), i.e. every simd<T, Abi> can be cast to/from implementation-defined types. In the case of the sse, avx, and avx512 ABIs in Vc, you can cast to the corresponding Intel intrinsic types. The fixed_size types can be cast to/from std::array<T, N>.
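To make the cast-based pattern concrete, here is a minimal sketch that does not depend on Vc or on any particular instruction set. `toy_simd` is a hypothetical stand-in that only models the explicit to/from `std::array<T, N>` conversions described above for the fixed_size case; `native_atan2` and `atan2_wrapped` are likewise made-up names, with `native_atan2` playing the role of an external routine such as `__svml_atan2f4`. With the real library, the same shape would presumably apply with `Vc::simd<T, Abi>` and the intrinsic type in place of the stand-ins.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a fixed-size simd type. This is NOT Vc's
// implementation; it only models explicit conversion to/from a native type.
template <class T, std::size_t N>
class toy_simd {
    std::array<T, N> data_{};

public:
    using native_type = std::array<T, N>;

    toy_simd() = default;
    explicit toy_simd(native_type const& v) noexcept : data_(v) {}
    explicit operator native_type() const noexcept { return data_; }

    T operator[](std::size_t i) const noexcept { return data_[i]; }
};

// Stand-in for an external routine that works on the native representation
// (the role played by the __svml_* functions in the question).
template <class T, std::size_t N>
std::array<T, N> native_atan2(std::array<T, N> const& y,
                              std::array<T, N> const& x) noexcept
{
    std::array<T, N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        r[i] = std::atan2(y[i], x[i]);
    }
    return r;
}

// Wrapper that relies only on the public conversions: cast the arguments
// down, call the native routine, and construct the result from its return.
template <class T, std::size_t N>
toy_simd<T, N> atan2_wrapped(toy_simd<T, N> const& y,
                             toy_simd<T, N> const& x) noexcept
{
    using native = typename toy_simd<T, N>::native_type;
    return toy_simd<T, N>{
        native_atan2(static_cast<native>(y), static_cast<native>(x))};
}
```

The wrapper never touches the private member of the vector type, which is the point of the advice: the conversions are the supported boundary, so changes to internals should not break the wrapper.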

Q2

Same as for simd: you can simply static_cast. Also note that Vc has an extension to P0214 in the form of the member function to_bitset and the static member function from_bitset.
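As a rough illustration of the bitset route, the sketch below round-trips an 8-lane mask through `std::bitset<8>` and an 8-bit integer. Everything here is an assumption made for the example: `toy_mask8` is a hypothetical stand-in rather than `Vc::simd_mask`, the exact signatures of Vc's `to_bitset` and `from_bitset` are not shown in this thread, and `raw_mask8` merely assumes that a mask like `__mmask8` behaves as an 8-bit unsigned integer. With the real types, the direct static_cast mentioned above would likely be the simpler path.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an 8-lane mask type. This is NOT Vc's simd_mask;
// it only mimics a to_bitset / from_bitset pair for illustration.
class toy_mask8 {
    std::bitset<8> bits_;

public:
    explicit toy_mask8(std::bitset<8> b) noexcept : bits_(b) {}

    std::bitset<8> to_bitset() const noexcept { return bits_; }
    static toy_mask8 from_bitset(std::bitset<8> b) noexcept
    {
        return toy_mask8{b};
    }

    bool operator[](std::size_t i) const { return bits_[i]; }
};

// Assumed shape of an 8-bit hardware mask: one bit per lane, lane 0 in bit 0.
using raw_mask8 = std::uint8_t;

// Mask -> integer: 8 bits always fit, so to_ulong() cannot overflow here.
inline raw_mask8 to_raw(toy_mask8 const& m) noexcept
{
    return static_cast<raw_mask8>(m.to_bitset().to_ulong());
}

// Integer -> mask: std::bitset<8> keeps the low 8 bits of the value.
inline toy_mask8 from_raw(raw_mask8 r) noexcept
{
    return toy_mask8::from_bitset(std::bitset<8>(r));
}
```

For example, `from_raw(0x05)` yields a mask whose lanes 0 and 2 are set, and `to_raw` of that mask gives `0x05` back.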