bab2min / EigenRand

Fastest Random Distribution Generator for Eigen
https://bab2min.github.io/eigenrand/
MIT License
92 stars 12 forks source link

Add AVX512 support #49

Closed m-henning-toptal closed 1 year ago

m-henning-toptal commented 1 year ago

This passes the tests here on zen4 with clang and -march=x86-64-v4 and shows some performance improvements over a build without avx512 support (-march=x86-64-v3)

Runs of EigenRand-benchmark: benchmark-baseline.txt benchmark-avx512.txt

bab2min commented 1 year ago

Hello @m-henning-toptal , thank you for your contribution. It seems good to me. I'd like to add an automatic test for AVX512 to your PR, could you turn on the 'Allow edits from maintainers'?

m-henning-toptal commented 1 year ago

Sure! I checked 'Allow edits by maintainers'.

bab2min commented 1 year ago

@m-henning-toptal They cause compile error in some environments including Windows & Linux, can you check this?

GCC 7.5.0 /home/runner/work/EigenRand/EigenRand/./EigenRand/arch/AVX512/PacketFilter.h:56:45: error: cannot convert ‘__vector(8) double’ to ‘__m512 {aka __vector(16) float}’ for argument ‘2’ to ‘__m512 _mm512_permutexvar_ps(__m512i, __m512)’ __m512 rot_rest = _mm512_permutexvar_ps(rotate, _rest);

MSVC D:\a\EigenRand\EigenRand\EigenRand\arch/AVX512/MorePacketMath.h(177,1): error C2664: '__m512 _mm512_castsi512_ps(__m512i)': cannot convert argument 1 from 'const Eigen::internal::Packet16f' to '__m512i' [D:\a\EigenRand\EigenRand\build\test\EigenRand-test.vcxproj]

You can see the detail error messages at the above action logs.

m-henning-toptal commented 1 year ago

-mavx512f actually isn't enough for this as written - it requires -mavx512dq, which I've added to the linux workflow. Eigen itself will probably also complain that it wants -mfma, but I'm not totally sure how to add that to the test configuration.

Anyway, if you add -mfma, I think the gcc tests should pass now - I've fixed stuff locally. I don't have msvc set up here, so not sure if I've fixed all the issues there yet.

m-henning-toptal commented 1 year ago

Looks like all the tests pass on CI at this point.

bab2min commented 1 year ago

@m-henning-toptal Yes, it seems to work well. I'll merge it. Thank you again for your contributing!