WojciechMula / toys

Storage for my snippets, toy programs, etc.
BSD 2-Clause "Simplified" License

Help needed for AVX2/512 popcount #21

Open jianshu93 opened 7 months ago

jianshu93 commented 7 months ago

Dear @WojciechMula,

In part of our software, we use the GNU popcount builtin like this:

// Start of macros and method copied from https://github.com/kimwalisch/libpopcnt

#ifdef __GNUC__

#define GNUC_PREREQ(x, y) \
        (__GNUC__ > x || (__GNUC__ == x && __GNUC_MINOR__ >= y))

#else

#define GNUC_PREREQ(x, y) 0

#endif

#ifndef __has_builtin

#define __has_builtin(x) 0

#endif

/*

// End of macros and method copied from https://github.com/kimwalisch/libpopcnt

... prepare bits code here, which is uint64_t ...

#if GNUC_PREREQ(4, 2) || __has_builtin(__builtin_popcountll)

    samebits += __builtin_popcountll(bits);

#else

    samebits += popcount64(bits);

#endif
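For reference, the dispatch pattern above can be written as a self-contained sketch. The fallback here is the well-known SWAR (parallel bit-summing) popcount; it is an assumption that it behaves like the elided popcount64 helper, and the names popcount64_swar, count_bits and count_bits_array are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __GNUC__
#define GNUC_PREREQ(x, y) \
        (__GNUC__ > x || (__GNUC__ == x && __GNUC_MINOR__ >= y))
#else
#define GNUC_PREREQ(x, y) 0
#endif

#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Portable fallback (hypothetical name): classic SWAR popcount.
 * Sums bits in 2-bit, 4-bit, then 8-bit groups, and finally adds
 * all eight byte counts with one multiplication. */
static uint64_t popcount64_swar(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (x * 0x0101010101010101ULL) >> 56;
}

/* Hypothetical wrapper: use the compiler builtin when available,
 * otherwise fall back to the portable version. */
static uint64_t count_bits(uint64_t bits)
{
#if GNUC_PREREQ(4, 2) || __has_builtin(__builtin_popcountll)
    return (uint64_t)__builtin_popcountll(bits);
#else
    return popcount64_swar(bits);
#endif
}

/* Hypothetical helper: accumulate the popcount of n 64-bit words,
 * mirroring the "samebits += ..." accumulation in the snippet above. */
static uint64_t count_bits_array(const uint64_t *words, size_t n)
{
    assert(words != NULL || n == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += count_bits(words[i]);
    return total;
}
```

With GCC or Clang, compiling with -mpopcnt (or -march=native on a CPU that supports it) typically lets __builtin_popcountll compile down to a single POPCNT instruction, so this scalar path is already fairly fast before any SIMD is involved.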

We want to use an instruction-set-specific popcount, e.g., SSE, AVX2, or AVX512. Can we just use sse_count_byte_popcount() and avx2_count_byte_popcount()? For AVX512 there does not seem to be a corresponding _popcount() function to use. I am not an expert in SIMD at all, but I was surprised at how much SIMD can accelerate popcount.

Many thanks,

Jianshu

bitRAKE commented 6 months ago

Here is a beautiful cheatsheet ... https://www.officedaytime.com/simd512e/ (Search "pop" and click through links for detailed info!)
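To make the question above more concrete, here is a minimal sketch of one common AVX2 approach: a per-nibble table lookup with _mm256_shuffle_epi8, followed by _mm256_sad_epu8 to sum the byte counts. It assumes x86-64 with GCC or Clang (it relies on the target attribute and __builtin_cpu_supports). The function names popcount_scalar, popcount_avx2 and popcount_words are hypothetical and are not functions from this repository.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <immintrin.h>

/* Scalar reference: one builtin popcount per 64-bit word. */
static uint64_t popcount_scalar(const uint64_t *data, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += (uint64_t)__builtin_popcountll(data[i]);
    return total;
}

/* AVX2 sketch: processes four 64-bit words (32 bytes) per iteration. */
__attribute__((target("avx2")))
static uint64_t popcount_avx2(const uint64_t *data, size_t n)
{
    /* Bit counts of the nibble values 0..15, repeated for both 128-bit halves. */
    const __m256i lookup = _mm256_setr_epi8(
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4);
    const __m256i low_mask = _mm256_set1_epi8(0x0f);
    const __m256i zero = _mm256_setzero_si256();
    __m256i acc = zero;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        __m256i v  = _mm256_loadu_si256((const __m256i *)(data + i));
        __m256i lo = _mm256_and_si256(v, low_mask);
        __m256i hi = _mm256_and_si256(_mm256_srli_epi16(v, 4), low_mask);
        /* Per-byte popcount = count(low nibble) + count(high nibble). */
        __m256i cnt = _mm256_add_epi8(_mm256_shuffle_epi8(lookup, lo),
                                      _mm256_shuffle_epi8(lookup, hi));
        /* Sum each group of 8 byte counts into a 64-bit lane and accumulate. */
        acc = _mm256_add_epi64(acc, _mm256_sad_epu8(cnt, zero));
    }

    uint64_t lanes[4];
    _mm256_storeu_si256((__m256i *)lanes, acc);
    uint64_t total = lanes[0] + lanes[1] + lanes[2] + lanes[3];

    /* Leftover words that do not fill a whole 256-bit vector. */
    for (; i < n; i++)
        total += (uint64_t)__builtin_popcountll(data[i]);
    return total;
}

/* Runtime dispatch: use AVX2 only when the CPU reports support for it. */
static uint64_t popcount_words(const uint64_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    if (__builtin_cpu_supports("avx2"))
        return popcount_avx2(data, n);
    return popcount_scalar(data, n);
}
```

For AVX512, CPUs that implement the AVX512 VPOPCNTDQ extension provide the intrinsic _mm512_popcnt_epi64, which counts bits in each 64-bit lane directly, so no lookup table is needed there. On AVX512 hardware without that extension, the same nibble-lookup idea can be widened to 512-bit registers. Whether either variant beats a plain POPCNT loop depends on the input size and the CPU, so it is worth benchmarking on the target machine.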