Closed kelindar closed 3 years ago
This PR implements an unrolled `POPCNT` loop to speed things up by around 30%. Note that `bits.OnesCount64()` already uses the `POPCNT` instruction if present, but unrolling that doesn't work very well (I tried).
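For readers unfamiliar with the technique, here is a minimal pure-Go sketch of what unrolling a population-count loop looks like. It is not the code from this PR: the function names `countSimple` and `countUnrolled4` are hypothetical, the unroll factor of 4 is an arbitrary choice, and it goes through the standard library's `bits.OnesCount64` from `math/bits` rather than issuing `POPCNT` directly. As noted above, unrolling at this level did not help much in practice, so treat this only as an illustration of the loop shape.

```go
package main

import "math/bits"

// countSimple is the baseline: one OnesCount64 call per word.
// (Hypothetical helper, for illustration only.)
func countSimple(words []uint64) int {
	n := 0
	for _, w := range words {
		n += bits.OnesCount64(w)
	}
	return n
}

// countUnrolled4 handles four words per iteration, then finishes
// any remaining words one at a time.
// (Hypothetical helper; the unroll factor 4 is an assumption.)
func countUnrolled4(words []uint64) int {
	n := 0
	i := 0
	// Main unrolled body: four independent popcounts per iteration.
	for ; i+4 <= len(words); i += 4 {
		n += bits.OnesCount64(words[i]) +
			bits.OnesCount64(words[i+1]) +
			bits.OnesCount64(words[i+2]) +
			bits.OnesCount64(words[i+3])
	}
	// Tail: fewer than four words left.
	for ; i < len(words); i++ {
		n += bits.OnesCount64(words[i])
	}
	return n
}
```

Both functions should return the same count for any input. Whether the unrolled form is actually faster depends on the compiler and CPU, so it is worth benchmarking with `testing.B` before relying on it.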