Closed jason7708 closed 1 month ago
Thanks @jason7708. One addition I was planning is to use an AVX-512 masked load when available: it is not affected by the page-boundary crossing as long as the mask itself does not extend past the boundary. It is probably not a common scenario, though, for a CPU to have AVX-512 but not BMI2 :thinking:
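For illustration, a minimal sketch of the masked-load idea, not the code from this PR. The helper names (`load_tail`, `load_tail_avx512`, `load_tail_portable`) are hypothetical, and it assumes GCC or Clang on x86-64 for the `target` attribute and `__builtin_cpu_supports`. Byte lanes that are masked off are never read, so a load that starts near the end of a page cannot fault on the following page.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define HAVE_AVX512_PATH 1
#include <immintrin.h>

// Hypothetical helper: read at most `len` bytes from `src` into a 64-byte block.
// The target attribute enables AVX-512BW for this function only.
__attribute__((target("avx512f,avx512bw")))
static void load_tail_avx512(const uint8_t* src, size_t len, uint8_t* out) {
    // Set the low `len` bits; only those byte lanes are loaded.
    const __mmask64 k = (len >= 64) ? ~0ULL : ((1ULL << len) - 1);
    // Masked-off lanes are zeroed and do not raise faults.
    const __m512i v = _mm512_maskz_loadu_epi8(k, src);
    _mm512_storeu_si512(out, v);
}
#endif

// Portable fallback with the same result: copy `len` bytes, zero the rest.
static void load_tail_portable(const uint8_t* src, size_t len, uint8_t* out) {
    std::memset(out, 0, 64);
    std::memcpy(out, src, len < 64 ? len : 64);
}

// Dispatch at run time so the binary also works on CPUs without AVX-512.
void load_tail(const uint8_t* src, size_t len, uint8_t* out) {
#ifdef HAVE_AVX512_PATH
    if (__builtin_cpu_supports("avx512bw")) {
        load_tail_avx512(src, len, out);
        return;
    }
#endif
    load_tail_portable(src, len, out);
}
```

Both paths produce the same 64-byte block, so callers do not need to know which one ran.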