google/XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
F32-GEMM avx512 fix remainder handling when nc > 16
#7415
Closed
copybara-service[bot] closed this 3 weeks ago
copybara-service[bot] commented 3 weeks ago
F32-GEMM avx512 fix remainder handling when nc > 16
Create a separate mask for each vector and use unconditional masked stores.
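As a rough illustration of the per-vector mask idea described above, the following portable C sketch emulates it with scalar code. It is not the code from this change: `lane_mask`, `masked_store16`, and `store_row` are hypothetical names, and the 16-lane width simply mirrors one AVX-512 register of `float`. In a real AVX-512 kernel the scalar loop would presumably be a `_mm512_mask_storeu_ps` call with a `__mmask16` argument.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 16-bit lane mask for the vector at `vec_index`,
 * given `nc` remaining output columns. Vectors that lie entirely inside
 * the remainder get all ones, vectors entirely past it get zero, and the
 * vector that straddles the boundary gets a partial mask. */
static uint16_t lane_mask(size_t nc, size_t vec_index) {
  const size_t start = vec_index * 16;
  if (nc <= start) {
    return 0;
  }
  const size_t live = nc - start;
  if (live >= 16) {
    return 0xFFFF;
  }
  return (uint16_t) ((UINT32_C(1) << live) - 1);
}

/* Scalar stand-in for a 16-lane masked store: only lanes whose mask bit
 * is set are written, so memory past the remainder is left untouched. */
static void masked_store16(float* dst, const float* src, uint16_t mask) {
  for (size_t i = 0; i < 16; i++) {
    if (mask & (1u << i)) {
      dst[i] = src[i];
    }
  }
}

/* Stores one output row of `num_vecs` vectors. Every vector is stored
 * unconditionally; the mask alone decides which lanes land in memory,
 * so no branch on `nc` is needed in the store path. */
static void store_row(float* c, const float* acc, size_t nc, size_t num_vecs) {
  for (size_t v = 0; v < num_vecs; v++) {
    masked_store16(c + v * 16, acc + v * 16, lane_mask(nc, v));
  }
}
```

With a 32-column tile and `nc = 20`, for example, the first vector gets mask `0xFFFF` and the second gets `0x000F`, so exactly 20 floats are written. Computing one mask per vector lets the same store sequence handle any remainder, including remainders above 16.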