tairenpiao closed this issue 6 months ago
If the matrix dimension N is set smaller than 512 (e.g. 256 or 128) when running the benchmark, the result of xnor_gemm is incorrect.
Yeah, it was really just a proof-of-concept CUDA kernel. It's not really optimized or useful for production.