CNugteren / CLBlast

Tuned OpenCL BLAS
Apache License 2.0

L2 error when tuning the GEMM kernel in CLBlast #545

Closed HirataYurina closed 4 months ago

HirataYurina commented 4 months ago

First, I tuned the SGEMM kernel with m=1024 n=1024 k=3072 and got a best configuration. But when I tuned the SGEMM kernel with m=2048 n=2048 k=2048, I found that the best configuration from m=1024 n=1024 k=3072 produces an L2 error. So, can I still use the best configuration from m=1024 n=1024 k=3072 as the default parameters in xgemm_32.hpp? I'm afraid this configuration may cause computing errors for other matrix shapes such as m=2048 n=2048 k=2048.

CNugteren commented 4 months ago

In theory yes, but the safest option would be to actually run a correctness test in that case. Or choose one of the other tuned parameter sets: typically a lot of them perform close to the optimum.
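
For anyone wanting a concrete picture of such a correctness test, below is a minimal host-side sketch in C++. It compares a GEMM result against a naive CPU reference using a relative L2 norm. The function names (`ReferenceGemm`, `RelativeL2Error`, `PassesL2Check`), the row-major layout, and the `1e-4` tolerance are all assumptions made for illustration. They are not part of CLBlast, and the thread does not say how the tuner itself computes its L2 error. The buffer read back from the device after running the kernel with the candidate configuration would be passed in as `result`.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Naive row-major reference GEMM: C = A * B,
// with A of size m x k, B of size k x n, and C of size m x n.
std::vector<float> ReferenceGemm(const std::vector<float>& a,
                                 const std::vector<float>& b,
                                 std::size_t m, std::size_t n, std::size_t k) {
  std::vector<float> c(m * n, 0.0f);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t p = 0; p < k; ++p) {
      const float a_ip = a[i * k + p];
      for (std::size_t j = 0; j < n; ++j) {
        c[i * n + j] += a_ip * b[p * n + j];
      }
    }
  }
  return c;
}

// Relative L2 error ||result - reference||_2 / ||reference||_2,
// accumulated in double. Returns infinity when the sizes differ.
double RelativeL2Error(const std::vector<float>& result,
                       const std::vector<float>& reference) {
  if (result.size() != reference.size()) {
    return std::numeric_limits<double>::infinity();
  }
  double diff_sq = 0.0;
  double ref_sq = 0.0;
  for (std::size_t i = 0; i < reference.size(); ++i) {
    const double d = static_cast<double>(result[i]) - reference[i];
    diff_sq += d * d;
    ref_sq += static_cast<double>(reference[i]) * reference[i];
  }
  // Fall back to the absolute error if the reference is all zeros.
  if (ref_sq == 0.0) { return std::sqrt(diff_sq); }
  return std::sqrt(diff_sq / ref_sq);
}

// Hypothetical pass/fail check; the tolerance is an assumed value
// and should be chosen to suit the precision and problem size.
bool PassesL2Check(const std::vector<float>& result,
                   const std::vector<float>& reference,
                   double tolerance = 1e-4) {
  return RelativeL2Error(result, reference) <= tolerance;
}
```

Running this check for each matrix shape of interest (for example both m=1024 n=1024 k=3072 and m=2048 n=2048 k=2048) before changing the defaults would show whether a configuration is only valid for the shape it was tuned on. The triple loop is slow at these sizes, so it is meant as a one-off validation rather than something to run routinely.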

HirataYurina commented 4 months ago

Thanks, bro.