Although we have tested clBLAS and seen it pass on a wide range of platforms, including mainstream CPUs and GPUs on Windows and Linux, users keep reporting failures on particular processors and environments. Many of these are beyond our resources, since the clBLAS team is a very small group, so we always encourage users to debug their failures themselves. Besides the gtest framework already integrated in clBLAS, which hard-codes various configurations (matrix size, transpose option, etc.), we also provide the "client" tool for testing the performance of the BLAS routines.
Users can specify any configuration on the command line.
What was missing was a way to verify the correctness of the results through the client.
Starting with this change, Linux users who are unsure of a clBLAS gemm or trmm result can validate it at any size and with any transpose option.
Because there is currently no easy way to build and link Netlib CBLAS on Windows, validation is disabled on Windows.
Changes:
(1) Update the readme: Netlib BLAS is now preferred over ACML as the CPU reference BLAS. Instructions for getting Netlib on Ubuntu/Windows are given.
(2) Users can now validate the correctness of gemm and trmm through the client by adding the option "-v 1" (v stands for verify/validate; "-v 0" or omitting the option disables validation). This requires an installation built with Netlib, as discussed in (1).
(3) Change the default order in the client to column-major. Previously, users complained about low gemm NN performance because, with row-major order, they were actually testing gemm TT.
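To illustrate why the storage order in (3) matters, here is a minimal plain-Python sketch (no clBLAS calls; all helper names are hypothetical and exist only for this illustration). It shows that a buffer written in row-major order holds the transposed matrix when read as column-major, so the order changes which operands are effectively transposed. It does not claim to reproduce how the client maps orders internally.

```python
# Illustration only: plain Python lists, no clBLAS API involved.

def read_matrix(buf, rows, cols, order):
    """Interpret flat buffer `buf` as a rows x cols matrix in the given order."""
    if order == "row":
        # Row-major: element (i, j) lives at i * cols + j.
        return [[buf[i * cols + j] for j in range(cols)] for i in range(rows)]
    # Column-major: element (i, j) lives at j * rows + i.
    return [[buf[j * rows + i] for j in range(cols)] for i in range(rows)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# A is 2x3 and B is 3x2, both written to memory in row-major order.
a_buf = [1, 2, 3, 4, 5, 6]
b_buf = [7, 8, 9, 10, 11, 12]
a = read_matrix(a_buf, 2, 3, "row")
b = read_matrix(b_buf, 3, 2, "row")

# The very same buffers, read as column-major, hold A^T and B^T.
a_as_col = read_matrix(a_buf, 3, 2, "col")
b_as_col = read_matrix(b_buf, 2, 3, "col")
assert a_as_col == transpose(a)
assert b_as_col == transpose(b)

# Consequently (A * B)^T == B^T * A^T, computed from the unchanged buffers.
assert transpose(matmul(a, b)) == matmul(b_as_col, a_as_col)
```

The data never moves; only the interpretation changes, which is why a benchmark run in one order can exercise a different kernel variant than the user intended.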
Validation for more BLAS routines will follow.
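Conceptually, validating a routine means comparing the device result against a CPU reference. The sketch below is a rough illustration of that idea in plain Python, not the client's actual code: the function names, the naive reference gemm (standing in for Netlib CBLAS), and the tolerance of 1e-5 are all assumptions made for this example.

```python
# Hypothetical sketch of a gemm correctness check; not the clBLAS client code.

def reference_gemm(m, n, k, alpha, a, b, beta, c):
    """Naive column-major C = alpha * A * B + beta * C, no transposes.

    a is m x k, b is k x n, c is m x n, all flat column-major lists.
    Returns a new list and leaves c untouched.
    """
    out = list(c)
    for j in range(n):
        for i in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[p * m + i] * b[j * k + p]
            out[j * m + i] = alpha * acc + beta * c[j * m + i]
    return out

def max_relative_error(result, reference):
    """Largest elementwise error, scaled by max(|reference|, 1)."""
    worst = 0.0
    for r, ref in zip(result, reference):
        worst = max(worst, abs(r - ref) / max(abs(ref), 1.0))
    return worst

def validate(result, reference, tol=1e-5):
    """True if both buffers have the same length and agree within tol (assumed value)."""
    return len(result) == len(reference) and max_relative_error(result, reference) <= tol

# Usage: A = [[1, 2], [3, 4]] in column-major order, B = identity.
ref = reference_gemm(2, 2, 2, 1.0, [1, 3, 2, 4], [1, 0, 0, 1], 0.0, [0, 0, 0, 0])
device_result = list(ref)  # stand-in for the buffer read back from the device
print("PASS" if validate(device_result, ref) else "FAIL")
```

A scaled tolerance is used rather than exact equality because a GPU kernel and a CPU reference generally accumulate floating-point sums in different orders.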