Open mratsim opened 5 years ago
Minimal example for reproduction
import ../src/arraymancer
let n1 = [[2, 4, 3, 1, 3, 1, 3, 1],
[1, 2, 1, 1, 2, 0, 4, 3],
[2, 0, 0, 3, 0, 4, 4, 1],
[1, 1, 4, 0, 3, 1, 3, 0],
[3, 4, 1, 1, 4, 2, 3, 4],
[2, 4, 0, 2, 3, 3, 3, 4],
[3, 0, 0, 3, 1, 4, 3, 1],
[4, 3, 2, 4, 1, 0, 0, 0]].toTensor()
let n2 = [[2, 2, 0, 4, 0, 0, 4, 2],
[2, 0, 0, 1, 1, 1, 3, 1],
[0, 2, 2, 0, 2, 2, 3, 3],
[0, 0, 1, 0, 4, 2, 4, 1],
[0, 0, 1, 3, 4, 2, 4, 2],
[4, 3, 4, 1, 4, 4, 0, 3],
[3, 3, 0, 2, 1, 2, 3, 3],
[2, 1, 2, 1, 2, 4, 4, 1]].toTensor()
let n1n2 = [[27,23,16,29,35,32,58,37],
[24,19,11,23,26,30,49,27],
[34,29,21,21,34,34,36,32],
[17,22,15,21,28,25,40,33],
[39,27,23,40,45,46,72,41],
[41,26,25,34,47,48,65,38],
[33,28,22,26,37,34,41,33],
[14,12, 9,22,27,17,51,23]].toTensor()
let
  fn1 = n1.astype(float)
  fn2 = n2.astype(float)
echo fn1
echo fn2
doAssert fn1 * fn2 == n1n2.astype(float)
Compile with
nim c -r -d:blas=cblas build/f64_gemm.nim
Using MKL instead works and also resolves all the reported failures. The command line is a bit tricky:
nim c -r -d:blas=mkl_intel_lp64 --clibdir:"/opt/intel/mkl/lib/intel64" --dynlibOverride:"mkl_intel_lp64" --passl:"/opt/intel/mkl/lib/intel64/libmkl_intel_lp64.a -lmkl_core -lmkl_gnu_thread -lgomp" build/f64_gemm.nim
And for the tests
nim c -r -d:blas=mkl_intel_lp64 --clibdir:"/opt/intel/mkl/lib/intel64" --dynlibOverride:"mkl_intel_lp64" --passl:"/opt/intel/mkl/lib/intel64/libmkl_intel_lp64.a -lmkl_core -lmkl_gnu_thread -lgomp" tests/nn_primitives/test_nnp_convolution.nim
nim c -r -d:blas=mkl_intel_lp64 --clibdir:"/opt/intel/mkl/lib/intel64" --dynlibOverride:"mkl_intel_lp64" --passl:"/opt/intel/mkl/lib/intel64/libmkl_intel_lp64.a -lmkl_core -lmkl_gnu_thread -lgomp" tests/autograd/test_gate_shapeshifting.nim
The shapeshifting test seems to be unrelated but it actually requires a matrix multiplication for verification: https://github.com/mratsim/Arraymancer/blob/bde79d2f73b71ece719526a7b39f03bb100784b0/tests/autograd/test_gate_shapeshifting.nim#L145-L162
float32 works as well
Further investigation shows that:
So the Arch packaging caused the issue: it separates blas and cblas, and it uses OpenBLAS with the Netlib CBLAS header instead of the OpenBLAS CBLAS header.
NumPy is also impacted if built with OpenBLAS:
import numpy as np
n1 = np.array(
[[2, 4, 3, 1, 3, 1, 3, 1],
[1, 2, 1, 1, 2, 0, 4, 3],
[2, 0, 0, 3, 0, 4, 4, 1],
[1, 1, 4, 0, 3, 1, 3, 0],
[3, 4, 1, 1, 4, 2, 3, 4],
[2, 4, 0, 2, 3, 3, 3, 4],
[3, 0, 0, 3, 1, 4, 3, 1],
[4, 3, 2, 4, 1, 0, 0, 0]],
dtype=np.float64)
n2 = np.array(
[[2, 2, 0, 4, 0, 0, 4, 2],
[2, 0, 0, 1, 1, 1, 3, 1],
[0, 2, 2, 0, 2, 2, 3, 3],
[0, 0, 1, 0, 4, 2, 4, 1],
[0, 0, 1, 3, 4, 2, 4, 2],
[4, 3, 4, 1, 4, 4, 0, 3],
[3, 3, 0, 2, 1, 2, 3, 3],
[2, 1, 2, 1, 2, 4, 4, 1]],
dtype=np.float64)
n1n2 = np.array(
[[27,23,16,29,35,32,58,37],
[24,19,11,23,26,30,49,27],
[34,29,21,21,34,34,36,32],
[17,22,15,21,28,25,40,33],
[39,27,23,40,45,46,72,41],
[41,26,25,34,47,48,65,38],
[33,28,22,26,37,34,41,33],
[14,12, 9,22,27,17,51,23]],
dtype=np.float64)
print(n1)
print(n2)
print(n1 @ n2)
np.testing.assert_array_equal(n1 @ n2, n1n2)
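To rule out a typo in the expected matrix, the product can also be recomputed without any BLAS involved. The following is a minimal sketch in pure Python with exact integer arithmetic; the helper name `matmul` is only for illustration and is not part of Arraymancer or NumPy.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (exact integer arithmetic, no BLAS)."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

n1 = [[2, 4, 3, 1, 3, 1, 3, 1],
      [1, 2, 1, 1, 2, 0, 4, 3],
      [2, 0, 0, 3, 0, 4, 4, 1],
      [1, 1, 4, 0, 3, 1, 3, 0],
      [3, 4, 1, 1, 4, 2, 3, 4],
      [2, 4, 0, 2, 3, 3, 3, 4],
      [3, 0, 0, 3, 1, 4, 3, 1],
      [4, 3, 2, 4, 1, 0, 0, 0]]

n2 = [[2, 2, 0, 4, 0, 0, 4, 2],
      [2, 0, 0, 1, 1, 1, 3, 1],
      [0, 2, 2, 0, 2, 2, 3, 3],
      [0, 0, 1, 0, 4, 2, 4, 1],
      [0, 0, 1, 3, 4, 2, 4, 2],
      [4, 3, 4, 1, 4, 4, 0, 3],
      [3, 3, 0, 2, 1, 2, 3, 3],
      [2, 1, 2, 1, 2, 4, 4, 1]]

n1n2 = [[27, 23, 16, 29, 35, 32, 58, 37],
        [24, 19, 11, 23, 26, 30, 49, 27],
        [34, 29, 21, 21, 34, 34, 36, 32],
        [17, 22, 15, 21, 28, 25, 40, 33],
        [39, 27, 23, 40, 45, 46, 72, 41],
        [41, 26, 25, 34, 47, 48, 65, 38],
        [33, 28, 22, 26, 37, 34, 41, 33],
        [14, 12,  9, 22, 27, 17, 51, 23]]

# Recompute the product independently of any BLAS backend.
reference = matmul(n1, n2)
assert reference == n1n2
print("reference product matches the expected matrix")
```

If this passes while the float64 check above fails, the expected values are fine and the discrepancy comes from the BLAS backend. `np.show_config()` prints which BLAS NumPy was built against.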
Edited: OpenMP -> Archlinux. I never ran the full test suite on Arch because it is impossible to do "nimble test -d:blas=cblas" there: contrary to Debian/Ubuntu and Travis, the BLAS symbols are in libcblas.so and not libblas.so.
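A quick way to see which shared library exports the CBLAS entry points on a given system is to probe it with `ctypes`. This is only a sketch: the helper name `exports_symbol` is hypothetical, and the result depends entirely on how the distribution packages its BLAS libraries.

```python
import ctypes
import ctypes.util

def exports_symbol(lib_name, symbol):
    """Return True/False if the library was loaded, or None if it could not be found or loaded."""
    path = ctypes.util.find_library(lib_name)
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    # Attribute lookup on a CDLL resolves the symbol and raises AttributeError if it is missing.
    return hasattr(lib, symbol)

for name in ("blas", "cblas", "openblas"):
    print(name, exports_symbol(name, "cblas_dgemm"))
```

On a system packaged as described above, one would expect `cblas_dgemm` to be reported for "cblas" but not for "blas"; the output is not shown here because it varies by machine.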
A couple of tests are failing