Open GongYiLiao opened 7 years ago
@GongYiLiao can you show the output of af.info()?
In [91]: af.info()
ArrayFire v3.3.2 (OpenCL, 64-bit Linux, build default)
[0] AMD : Tahiti, 3035 MB -- OpenCL 1.2 AMD-APP (1912.5) -- Device driver 1912.5 (VM) -- FP64 Support: True -- Unified Memory (False)
-1- AMD : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz, 31865 MB -- OpenCL 1.2 AMD-APP (1912.5) -- Device driver 1912.5 (sse2,avx) -- FP64 Support: True -- Unified Memory (True)
Hello,
I can confirm that svd is abnormally slow under python-arrayfire compared to scipy. I benchmarked it using the attached file test_svd_af_gul.py, which is similar to bench_fft.py. The results are in arrayfire-test_svd_gul.txt and show a 10x slowdown with the cuda backend compared to the cpu backend or scipy. Moreover, the opencl backend is not working for svd, even though I can run bench_blas with it, for instance. What is strange is that the GPU is almost unused with the cuda backend.
I chose scipy rather than numpy because numpy has a bug with svd in single precision (https://github.com/numpy/numpy/issues/9516).
Thank you, GuL916
I found AF's SVD implementation to be quite slow compared to DGEMM on a Radeon HD 7950 with the FGLRX driver on Debian Jessie:
AF's SVD takes more than 9 times as long as NumPy's SVD on the same matrix. In DGEMM, however, AF is faster than NumPy, though not by much:
I am wondering if there is anything I should tune or adjust before proceeding.
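For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is not the script used in this thread. The helper names (`best_time`, `slowdown`, `compare_svd_and_gemm`) and the matrix size are assumptions made for illustration. The one thing it tries to get right is calling `af.eval` and `af.sync` before stopping the timer, since ArrayFire may queue work asynchronously and an unsynchronized timing can be misleading.

```python
import time


def best_time(fn, repeats=5, sync=None):
    """Return the best wall-clock time in seconds of fn() over `repeats` runs.

    `sync` is an optional callable (e.g. af.sync) invoked after each run so
    that queued device work has finished before the timer stops.
    """
    fn()  # warm-up run: excludes JIT/kernel compilation from the timings
    if sync is not None:
        sync()
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        best = min(best, time.perf_counter() - start)
    return best


def slowdown(t_candidate, t_reference):
    """Ratio of candidate time to reference time (values > 1 mean slower)."""
    return t_candidate / t_reference


def compare_svd_and_gemm(n=1000):
    """Hypothetical comparison; requires numpy and arrayfire to be installed."""
    import numpy as np
    import arrayfire as af

    a_np = np.random.rand(n, n)                # float64 on the host
    a_af = af.randu(n, n, dtype=af.Dtype.f64)  # float64 on the device

    def af_svd():
        u, s, vt = af.svd(a_af)
        af.eval(s)  # force evaluation of the result

    def af_gemm():
        af.eval(af.matmul(a_af, a_af))

    t_np_svd = best_time(lambda: np.linalg.svd(a_np))
    t_af_svd = best_time(af_svd, sync=af.sync)
    t_np_gemm = best_time(lambda: a_np.dot(a_np))
    t_af_gemm = best_time(af_gemm, sync=af.sync)

    print("SVD  AF/NumPy time ratio:", slowdown(t_af_svd, t_np_svd))
    print("GEMM AF/NumPy time ratio:", slowdown(t_af_gemm, t_np_gemm))
```

Calling `compare_svd_and_gemm(1000)` prints the two ratios. To try another backend, `af.set_backend('cuda')` (or `'opencl'`, `'cpu'`) can be called before creating any arrays.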