lebedov / scikit-cuda

Python interface to GPU-powered libraries
http://scikit-cuda.readthedocs.org/

Is the linalg.eig really slower than np.eigh? #242

Closed ifelismino closed 6 years ago

ifelismino commented 6 years ago

Hi!

I'm using the cusolver version of linalg.eig (since I have no access to CULA), and I find that the eigenvalue decomposition is much slower than the CPU one. Is it just me, or is that really the case?

Thanks, Ian

lebedov commented 6 years ago

How big is your matrix?

ifelismino commented 6 years ago

I did a speed test with 100 5000x5000 matrices on a GTX 1070:

```python
import pycuda.gpuarray as gpuarray
import pycuda.autoinit
import numpy as np
import ctypes
from skcuda import linalg
linalg.init()
import time

# Create matrices
Matrices = []
for i in np.arange(0, 100, 1):
    Matrices.append(np.array(np.random.rand(5000, 5000), np.float32, order='F'))

tgi = time.time()
# GPU run
for m in Matrices:
    m_gpu = gpuarray.to_gpu(m)
    vr_gpu, w_gpu = linalg.eig(m_gpu, 'N', 'V', lib='cusolver')
print('GPU time total:')
print(str(time.time() - tgi))

tci = time.time()
# CPU run
for m in Matrices:
    w, v = np.linalg.eigh(m)
print('CPU time total:')
print(str(time.time() - tci))
```

And this is the result:

GPU time total: 4196.450253248215
CPU time total: 3447.799717903137
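One caveat about this benchmark (a sketch, not from the thread): `np.linalg.eigh` implements the symmetric-only algorithm, while an `eig`-style call solves the harder general problem, so the two timings measure different computations. The two routines only produce the same answer when the input is genuinely symmetric, which the random matrices above are not. A CPU-only NumPy check of that claim:

```python
import numpy as np

# Sketch: eig (general algorithm) and eigh (symmetric-only algorithm)
# solve different problems; they agree only on a symmetric input.
rng = np.random.default_rng(0)
a = rng.standard_normal((300, 300)).astype(np.float32)
sym = (a + a.T) / 2  # symmetrize so both routines see the same problem

w_general = np.linalg.eig(sym)[0].real   # general (slow) path
w_symmetric = np.linalg.eigh(sym)[0]     # symmetric (fast) path

# Up to ordering, the spectra match on the symmetric input.
agree = np.allclose(np.sort(w_general), np.sort(w_symmetric), atol=1e-2)
```

A like-for-like timing would therefore feed symmetric matrices to both sides, or compare general solvers on both sides.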

lebedov commented 6 years ago

As noted in the docstring, the cusolver backend can only handle symmetric matrices, so the answer you are computing here is presumably wrong. I suspect the slowness stems from the input not being symmetric. I'm also not aware of any free implementation of the general (non-symmetric) eigenvalue decomposition that runs entirely on the GPU; MAGMA's implementation uses a hybrid CPU/GPU approach.
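To illustrate this point with NumPy as a CPU stand-in (the cusolver path behaves analogously): a symmetric-only routine never checks symmetry, it just reads one triangle of the input, so on a non-symmetric matrix it quietly decomposes a different matrix, while the general routine reports the true (generally complex) spectrum.

```python
import numpy as np

# NumPy stand-in for a symmetric-only solver: eigh reads only the lower
# triangle of its input (UPLO='L' by default), so for a non-symmetric m
# it decomposes a different matrix rather than raising an error.
rng = np.random.default_rng(1)
m = rng.standard_normal((200, 200))

w_true = np.linalg.eig(m)[0]   # true spectrum: complex for a random m
w_sym = np.linalg.eigh(m)[0]   # real, but belongs to a different matrix

has_complex = np.abs(w_true.imag).max() > 1.0
# The moduli of the true eigenvalues do not match the symmetric
# routine's output: the "fast" answer is simply wrong here.
wrong_answer = not np.allclose(np.sort(np.abs(w_true)),
                               np.sort(np.abs(w_sym)))
```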