JeanKossaifi / tensorly-notebooks

Tensor methods in Python with TensorLy

Tensor Decomposition Using GPU #14

Open VoliCrank opened 3 years ago

VoliCrank commented 3 years ago

This might be a stupid question, but I couldn't find a solution anywhere.

When I use the GPU to run non-negative decompositions on a random tensor, it is much slower than the CPU, for various sizes. For reference, a single decomposition takes 0.4 seconds on the CPU but more than 10 seconds on the GPU (size 3 x 2 x 2, but the same holds for 100 x 100 x 1000). I have PyTorch, CUDA 11.1, and cuDNN installed, and my GPU is an RTX 3070, so shouldn't it theoretically beat my CPU?

JeanKossaifi commented 2 years ago

Did you check that the computation is actually running on the GPU? Performance also depends on the size (and rank) of your tensors. It is possible that the non-negative (NN) version is not optimized enough; feel free to open a PR on the main tensorly repo if you identify a bottleneck. You can also compare with the regular decomposition and see whether that is much faster.