Closed — lwelzel closed this issue 12 months ago
Hi @lwelzel, thanks for your interest in VIP and for raising this issue — together with its solution. I have now included your suggested fix for the PyTorch-related SVD calculation options in the master branch.
I also like your suggestion of offering a GPU-based option for other image operations, as the speedup can indeed be significant. Unfortunately, I don't have handy access to GPUs at the moment, nor much time in the next few weeks to look into this. If this is something you need for your projects, you're very welcome to contribute. I'm sure it would also be useful for other members of the VIP community.
If you're interested and have the time, the way to proceed would be to add a new `imlib` option and an associated `if` block in the routines you mention, which could then contain a wrapper for an operation done e.g. with kornia. We can discuss more details in a quick telecon if needed, or we can iterate directly on any potential pull request. I'm closing this issue, but feel free to open a new thread in GitHub Discussions (Ideas section) to discuss this further if you wish.
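As a rough illustration of the dispatch pattern suggested above, here is a minimal sketch of what a new `imlib` branch could look like. Note that the `"kornia"` option, the `frame_rotate` wrapper, and the CPU stand-in below are all hypothetical — they are not VIP's actual code or API.

```python
import numpy as np


def frame_rotate(array, angle, imlib="vip-fft"):
    """Rotate a 2D frame, dispatching to a backend via `imlib`.

    Sketch only: the "kornia" option is a hypothetical new imlib
    value, not something VIP currently offers.
    """
    if imlib == "kornia":
        # Hypothetical GPU path: move the frame to a CUDA tensor,
        # rotate with kornia (bilinear by default), then copy the
        # result back to host memory with .cpu().numpy().
        import torch
        import kornia

        t = torch.as_tensor(array, dtype=torch.float32,
                            device="cuda")[None, None]  # (1, 1, H, W)
        a = torch.tensor([float(angle)], device="cuda")
        out = kornia.geometry.transform.rotate(t, a)
        return out[0, 0].cpu().numpy()
    # VIP's existing CPU backends would be handled here; as a trivial
    # stand-in, support exact multiples of 90 degrees with NumPy.
    if angle % 90 == 0:
        return np.rot90(array, k=int(angle // 90)).copy()
    raise NotImplementedError("stand-in only handles multiples of 90 deg")
```

The key point is that the GPU branch converts back to a NumPy array on the host before returning, so downstream code that expects `np.ndarray` is unaffected.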
Using `psfsub.pca` with the `svd_mode="pytorch"` option while forcing all PyTorch computation onto the GPU with `device = torch.device('cuda')` raises an error, since the tensors are still on the GPU when `psfsub.svd` tries to convert them to `np.array`s here. This issue is also relevant for the other PyTorch options, `'eigenpytorch'` and `'randpytorch'`.

### Current Behavior
Using `svd_mode="pytorch"` raises this error:

```
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
```
### Possible Solution

This can be fixed by copying the tensor back to host memory with `Tensor.cpu()` here, and similarly for the other PyTorch methods.
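To illustrate the fix, here is a minimal sketch of a PyTorch-backed truncated SVD that works whether the computation runs on CPU or GPU. This is not VIP's actual `psfsub.svd` code — the function name and signature are made up for illustration — but it shows the `.cpu()` call the fix boils down to.

```python
import numpy as np
import torch


def svd_pytorch(matrix, ncomp, device=None):
    """Truncated SVD via PyTorch; returns the top `ncomp` right
    singular vectors as a NumPy array (illustrative sketch)."""
    device = device or torch.device("cpu")
    a = torch.as_tensor(matrix, dtype=torch.float32, device=device)
    u, s, vh = torch.linalg.svd(a, full_matrices=False)
    # The fix: copy back to host memory before .numpy().
    # .cpu() is a no-op for tensors already on the CPU, so this is
    # safe for both CPU and CUDA devices.
    return vh[:ncomp].cpu().numpy()
```

Without the `.cpu()` call, `vh.numpy()` on a CUDA tensor raises exactly the `TypeError` quoted above.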
### Steps to Reproduce

### Notes
First off, thanks for the great package and for the option to do stuff on the GPU. Secondly, are you considering implementing the option to do other operations, like rescaling or rotating, on the GPU as well? The PyTorch ecosystem offers some relatively straightforward options for this, e.g. kornia. This seems to be the major time bottleneck for e.g. PCA at the moment. For the mock data in the reproduction above, I get a speedup of ~20x with a naive implementation (bilinear interpolation).