MShinkle opened this issue 3 weeks ago
With any CUDA-based code, this is usually accomplished by setting the CUDA_VISIBLE_DEVICES environment variable before running the Python script. See https://stackoverflow.com/questions/39649102/how-do-i-select-which-gpu-to-run-a-job-on
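A minimal sketch of that approach (no GPU required to run this snippet); the key constraint is that the variable must be set before CUDA is first initialized, i.e. at the very top of the script:

```python
import os

# Restrict the process to physical GPU 3. This must happen before the
# first CUDA initialization, i.e. before importing torch / himalaya.
os.environ["CUDA_VISIBLE_DEVICES"] = "3"

# Inside the process, the visible GPUs are renumbered from zero,
# so physical GPU 3 now appears as cuda:0.
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Equivalently, from the shell: `CUDA_VISIBLE_DEVICES=3 python script.py`.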
— Matteo Visconti di Oleggio Castello, Ph.D., Postdoctoral Scholar, Helen Wills Neuroscience Institute, UC Berkeley (MatteoVisconti.com | github.com/mvdoc | linkedin.com/in/matteovisconti)
PyTorch generally recommends against setting the CUDA device via this method, in favor of the device('cuda:3') or device('cuda', index=3) syntax. Though I think the benefit of the latter (better support for using multiple CUDA devices within the same process) is unlikely to matter for 99% of use cases, so it may not be worth the time to incorporate.
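For reference, the two device-selection forms mentioned here are equivalent, and constructing a torch.device does not itself require a GPU, so this can be checked on any machine with PyTorch installed:

```python
import torch

# Both forms name the same logical device; neither allocates GPU memory.
d1 = torch.device('cuda:3')
d2 = torch.device('cuda', index=3)
print(d1 == d2, str(d1))
```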
Thanks!
I think that adding this option to himalaya would complicate the code unnecessarily. But what happens if you manually push the features and data to the GPU that you want to use, and then pass those tensors to the himalaya solvers? I wonder whether this could be a workaround to avoid using the environment variable.
That's an interesting idea; my expectation is that the backend would still convert the tensors to whatever cuda:0 is in the current environment, but I'll give it a test.
Looks like backend.asarray moves tensors to cuda:0 regardless of the original device of the tensor. For example:
import torch
from himalaya.ridge import Ridge
from himalaya.backend import set_backend, get_backend
set_backend('torch_cuda')
backend = get_backend()
print(backend.asarray(torch.zeros(10, device='cuda:1')).device)
This prints cuda:0.
Currently, specifying 'torch_cuda' as the backend appears to select the first CUDA device visible to PyTorch (cuda:0). However, on multi-GPU systems, it would be useful to specify a particular CUDA device through something like:
set_backend("torch_cuda:3")
which would tell Himalaya to use CUDA device 3. set_backend("torch_cuda") would still function equivalently to how it currently does.
Is there any interest in, or are there plans for, this feature? From glancing through the Himalaya PyTorch backend I don't think implementing it would be too involved, but I could be mistaken.
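A rough sketch of how such a backend string could be parsed, assuming a "name:index" convention; the helper name is hypothetical, not part of himalaya's API:

```python
def parse_backend_name(name):
    """Hypothetical helper: split 'torch_cuda:3' into ('torch_cuda', 3).

    Without a suffix, the device index is None, and the backend would
    keep its current default (cuda:0).
    """
    base, sep, index = name.partition(":")
    return base, (int(index) if sep else None)

print(parse_backend_name("torch_cuda:3"))  # ('torch_cuda', 3)
print(parse_backend_name("torch_cuda"))    # ('torch_cuda', None)
```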