Not using GPU - Githubissues

soichih commented 6 years ago

I was able to run TractSeg successfully, but it didn't use the GPU available on our machine.

Here is the output

Creating brain mask...
Creating peaks (1 of 3)...
dwi2response: [ERROR] Output file './tractseg_output/response.txt' already exists (use -force to override)
Creating peaks (2 of 3)...
dwi2fod: [ERROR] output file "./tractseg_output/WM_FODs.mif" already exists (use -force option to force overwrite)
dwi2fod: [ERROR] error creating image "./tractseg_output/WM_FODs.mif"
Creating peaks (3 of 3)...
sh2peaks: [ERROR] output file "./tractseg_output/peaks.nii.gz" already exists (use -force option to force overwrite)
sh2peaks: [ERROR] error creating image "./tractseg_output/peaks.nii.gz"
Loading weights from: /.tractseg/pretrained_weights_tract_segmentation_v1.npz
Processing direction (1 of 3)
100%|########################################################################################| 144/144 [03:35<00:00,  1.49s/it]
Processing direction (2 of 3)
100%|########################################################################################| 144/144 [03:32<00:00,  1.47s/it]
Processing direction (3 of 3)
100%|########################################################################################| 144/144 [03:32<00:00,  1.47s/it]

real    11m40.197s
user    13m58.256s
sys     0m29.791s

I am running it through Singularity with the nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04 container, but other apps that we run in a similar manner can use our GPU just fine. I believe something needs to be reconfigured for TractSeg to detect/use the GPU. How can I troubleshoot this?

wasserth commented 6 years ago

TractSeg is using the following code to determine if a GPU is available:

import torch
torch.device("cuda" if torch.cuda.is_available() else "cpu")

You can run this code in your environment (using PyTorch >= 0.4); it should return "device(type='cuda')". If it returns "device(type='cpu')", torch cannot find the CUDA installation. Outside of Docker, GPU detection has always worked for me. I have not yet tried to run TractSeg on a GPU from inside a Docker container, so I have no experience with that setup, but I am working on making the official Docker image CUDA compatible.
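For a slightly more detailed check than the two lines above, here is a minimal sketch that collects a few extra facts which may help narrow down why the device falls back to "cpu". The helper name `describe_torch_device` is made up for this example and is not part of TractSeg; the sketch only assumes that PyTorch may or may not be importable in the environment being tested.

```python
def describe_torch_device():
    """Return a dict of facts about PyTorch's view of the GPU.

    Hypothetical troubleshooting helper; not part of TractSeg.
    """
    info = {"torch_installed": False, "cuda_available": False, "device": "cpu"}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return info

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    # CUDA version PyTorch was built against; None for CPU-only builds.
    info["built_with_cuda"] = torch.version.cuda
    # Same check TractSeg is described as using above.
    info["cuda_available"] = torch.cuda.is_available()
    info["device"] = "cuda" if info["cuda_available"] else "cpu"
    if info["cuda_available"]:
        info["gpu_count"] = torch.cuda.device_count()
        info["gpu_name"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_torch_device().items():
        print(key, "=", value)
```

If `built_with_cuda` is None, the installed PyTorch is a CPU-only build. If it shows a CUDA version but `cuda_available` is False, the problem is more likely on the driver or library side of the environment.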

soichih commented 6 years ago

Thanks, I see. Yes, I am seeing "cpu" when I run it via Singularity.

I've tried setting LD_LIBRARY_PATH to point at the CUDA libraries / NVIDIA drivers, but I ran into the issue described here: https://github.com/pytorch/pytorch/issues/4101

I need to play with it more to see how I can set the right paths inside the container. I know it can be done, since our other container works fine. I think something is wrong with how PyTorch tries to find those libraries.
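One way to see which libraries the dynamic loader can actually find inside the container is to try loading them directly. The sketch below does that with Python's standard `ctypes` module. The library names in `CANDIDATE_LIBS` are an assumption based on a CUDA 9.0 / cuDNN 7 setup and are not taken from TractSeg or PyTorch documentation; `check_shared_libs` is a hypothetical helper name.

```python
import ctypes
import os

# Assumed sonames for a CUDA 9.0 / cuDNN 7 setup; adjust to your install.
CANDIDATE_LIBS = ["libcuda.so.1", "libcudart.so.9.0", "libcudnn.so.7"]


def check_shared_libs(names=CANDIDATE_LIBS):
    """Try to load each shared library by name.

    Returns a dict mapping each name to None on success,
    or to the loader's error message on failure.
    """
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
    for name, error in check_shared_libs().items():
        print(name, "->", "loaded" if error is None else "FAILED: " + error)
```

Running this both in a container where the GPU works and in the one where it does not should show which libraries are missing from the loader's search path.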

soichih commented 6 years ago

I was missing a few NVIDIA/CUDA-related libraries inside my container. It's working now.

MIC-DKFZ / TractSeg

Not using GPU #4