ge-xing / Diff-UNet

Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation. (using diffusion for 3D medical image segmentation)
Apache License 2.0

Cannot execute test.py with cuda. #14

Open ToruHironaka opened 1 year ago

ToruHironaka commented 1 year ago

I always get the error below when I set device = "cuda:0":

File "test.py", line 170, in <module>
    v_mean, v_out = trainer.validation_single_gpu(val_dataset=test_ds)
File "Diff-UNet/BTCV/light_training/trainer.py", line 168, in validation_single_gpu
    val_out = self.validation_step(batch)
File "test.py", line 112, in validation_step
    output = self.window_infer(image, self.model, pred_type="ddim_sample")
File "/opt/monai/monai/inferers/inferer.py", line 521, in __call__
    return sliding_window_inference(
File "/opt/monai/monai/inferers/utils.py", line 256, in sliding_window_inference
    seg_prob_out = predictor(win_data, *args, **kwargs)  # batched patch
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
File "test.py", line 72, in forward
    sample_return += sample.cpu()
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
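The last frame of the traceback fails on an in-place add between a tensor on cuda:0 and a tensor on the CPU. A minimal sketch of a device-agnostic accumulation follows. It is not taken from the repository's test.py, and nobody in this thread has confirmed it as a fix (the posters resolved the error by changing package versions). The function name `accumulate_samples` and the tensor shapes are hypothetical.

```python
import torch


def accumulate_samples(samples, shape, device="cpu"):
    """Sum tensors that may live on different devices into one accumulator.

    Hypothetical helper for illustration only; not part of Diff-UNet.
    """
    # Allocate the accumulator on an explicit device instead of relying on defaults.
    total = torch.zeros(shape, device=device)
    for sample in samples:
        # Move each sample to the accumulator's device before the in-place add,
        # so the two operands never sit on different devices.
        total += sample.to(total.device)
    return total


# Samples live on the GPU when one is available, otherwise on the CPU.
source_device = "cuda:0" if torch.cuda.is_available() else "cpu"
samples = [torch.ones(1, 2, 4, 4, 4, device=source_device) for _ in range(3)]

# Accumulate on the CPU regardless of where the samples were produced.
result = accumulate_samples(samples, (1, 2, 4, 4, 4), device="cpu")
```

The point of the sketch is that the right-hand side is moved to whatever device the accumulator uses, so the addition does not depend on where a given torch build places newly created tensors.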

ToruHironaka commented 1 year ago

This problem was caused by the torch version. My torch version was 2.0.0a0+1767026, which could run train.py but not test.py. I created a Python virtual environment and installed torch 2.0.1+cu117 and numpy 1.19.5. With these packages installed, I can run test.py.

ready2drop commented 11 months ago

@ToruHironaka @920232796 I'm having the same problem, and I still get the same error even after changing the versions as you described. Is there another solution?

My library versions are below.

numpy 1.23.0
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
torch 2.0.1
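Since the error in this thread appears to depend on package versions, it may help to post the torch, monai and numpy versions together. A small sketch using only the standard library (Python 3.8+) is below. The helper name `report_versions` is hypothetical, and the package list is an assumption based on the packages mentioned in this thread.

```python
from importlib import metadata


def report_versions(packages=("torch", "monai", "numpy")):
    """Return a dict mapping each package name to its installed version."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages instead of raising, so the report is complete.
            versions[name] = "not installed"
    return versions


# Print one "name version" pair per line, in the same style as a pip listing.
for name, version in report_versions().items():
    print(f"{name} {version}")
```

Running this inside the same virtual environment or container that runs test.py shows the versions actually in use there, which can differ from the system-wide installation.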

GoEung commented 9 months ago

I solved this problem by upgrading torch and downgrading monai and numpy.