NagabhushanSN95 / ViP-NeRF

Official code release accompanying the paper - "ViP-NeRF: Visibility Prior for Sparse Input Neural Radiance Fields"
MIT License

Testing using multiple GPUs doesn't work #6

Open HarshaMupparaju opened 1 year ago

HarshaMupparaju commented 1 year ago

The code is configured to use 2 GPUs for testing. When I ran it as is, I got this error:

(ViP_NeRF_GPU) [kapilc@eceaiws src]$ python NerfLlffTrainerTester01.py 
Program started at 01/09/2023 12:14:41 PM
Loading visibility prior mask: ../data/databases/NeRF_LLFF/data/all/visibility_prior/VW02/fern/visibility_masks/0006_0013.png
Loading visibility prior mask: ../data/databases/NeRF_LLFF/data/all/visibility_prior/VW02/fern/visibility_masks/0013_0006.png
Training 11/fern begins...
Resuming Training from iteration 50001
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50000/50000 [00:00<?, ?it/s]
Loaded Model in train0011/fern/Model_Iter050000 trained for 50000 iterations
fern:   0%|                                                                                                                                  | 0/5 [00:00<?, ?it/s]
module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu
Traceback (most recent call last):
  File "NerfLlffTrainerTester01.py", line 993, in <module>
    main()
  File "NerfLlffTrainerTester01.py", line 980, in main
    demo1a()
  File "NerfLlffTrainerTester01.py", line 348, in demo1a
    start_testing(test_configs)
  File "NerfLlffTrainerTester01.py", line 101, in start_testing
    Tester.start_testing(test_configs, scenes_data, save_depth=True, save_depth_var=True, save_visibility=True)
  File "/home/kapilc/HARSHA/ViP-NeRF/src/Tester01.py", line 210, in start_testing
    predictions = tester.predict_frame(tgt_pose, view_tgt_pose, secondary_poses,
  File "/home/kapilc/HARSHA/ViP-NeRF/src/Tester01.py", line 63, in predict_frame
    output_dict = self.model(input_dict, sec_views_vis=secondary_poses is not None)
  File "/home/kapilc/Softwares/Anaconda/anaconda3/envs/ViP_NeRF_GPU/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/kapilc/Softwares/Anaconda/anaconda3/envs/ViP_NeRF_GPU/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 154, in forward
    raise RuntimeError("module must have its parameters and buffers "
RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu
Program ended at 01/09/2023 12:14:45 PM
Execution time: 0:00:04.746777

However, when I ran the testing on a single GPU by changing 'device': [0, 1] to 'device': [0], it worked and I got test results. I also noticed that training works with 2 GPUs; only testing fails.
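
For reference, the RuntimeError in the traceback is raised by torch.nn.DataParallel, which checks at the start of forward() that every parameter and buffer of the wrapped module is on device_ids[0]. One possible cause, which is an assumption and not something confirmed from the repository code, is that the testing path wraps or loads the model while some of its parameters are still on the CPU. The sketch below shows the general pattern for avoiding that check failure. The helper name prepare_model_for_testing and the nn.Linear stand-in are hypothetical and not part of ViP-NeRF.

```python
import torch
import torch.nn as nn


def prepare_model_for_testing(model, device_ids):
    """Move `model` to device_ids[0], then wrap it in DataParallel if possible.

    Hypothetical helper for illustration only; not part of ViP-NeRF.
    """
    if not torch.cuda.is_available():
        # No GPU visible: fall back to CPU so the sketch still runs.
        return model.to("cpu")

    # Keep only the requested ids that actually exist on this machine.
    usable = [i for i in device_ids if i < torch.cuda.device_count()]
    if not usable:
        return model.to("cpu")

    # DataParallel requires all parameters and buffers to be on
    # device_ids[0] before forward() is called. A model that was built
    # or loaded on the CPU and wrapped without this move raises the
    # RuntimeError shown in the traceback.
    model = model.to(f"cuda:{usable[0]}")
    if len(usable) > 1:
        model = nn.DataParallel(model, device_ids=usable)
    return model


if __name__ == "__main__":
    net = nn.Linear(4, 2)  # stand-in for the real model
    net = prepare_model_for_testing(net, device_ids=[0, 1])
    device = next(net.parameters()).device
    out = net(torch.zeros(8, 4, device=device))
    print(out.shape)
```

If the model is already wrapped when a checkpoint is loaded, moving the wrapper itself to cuda:0 after loading should have the same effect, since Module.to() also moves the parameters of the inner module.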