Closed: zixian-wu closed this issue 1 month ago
Could you check whether the training and test images are correctly loaded?
Thanks for your reply! Following https://github.com/ykdai/BasicPBC/issues/3, I noticed that I was training on a single GPU while num_gpu in options/train/basicpbc_pbch_train_option.yml was set to 2. I changed the value to 1, and now it works fine!
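For anyone who wants to catch this mismatch before training starts, here is a minimal sketch of a sanity check. It assumes the option file has already been parsed into a plain dict with a `num_gpu` key; the helper name `resolve_num_gpu` is hypothetical and not part of BasicPBC or BasicSR.

```python
def resolve_num_gpu(opt: dict, available_gpus: int) -> dict:
    """Return a copy of `opt` whose `num_gpu` does not exceed `available_gpus`.

    Hypothetical helper for illustration only; `opt` stands in for the
    parsed contents of the training option file.
    """
    fixed = dict(opt)  # shallow copy so the caller's dict is left untouched
    configured = opt.get("num_gpu", 1)

    # Leave non-integer settings (for example a string value) unchanged.
    if not isinstance(configured, int):
        return fixed

    if configured > available_gpus:
        print(
            f"num_gpu={configured} in the option file, but only "
            f"{available_gpus} GPU(s) are visible; using {available_gpus}."
        )
        fixed["num_gpu"] = available_gpus
    return fixed
```

In practice the second argument would probably come from `torch.cuda.device_count()`, so on a single-GPU machine `resolve_num_gpu({"num_gpu": 2}, 1)` returns a dict with `num_gpu` set to 1, which matches the manual edit described above.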
The error message is as follows:
2024-08-15 15:31:11,013 INFO: Model [PBCModel] is created.
2024-08-15 15:31:11,082 INFO: Start training from epoch: 0, iter: 0
Traceback (most recent call last):
File "/workspace/BasicPBC/basicsr/train.py", line 223, in
train_pipeline(root_path)
File "/workspace/BasicPBC/basicsr/train.py", line 161, in train_pipeline
train_data = prefetcher.next()
File "/workspace/BasicPBC/basicsr/data/prefetch_dataloader.py", line 76, in next
return next(self.loader)
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in next
data = self._next_data()
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
return self._process_data(data)
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
data.reraise()
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/_utils.py", line 706, in reraise
raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
data = fetcher.fetch(index) # type: ignore[possibly-undefined]
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
return self.collate_fn(data)
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 317, in default_collate
return collate(batch, collate_fn_map=default_collate_fn_map)
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 155, in collate
clone.update({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 155, in
clone.update({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 142, in collate
return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 223, in collate_numpy_array_fn
return collate([torch.as_tensor(b) for b in batch], collate_fn_map=collate_fn_map)
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 142, in collate
return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 213, in collate_tensorfn
out = elem.new(storage).resize(len(batch), *list(elem.size()))
RuntimeError: Trying to resize storage that is not resizable
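A note for readers who land here with the same traceback: the failure is raised while `default_collate` stacks the samples of one batch inside a DataLoader worker. A common way to trigger this message is a batch whose samples have different shapes under the same key. Whether that was the underlying cause in this issue is not confirmed by the thread. The sketch below is one way to check a handful of samples by hand. It assumes each sample is a dict whose array-like values expose a `.shape` attribute (as NumPy arrays and tensors do); the function name `find_shape_mismatches` is hypothetical.

```python
from collections import defaultdict


def find_shape_mismatches(samples):
    """Return {key: set_of_shapes} for every key whose shape differs across samples.

    Hypothetical diagnostic helper; `samples` is a list of dicts such as
    the items a dataset returns before they are collated into a batch.
    """
    shapes = defaultdict(set)
    for sample in samples:
        for key, value in sample.items():
            shape = getattr(value, "shape", None)
            if shape is not None:  # skip strings, ints and other shapeless values
                shapes[key].add(tuple(shape))
    # Keep only the keys for which more than one distinct shape was seen.
    return {key: seen for key, seen in shapes.items() if len(seen) > 1}
```

Calling it on a few items, for example `find_shape_mismatches([dataset[i] for i in range(8)])` with `dataset` standing in for the training dataset object, returns an empty dict when every key has a consistent shape. Any key it reports is a candidate for the collate failure above.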