ykdai / BasicPBC

Official Implementation of "Learning Inclusion Matching for Animation Paint Bucket Colorization"

training with the train/PaintBucket_Char, encountered some errors #28

Closed zixian-wu closed 1 month ago

zixian-wu commented 1 month ago

The error message is as follows:

2024-08-15 15:31:11,013 INFO: Model [PBCModel] is created.
2024-08-15 15:31:11,082 INFO: Start training from epoch: 0, iter: 0
Traceback (most recent call last):
  File "/workspace/BasicPBC/basicsr/train.py", line 223, in <module>
    train_pipeline(root_path)
  File "/workspace/BasicPBC/basicsr/train.py", line 161, in train_pipeline
    train_data = prefetcher.next()
  File "/workspace/BasicPBC/basicsr/data/prefetch_dataloader.py", line 76, in next
    return next(self.loader)
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
    return self._process_data(data)
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
    data.reraise()
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/_utils.py", line 706, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
    return self.collate_fn(data)
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 317, in default_collate
    return collate(batch, collate_fn_map=default_collate_fn_map)
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 155, in collate
    clone.update({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 155, in <dictcomp>
    clone.update({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 142, in collate
    return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 223, in collate_numpy_array_fn
    return collate([torch.as_tensor(b) for b in batch], collate_fn_map=collate_fn_map)
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 142, in collate
    return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
  File "/opt/miniconda/envs/basicpbc/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 213, in collate_tensor_fn
    out = elem.new(storage).resize_(len(batch), *list(elem.size()))
RuntimeError: Trying to resize storage that is not resizable

ykdai commented 1 month ago

Could you check whether the training and test images are correctly loaded?

zixian-wu commented 1 month ago

Thanks for your reply! Following https://github.com/ykdai/BasicPBC/issues/3, I found that num_gpu in options/train/basicpbc_pbch_train_option.yml was set to 2, while I was training with a single GPU. I changed the value to 1, and now it works fine!
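
For anyone hitting the same mismatch, a small pre-flight check can flag a num_gpu value larger than the number of GPUs actually present before training starts. The sketch below is not part of BasicPBC; `resolve_num_gpu` and `detect_available_gpus` are hypothetical helper names, and it assumes the configured value has already been read from the option file. The thread does not explain why the mismatch surfaces as a collate error, so this only guards against the configuration problem itself.

```python
def resolve_num_gpu(configured: int, available: int) -> int:
    """Return a GPU count that never exceeds what the machine provides.

    `configured` is the num_gpu value from the training option file;
    `available` is the number of GPUs visible to the process.
    """
    if configured < 0 or available < 0:
        raise ValueError("GPU counts must be non-negative")
    if configured > available:
        print(f"num_gpu={configured} exceeds the {available} visible GPU(s); "
              f"using {available} instead")
    return min(configured, available)


def detect_available_gpus() -> int:
    """Count visible CUDA devices, or return 0 if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


if __name__ == "__main__":
    # Hypothetical usage: 2 is the value reported in this issue.
    print(resolve_num_gpu(configured=2, available=detect_available_gpus()))
```

Editing num_gpu in the option file, as described above, remains the direct fix; a check like this only makes the mismatch visible earlier.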