Great work on DeepSnake! When training the network, I often run into the following error:
Traceback (most recent call last):
  File "/home/ethan/Documents/CircleSnake/train_net.py", line 56, in <module>
    main()
  File "/home/ethan/Documents/CircleSnake/train_net.py", line 51, in main
    train(cfg, network)
  File "/home/ethan/Documents/CircleSnake/train_net.py", line 26, in train
    trainer.train(epoch, train_loader, optimizer, recorder)
  File "/home/ethan/Documents/CircleSnake/lib/train/trainers/trainer.py", line 33, in train
    for iteration, batch in enumerate(data_loader):
  File "/home/ethan/miniconda3/envs/snake/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/ethan/miniconda3/envs/snake/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/home/ethan/miniconda3/envs/snake/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/home/ethan/miniconda3/envs/snake/lib/python3.7/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 2.
Original Traceback (most recent call last):
  File "/home/ethan/miniconda3/envs/snake/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/ethan/miniconda3/envs/snake/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "/home/ethan/Documents/CircleSnake/lib/datasets/collate_batch.py", line 8, in circle_snake_collator
    meta = default_collate([b['meta'] for b in batch])
  File "/home/ethan/miniconda3/envs/snake/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/home/ethan/miniconda3/envs/snake/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 74, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/home/ethan/miniconda3/envs/snake/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 64, in default_collate
    return default_collate([torch.as_tensor(b) for b in batch])
  File "/home/ethan/miniconda3/envs/snake/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 54, in default_collate
    storage = elem.storage()._new_shared(numel)
  File "/home/ethan/miniconda3/envs/snake/lib/python3.7/site-packages/torch/storage.py", line 155, in _new_shared
    return cls._new_using_filename(size)
RuntimeError: falseINTERNAL ASSERT FAILED at "/opt/conda/conda-bld/pytorch_1639180594101/work/aten/src/ATen/MapAllocator.cpp":263, please report a bug to PyTorch. unable to open shared memory object </torch_30948_1> in read-write mode
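From the traceback, the worker dies while default_collate copies the batched 'meta' tensors into a new shared-memory segment, so before filing this I wanted to rule out the usual environment causes of this MapAllocator error (per-process open-file limit and /dev/shm headroom). Below is the small diagnostic snippet I run before training; it is my own sketch, not part of CircleSnake, and only standard-library calls are used:

```python
import resource
import shutil

# Raise the soft open-file limit to the hard limit: with the "file_system"
# sharing strategy each tensor a worker sends back is backed by a /dev/shm
# object, so a low ulimit can make shm_open fail.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
print(f"RLIMIT_NOFILE raised from {soft} to {hard}")

# Check how much room is left on /dev/shm, since an exhausted shared-memory
# filesystem is another common cause of this failure.
usage = shutil.disk_usage("/dev/shm")
print(f"/dev/shm free: {usage.free / 2**20:.0f} MiB of {usage.total / 2**20:.0f} MiB")
```

Neither check turns up anything obviously wrong on my machine. Setting num_workers to 0 for the training DataLoader avoids the shared-memory path entirely (default_collate only allocates shared storage inside worker processes), but that slows training down too much to be a real fix.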
Do you have any suggestions?