jackyjsy / CVPR21Chal-SLR

This repo contains the official code of our work SAM-SLR which won the CVPR 2021 Challenge on Large Scale Signer Independent Isolated Sign Language Recognition.
Creative Commons Zero v1.0 Universal

RuntimeError: DataLoader worker (pid(s) 49) exited unexpectedly when trying to test config files in Conv3D #5

Closed: snorlaxse closed this issue 3 years ago

snorlaxse commented 3 years ago

Greetings, and congratulations on your work. I was trying to run "python Sign_Isolated_Conv3D_clip_test.py" inside the /Conv3D/ folder and ran into the following error. The full traceback is pasted below.

---Traceback Starts----
Using 2 GPUs
######################Testing Started#######################
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
Traceback (most recent call last):
  File "/home/smilelab_slr/SLR_pytorch/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 872, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/home/smilelab_slr/SLR_pytorch/env/lib/python3.8/queue.py", line 179, in get
    self.not_empty.wait(remaining)
  File "/home/smilelab_slr/SLR_pytorch/env/lib/python3.8/threading.py", line 306, in wait
    gotit = waiter.acquire(True, timeout)
  File "/home/smilelab_slr/SLR_pytorch/env/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 49) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "Sign_Isolated_Conv3D_clip_test.py", line 137, in <module>
    val_loss = val_epoch(model, criterion, val_loader, device, 0, logger, writer, phase=phase, exp_name=exp_name)
  File "/home/smilelab_slr/Isolated_SLR/CVPR21Chal-SLR/Conv3D/validation_clip.py", line 13, in val_epoch
    for batch_idx, data in enumerate(dataloader):
  File "/home/smilelab_slr/SLR_pytorch/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/smilelab_slr/SLR_pytorch/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1068, in _next_data
    idx, data = self._get_data()
  File "/home/smilelab_slr/SLR_pytorch/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1024, in _get_data
    success, data = self._try_get_data()
  File "/home/smilelab_slr/SLR_pytorch/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 885, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 49) exited unexpectedly

----Traceback Ends----

Any insights will be appreciated. Thanks.

snorlaxse commented 3 years ago

Fixed by raising the Docker shared memory limit: set shm-size="12g".
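
For anyone who hits the same bus error and cannot change the container's shared memory limit, here is a minimal sketch of a pre-flight check. It is not part of this repo. The helper names `shm_free_gib` and `pick_num_workers` and the 1 GiB threshold are assumptions made up for illustration. The idea is to look at the free space under /dev/shm and fall back to `num_workers=0` (loading in the main process, so no worker processes pass batches through shared memory) when it looks too small.

```python
import os
import shutil


def shm_free_gib(path="/dev/shm"):
    """Return the free space at `path` in GiB, or None if the path is missing.

    /dev/shm is where Linux usually mounts shared memory; on systems
    without it, this returns None instead of raising.
    """
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).free / 2**30


def pick_num_workers(requested, min_free_gib=1.0, path="/dev/shm"):
    """Hypothetical helper: choose a worker count for a DataLoader.

    Returns 0 when the free shared memory is known to be below
    `min_free_gib` (an arbitrary threshold, tune it for your batch size),
    otherwise returns `requested` unchanged.
    """
    free = shm_free_gib(path)
    if free is not None and free < min_free_gib:
        return 0
    return requested


# Report what this machine or container currently offers.
print("free shared memory (GiB):", shm_free_gib())
print("num_workers to use:", pick_num_workers(4))
```

The returned value would then be passed as the `num_workers` argument when the test script builds its `DataLoader`. Loading with zero workers is slower, so raising the shared memory limit as described above is still the better fix when it is available.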