lufficc / SSD

High quality, fast, modular reference implementation of SSD in PyTorch
MIT License
1.52k stars, 384 forks

ERROR: Unexpected segmentation fault encountered in worker. -training mydatasets #185

Open zcf2020DPL opened 3 years ago

zcf2020DPL commented 3 years ago

ERROR: Unexpected segmentation fault encountered in worker.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 724, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/opt/conda/lib/python3.6/queue.py", line 173, in get
    self.not_empty.wait(remaining)
  File "/opt/conda/lib/python3.6/threading.py", line 299, in wait
    gotit = waiter.acquire(True, timeout)
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 19790) is killed by signal: Segmentation fault.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 114, in <module>
    main()
  File "train.py", line 105, in main
    model = train(cfg, args)
  File "train.py", line 44, in train
    model = do_train(cfg, model, train_loader, optimizer, scheduler, checkpointer, device, arguments, args)
  File "/home/ps/project/SSD/ssd/engine/trainer.py", line 76, in do_train
    for iteration, (images, targets, _) in enumerate(data_loader, start_iter):
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 804, in __next__
    idx, data = self._get_data()
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 761, in _get_data
    success, data = self._try_get_data()
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 737, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
RuntimeError: DataLoader worker (pid(s) 19790) exited unexpectedly
root@ps:/home/ps/project/SSD#

My dataset is CrowdHuman converted to VOC format. There is no error when I train on VOC or COCO.

Is the problem caused by bounding boxes that extend outside the image, or by boxes with a width of 0 or 1? Thanks.

lufficc commented 3 years ago

Yes, the bounding boxes in CrowdHuman extend outside the image.
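A minimal sketch of how out-of-image boxes could be cleaned up before training. This is an illustration, not code from this repository: the function name `clamp_boxes`, the `min_size` threshold, and the assumption of 0-based `(xmin, ymin, xmax, ymax)` pixel coordinates are all hypothetical and should be adapted to the actual annotation format.

```python
def clamp_boxes(boxes, img_w, img_h, min_size=2):
    """Clip boxes to the image and drop boxes that end up too small.

    Assumes 0-based (xmin, ymin, xmax, ymax) pixel coordinates.
    `min_size` is an illustrative threshold, not a value from the SSD repo.
    """
    kept = []
    for xmin, ymin, xmax, ymax in boxes:
        # Clip every coordinate into the valid pixel range of the image.
        xmin = min(max(xmin, 0), img_w - 1)
        ymin = min(max(ymin, 0), img_h - 1)
        xmax = min(max(xmax, 0), img_w - 1)
        ymax = min(max(ymax, 0), img_h - 1)
        # Discard boxes that are degenerate after clipping.
        if xmax - xmin >= min_size and ymax - ymin >= min_size:
            kept.append((xmin, ymin, xmax, ymax))
    return kept


boxes = [(-10, 5, 50, 60), (30, 30, 30, 80), (90, 90, 200, 200)]
cleaned = clamp_boxes(boxes, img_w=100, img_h=100)
```

Here the first box is clipped at the left edge, the second is dropped because its width is 0, and the third is clipped at the right and bottom edges. Images whose box list becomes empty after this step may need to be skipped as well.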

zcf2020DPL commented 3 years ago

"Yes, the bbox of crowdhuman is out of image..."

I restricted the labels to lie inside the picture, with a width and height of at least 1, but the problem above still occurs. Could there be some other reason for it? This problem has bothered me for two weeks.
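One way to narrow this down is to load every sample in the main process, outside the DataLoader workers, so that a failure is tied to a specific index. The sketch below is only an illustration: `find_bad_samples` is a hypothetical helper, and it assumes the dataset object supports `len()` and integer indexing, as PyTorch map-style datasets do.

```python
def find_bad_samples(dataset, verbose=False):
    """Index every sample once and collect the ones that raise.

    Hypothetical helper; assumes `dataset` supports len() and dataset[idx].
    """
    bad = []
    for idx in range(len(dataset)):
        if verbose:
            # Flushed so the last printed index survives a hard crash.
            print("loading sample", idx, flush=True)
        try:
            dataset[idx]
        except Exception as exc:
            bad.append((idx, type(exc).__name__, str(exc)))
    return bad
```

A segmentation fault cannot be caught by `except`, so with `verbose=True` the last printed index points at the sample being loaded when the process died. Running the script as `python -X faulthandler` additionally makes Python print a traceback on a segfault. Creating the DataLoader with `num_workers=0` has a similar effect during training, since samples are then loaded in the main process.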