aim-uofa / AdelaiDet

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
https://git.io/AdelaiDet

ABCnet: CUDA error: device-side assert triggered #387

Closed Pxtri2156 closed 3 years ago

Pxtri2156 commented 3 years ago

Hi everyone. I am trying to train ABCNet on a Vietnamese dataset, but I get the following error:

  File "tools/train_net.py", line 221, in <module>
    args=(args,),
  File "/home/tiennv/anaconda3_v2/lib/python3.7/site-packages/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "tools/train_net.py", line 209, in main
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [14,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [16,0,0] Assertion `t >= 0 && t < n_classes` failed.
    return trainer.train()
  File "tools/train_net.py", line 90, in train
    self.train_loop(self.start_iter, self.max_iter)
  File "tools/train_net.py", line 79, in train_loop
    self.run_step()
  File "/home/tiennv/anaconda3_v2/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 495, in run_step
    self._trainer.run_step()
  File "/home/tiennv/anaconda3_v2/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 273, in run_step
    loss_dict = self.model(data)
  File "/home/tiennv/anaconda3_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mlcv/WorkingSpace/SceneText/tripx/AdelaiDet/adet/modeling/one_stage_detector.py", line 123, in forward
    _, detector_losses = self.roi_heads(images, features, proposals, gt_instances)
  File "/home/tiennv/anaconda3_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mlcv/WorkingSpace/SceneText/tripx/AdelaiDet/adet/modeling/roi_heads/text_head.py", line 163, in forward
    preds, rec_loss = self.recognizer(bezier_features, targets)
  File "/home/tiennv/anaconda3_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mlcv/WorkingSpace/SceneText/tripx/AdelaiDet/adet/modeling/roi_heads/attn_predictor.py", line 130, in forward
    decoder_input, decoder_hidden, rois)
  File "/home/tiennv/anaconda3_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mlcv/WorkingSpace/SceneText/tripx/AdelaiDet/adet/modeling/roi_heads/attn_predictor.py", line 79, in forward
    attn_weights = self.vat(torch.tanh(alpha))  # (T * n, 1)
RuntimeError: CUDA error: device-side assert triggered

The Vietnamese dictionary has 230 characters, so I set _C.MODEL.BATEXT.VOC_SIZE = 230, but that did not solve the problem. Can you help me fix this?

Pxtri2156 commented 3 years ago

I just fixed this problem. I had to increase the dictionary size by one, that is: _C.MODEL.BATEXT.VOC_SIZE = 230 + 1
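
The assertion `t >= 0 && t < n_classes` in the log means that at least one target index passed to the loss was outside the number of classes it was configured for. Below is a minimal sketch of a sanity check that could be run on the labels before training. The function name `check_label_range` and the sample labels are hypothetical, and the sketch assumes that a 230-character dictionary uses indices 0..229 with one extra index (230) reserved for something like padding. That assumption matches the fix above but is not confirmed against the AdelaiDet source.

```python
def check_label_range(label_sequences, voc_size):
    """Return (sequence index, position, value) for every out-of-range label.

    Applies the same condition the CUDA kernel asserts: 0 <= t < n_classes.
    """
    bad = []
    for i, seq in enumerate(label_sequences):
        for j, t in enumerate(seq):
            if not (0 <= t < voc_size):
                bad.append((i, j, t))
    return bad


# Hypothetical data: characters map to indices 0..229, and index 230
# is assumed to be an extra symbol (for example padding).
dict_size = 230
labels = [[0, 17, 229, 230, 230], [5, 42, 230, 230, 230]]

# With VOC_SIZE = 230, every label equal to 230 is reported as out of range.
print(check_label_range(labels, voc_size=dict_size))

# With VOC_SIZE = 230 + 1, no label is out of range.
print(check_label_range(labels, voc_size=dict_size + 1))
```

Running a check like this on the CPU reports the offending label directly, which is usually easier to read than a device-side assert that surfaces at a later, unrelated line of the traceback.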