All-in-One Development Tool based on PaddlePaddle (PaddlePaddle low-code development tool)
Training error: OSError: (External) CUDA error(719), unspecified launch failure. [Hint: 'cudaErrorLaunchFailure'. An exception occurred on the device while executing a kernel. #1604
W0922 09:51:58.184947 12832 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.2
W0922 09:51:58.191946 12832 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
num_epochs222 12
Daochumoxingmulu train pretrained_dir F:/AI/1233_ResNet50_vd_2022-09-22_09_51_57
pretrained_dir osp.join output/faster_rcnn_r50_fpn
2022-09-22 09:51:58 [INFO] Loading pretrained model from E:/Intsoft_Pretrain2/FasterRCNN_ResNet50_fpn_IMAGENET\ResNet50_cos_pretrained.pdparams
2022-09-22 09:51:58 [WARNING] neck.fpn_inner_res2_sum_lateral.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_inner_res2_sum_lateral.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_res2_sum.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_res2_sum.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_inner_res3_sum_lateral.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_inner_res3_sum_lateral.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_res3_sum.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_res3_sum.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_inner_res4_sum_lateral.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_inner_res4_sum_lateral.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_res4_sum.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_res4_sum.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_inner_res5_sum.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_inner_res5_sum.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_res5_sum.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] neck.fpn_res5_sum.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] rpn_head.rpn_feat.rpn_conv.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] rpn_head.rpn_feat.rpn_conv.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] rpn_head.rpn_rois_score.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] rpn_head.rpn_rois_score.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] rpn_head.rpn_rois_delta.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] rpn_head.rpn_rois_delta.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] bbox_head.head.fc6.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] bbox_head.head.fc6.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] bbox_head.head.fc7.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] bbox_head.head.fc7.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] bbox_head.bbox_score.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] bbox_head.bbox_score.bias is not in pretrained model
2022-09-22 09:51:58 [WARNING] bbox_head.bbox_delta.weight is not in pretrained model
2022-09-22 09:51:58 [WARNING] bbox_head.bbox_delta.bias is not in pretrained model
2022-09-22 09:51:58 [INFO] There are 265/295 variables loaded into FasterRCNN.
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1110529498]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1109329307]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1157890495]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1112468260]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1111433916]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1113625662]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1108166455]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1130415248]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1114690607]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1106957141]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1118707754]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1109324224]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1110389007]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1146318115]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1119027899]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1111728838]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1117886148]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1116087952]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1104819815]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1115156369]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1125361397]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1113285200]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1112926126]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1112495624]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1133989623]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1131700321]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1109823092]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1112164588]
Error: ../paddle/phi/kernels/funcs/scatter.cu.h:66 Assertion `scatter_i >= 0` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be greater than or equal to 0, but received [-1115624544]
Exception in thread paddle_train_MB:
Traceback (most recent call last):
File "D:\Anaconda3\envs\PaddleDabao11237\lib\threading.py", line 926, in _bootstrap_inner
self.run()
File "D:\Anaconda3\envs\PaddleDabao11237\lib\threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "f:\intsoft_AI_08\IntsoftAI.py", line 1740, in paddle_train_MB
use_vdl=True)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\cv\models\detector.py", line 1388, in train
early_stop_patience, use_vdl, resume_checkpoint)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\cv\models\detector.py", line 339, in train
use_vdl=use_vdl)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\cv\models\base.py", line 343, in train_loop
outputs = self.run(self.net, data, mode='train')
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\cv\models\detector.py", line 105, in run
net_out = net(inputs)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\ppdet\modeling\architectures\meta_arch.py", line 59, in forward
out = self.get_loss()
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\ppdet\modeling\architectures\faster_rcnn.py", line 95, in get_loss
rpn_loss, bbox_loss = self._forward()
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\ppdet\modeling\architectures\faster_rcnn.py", line 76, in _forward
rois, rois_num, rpn_loss = self.rpn_head(body_feats, self.inputs)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\ppdet\modeling\proposal_generator\rpn_head.py", line 135, in forward
loss = self.get_loss(scores, deltas, anchors, inputs)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\ppdet\modeling\proposal_generator\rpn_head.py", line 224, in get_loss
anchors)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\ppdet\modeling\proposal_generator\target_layer.py", line 92, in __call__
assign_on_cpu=self.assign_on_cpu)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\ppdet\modeling\proposal_generator\target.py", line 41, in rpn_anchor_target
ignore_thresh, is_crowd_i, assign_on_cpu)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\ppdet\modeling\proposal_generator\target.py", line 83, in label_box
iou = bbox_overlaps(gt_boxes, anchors)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\ppdet\modeling\bbox_utils.py", line 129, in bbox_overlaps
area1 = bbox_area(boxes1)
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddlex\ppdet\modeling\bbox_utils.py", line 111, in bbox_area
return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
File "D:\Anaconda3\envs\PaddleDabao11237\lib\site-packages\paddle\fluid\dygraph\varbase_patch_methods.py", line 740, in __getitem__
return self._getitem_index_not_tensor(item)
OSError: (External) CUDA error(719), unspecified launch failure.
[Hint: 'cudaErrorLaunchFailure'. An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointerand accessing out of bounds shared memory. Less common cases can be system specific - more information about these cases canbe found in the system specific user guide. This leaves the process in an inconsistent state and any further CUDA work willreturn the same error. To continue using CUDA, the process must be terminated and relaunched.] (at ..\paddle\phi\backends\gpu\cuda\cuda_info.cc:258)
[operator < slice > error]
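
The indices rejected by the `scatter_i >= 0` assertions above all sit in a narrow band around -1.1e9, which is unusual for an ordinary off-by-one indexing mistake. One hypothesis, not confirmed by anything in this report, is that the index tensor contains float32 bit patterns or uninitialized memory rather than real indices. A minimal sketch for checking that, using only the standard library (the helper name `int32_bits_as_float32` is made up for illustration):

```python
import struct

# A few of the index values reported by the scatter assertion in the log.
reported = [-1110529498, -1109329307, -1157890495, -1112468260, -1111433916]


def int32_bits_as_float32(value):
    """Reinterpret the 32 bits of a signed integer as an IEEE 754 float32."""
    return struct.unpack("<f", struct.pack("<i", value))[0]


as_floats = [int32_bits_as_float32(v) for v in reported]

for original, reinterpreted in zip(reported, as_floats):
    print(f"{original:>12d} -> {reinterpreted:.6f}")
```

For these five values the reinterpreted numbers all fall between -1 and 0. That is consistent with float data or garbage being read as int32 indices, but it does not by itself identify where the bad tensor comes from.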
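I start training from my own thread. Classification models train without problems and YOLO models train without problems, but FasterRCNN raises this error, which is strange.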
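
Two general CUDA points may help narrow this down. Kernels are launched asynchronously, so the Python frame where error 719 surfaces (here the `slice` in `bbox_area`) is not necessarily the kernel that failed; setting the CUDA environment variable `CUDA_LAUNCH_BLOCKING=1` makes launches synchronous so the traceback points closer to the failing op. The hint also says the process must be terminated and relaunched after this error, so running the job in a child process rather than a thread keeps the host application usable. A minimal sketch, assuming the training code can be moved into a separate script; `run_isolated` is a hypothetical helper and the child command is only a stand-in for that script:

```python
import os
import subprocess
import sys


def run_isolated(argv, env_overrides=None):
    """Run a command in a fresh process and return (returncode, combined output)."""
    env = dict(os.environ)
    env.update(env_overrides or {})
    proc = subprocess.run(
        argv,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return proc.returncode, proc.stdout


# Stand-in child process: a real run would pass the path of a training
# script here instead of this one-liner.
child = [sys.executable, "-c", "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])"]

# CUDA_LAUNCH_BLOCKING must be set before the child initializes CUDA.
returncode, output = run_isolated(child, {"CUDA_LAUNCH_BLOCKING": "1"})
print(returncode, output.strip())
```

A non-zero return code from the child then signals a failed run without leaving the parent process with a broken CUDA context.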
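Environment
Please provide the PaddlePaddle and PaddleX versions you are using: paddlepaddle-gpu 2.3.2.post112, paddlex 2.1.0
Please provide your operating system (e.g. Linux/Windows/MacOS): Windows
Which Python version are you using? Python 3.7.13
Which CUDA/cuDNN versions are you using? CUDA 11.6, cuDNN 8.20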
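
The versions listed here can be cross-checked against the two `gpu_resources.cc` lines at the top of the log. A minimal sketch that extracts them with the standard library; the function name and regular expressions are illustrative assumptions, not part of Paddle:

```python
import re

# The two startup lines copied from the log in this report.
LOG = """\
W0922 09:51:58.184947 12832 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.2
W0922 09:51:58.191946 12832 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
"""

PATTERNS = {
    "compute_capability": r"GPU Compute Capability: (\d+\.\d+)",
    "driver_api": r"Driver API Version: (\d+\.\d+)",
    "runtime_api": r"Runtime API Version: (\d+\.\d+)",
    "cudnn": r"cuDNN Version: (\d+\.\d+)",
}


def parse_gpu_versions(log_text):
    """Return the version strings found in the log, or None where a field is missing."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, log_text)
        found[name] = match.group(1) if match else None
    return found


versions = parse_gpu_versions(LOG)
for name, value in versions.items():
    print(f"{name}: {value}")
```

In this log the runtime API version is 11.2 and 11.6 is the driver API version, which appears consistent with the `post112` suffix of the installed wheel.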