Closed cbn3 closed 1 year ago
CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
File "/root/autodl-tmp/project/deformable-detr-batchformer/util/box_ops.py", line 60, in generalized_box_iou
assert (boxes1[:, 2:] >= boxes1[:, :2]).all()
File "/root/autodl-tmp/project/deformable-detr-batchformer/models/matcher.py", line 87, in forward
cost_giou = -generalized_box_iou(box_cxcywh_to_xyxy(out_bbox),
File "/root/autodl-tmp/project/deformable-detr-batchformer/models/deformable_detr.py", line 342, in forward
indices = self.matcher(outputs_without_aux, targets)
File "/root/autodl-tmp/project/deformable-detr-batchformer/engine.py", line 45, in train_one_epoch
loss_dict = criterion(outputs, targets)
File "/root/autodl-tmp/project/deformable-detr-batchformer/main.py", line 282, in main
train_stats = train_one_epoch(
File "/root/autodl-tmp/project/deformable-detr-batchformer/main.py", line 334, in
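The hint in the error message about CUDA_LAUNCH_BLOCKING=1 can also be applied from inside Python. The following is a minimal sketch, assuming the variable is set at the very top of the entry script (here presumably main.py) before any CUDA work starts. With synchronous launches the traceback should point at the operation that actually failed.

```python
import os

# Must be set before the first CUDA call (safest: before importing torch),
# otherwise the setting may have no effect.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # import torch only after the variable is set
```

Training runs slower with this setting, so it is best removed once the failing line is found.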
I would expect you to hit this issue even without adding BatchFormerV2. Besides, have you tried it with a single GPU?
I used your deformable detr batchformer code directly. My own Deformable DETR runs normally. I ran main.py on a single GPU.
The Deformable DETR code in batchformerv2 also runs normally for me.
Could you provide your run scripts (hyperparameters)?
Actually, the part that was revised relative to the original Deformable-DETR is mainly as follows,
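Hello author, after I increased the number of classes by 1, that error no longer appears. Thank you for the prompt reply! In the original Deformable DETR source code, I set the number of classes in the deformable detr file under models to 10 and it runs normally, but in the batchformerv2 version I have to set the number of classes to 11 for it to run normally. Why is that? My dataset itself has 10 classes.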
Possibly the dataset contains a class label of 10 or more, which is out of range when the number of classes is 10. But that cannot explain why the original Deformable DETR does not suffer from this issue.
CUDA error: device-side assert triggered: this usually means an index is out of range for the tensor being indexed, for example a[10] when a is an array of length 10.
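A minimal sketch of checking the labels on the CPU, where an out-of-range value is easy to read off. It assumes targets is a list of dicts with a "labels" entry, as in DETR-style training loops. The function name find_out_of_range_labels and the sample data are hypothetical, not part of the repository.

```python
def find_out_of_range_labels(targets, num_classes):
    """Return (image_index, label) pairs whose label is not in [0, num_classes)."""
    bad = []
    for i, target in enumerate(targets):
        # With PyTorch tensors, iterate over target["labels"].tolist() instead.
        for label in target["labels"]:
            if not 0 <= label < num_classes:
                bad.append((i, label))
    return bad


# Hypothetical targets: labels numbered 1..10 instead of 0..9.
targets = [{"labels": [1, 4]}, {"labels": [10]}]
print(find_out_of_range_labels(targets, num_classes=10))
print(find_out_of_range_labels(targets, num_classes=11))
```

If the first call reports any labels, either the annotations need remapping to 0..9 or the number of classes has to be raised, which would be consistent with 11 working here.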
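I find it strange too. You could print the labels and take a look. My method only mixes features and does not modify any labels. Is it possible that this module failed and the network diverged? Print the values of the boxes in assert (boxes1[:, 2:] >= boxes1[:, :2]).all() and check whether they violate this assertion.
OK, thank you.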
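A minimal sketch of the box check, assuming the predicted boxes are in (cx, cy, w, h) format and that box_cxcywh_to_xyxy computes corners as centre plus or minus half the size. The helper find_bad_boxes and the sample values are hypothetical. With tensors, out_bbox.tolist() could be passed in.

```python
import math


def find_bad_boxes(boxes_cxcywh):
    """Return indices of boxes that would fail x1 >= x0 and y1 >= y0 after conversion."""
    bad = []
    for i, (cx, cy, w, h) in enumerate(boxes_cxcywh):
        x0, y0 = cx - 0.5 * w, cy - 0.5 * h
        x1, y1 = cx + 0.5 * w, cy + 0.5 * h
        # Any comparison with NaN is False, so NaN boxes fail the check too.
        if not (x1 >= x0 and y1 >= y0):
            bad.append(i)
    return bad


boxes = [
    (0.5, 0.5, 0.2, 0.2),       # valid
    (0.5, 0.5, -0.1, 0.2),      # negative width
    (math.nan, 0.5, 0.2, 0.2),  # NaN centre, e.g. after training diverges
]
print(find_bad_boxes(boxes))
```

NaN entries in the output would suggest that the network diverged rather than that the labels are wrong.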
CUDA error: device-side assert triggered
File "/root/autodl-tmp/project/deformable-detr-batchformer/models/matcher.py", line 81, in forward
cost_class = pos_cost_class[:, tgt_ids] - neg_cost_class[:, tgt_ids]
File "/root/autodl-tmp/project/deformable-detr-batchformer/models/deformable_detr.py", line 342, in forward
indices = self.matcher(outputs_without_aux, targets)
File "/root/autodl-tmp/project/deformable-detr-batchformer/engine.py", line 45, in train_one_epoch
loss_dict = criterion(outputs, targets)
File "/root/autodl-tmp/project/deformable-detr-batchformer/main.py", line 282, in main
train_stats = train_one_epoch(
File "/root/autodl-tmp/project/deformable-detr-batchformer/main.py", line 334, in
main(args)
What causes the error above, and how can it be fixed?