THU-MIG / yolov10

YOLOv10: Real-Time End-to-End Object Detection
https://arxiv.org/abs/2405.14458
GNU Affero General Public License v3.0

RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling cublasSgemv(handle, op, m, n, &alpha, a, lda, x, incx, &beta, y, incy) #209

Open super-song-sir opened 1 month ago

super-song-sir commented 1 month ago

After selecting the model with model = YOLOv10('yolov10n.yaml') and training yolov10n.yaml, the run completes one training epoch and then fails during validation with:

CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling cublasSgemv(handle, op, m, n, &alpha, a, lda, x, incx, &beta, y, incy)

It is accompanied by this warning:

UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
index = index // nc

When I train yolov10x.yaml instead, the same warning appears but no error is raised. What causes this, and how can I fix it?
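The warning quoted above is only about how integer division rounds negative quotients. The following is a minimal sketch in plain Python (not PyTorch) of the two rounding modes it names. The helper names `div_trunc` and `div_floor` and the value `nc = 80` are hypothetical, and the sketch assumes `index` holds non-negative values, which the issue does not state. Under that assumption both modes give the same result, so the warning is probably unrelated to the cuBLAS error.

```python
import math


def div_trunc(a: int, b: int) -> int:
    """Integer division rounding toward zero, like rounding_mode='trunc'."""
    return math.trunc(a / b)


def div_floor(a: int, b: int) -> int:
    """Integer division rounding toward negative infinity, like rounding_mode='floor'."""
    return a // b


nc = 80  # hypothetical number of classes, for illustration only

# For non-negative operands the two modes agree.
for index in (0, 79, 80, 161):
    assert div_trunc(index, nc) == div_floor(index, nc)

# They differ only when the quotient is negative.
print(div_trunc(-7, 2), div_floor(-7, 2))  # -3 -4
```

In PyTorch itself, the warning can be silenced by replacing `index // nc` with `torch.div(index, nc, rounding_mode='floor')`, as the message suggests.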

leonnil commented 1 month ago

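Hello! Could you share more specific information (for example, your PyTorch version, CUDA version, and operating system version)? Please also check that LD_LIBRARY_PATH is set correctly. Thanks!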
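One way to gather the details requested above is a short script like the following sketch. The function name `collect_env` and the dictionary keys are hypothetical; the torch fields stay `None` if PyTorch cannot be imported.

```python
import os
import platform
import sys


def collect_env() -> dict:
    """Gather OS, Python, PyTorch and CUDA details for a bug report."""
    info = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH"),
        "torch": None,
        "torch_cuda": None,
        "cuda_available": None,
    }
    try:
        import torch
    except ImportError:
        return info  # PyTorch not installed; report what is available
    info["torch"] = torch.__version__
    # CUDA version this PyTorch build was compiled against (None for CPU-only builds).
    info["torch_cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

PyTorch also ships a more complete report, which can be run with `python -m torch.utils.collect_env`.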

Michealpeng commented 1 month ago

    def bbox_decode(self, anchor_points, pred_dist):
        """Decode predicted object bounding box coordinates from anchor points and distribution."""
        if self.use_dfl:
            b, a, c = pred_dist.shape  # batch, anchors, channels
            # Original line, now disabled:
            # pred_dist = pred_dist.view(b, a, 4, c // 4).softmax(3).matmul(self.proj.type(pred_dist.dtype))
            # pred_dist = pred_dist.view(b, a, c // 4, 4).transpose(2,3).softmax(3).matmul(self.proj.type(pred_dist.dtype))
            # Replacement without matmul:
            pred_dist = (pred_dist.view(b, a, c // 4, 4).softmax(2) * self.proj.type(pred_dist.dtype).view(1, 1, -1, 1)).sum(2)
        return dist2bbox(pred_dist, anchor_points, xywh=False)

Switching which pred_dist line is used in loss.py fixes it.
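The suggested change swaps a `matmul` for an elementwise multiply and sum. Note that the replacement line also reshapes to `(c // 4, 4)` rather than `(4, c // 4)`, which groups the channels differently from the original line. The sketch below compares the three forms on random CPU data. The function names are hypothetical, `reg_max = 16` is an assumed number of distribution bins, and `proj` is assumed to be `torch.arange(reg_max)`. It only checks the arithmetic and does not diagnose the cuBLAS error itself.

```python
import torch


def decode_matmul(pred_dist, proj):
    """Original form: softmax over the bins, then matmul with the bin values."""
    b, a, c = pred_dist.shape
    return pred_dist.view(b, a, 4, c // 4).softmax(3).matmul(proj.type(pred_dist.dtype))


def decode_sum_same_layout(pred_dist, proj):
    """matmul-free form that keeps the original (4, c // 4) channel layout."""
    b, a, c = pred_dist.shape
    probs = pred_dist.view(b, a, 4, c // 4).softmax(3)
    return (probs * proj.type(pred_dist.dtype).view(1, 1, 1, -1)).sum(3)


def decode_sum_posted(pred_dist, proj):
    """The replacement line as posted, using a (c // 4, 4) channel layout."""
    b, a, c = pred_dist.shape
    probs = pred_dist.view(b, a, c // 4, 4).softmax(2)
    return (probs * proj.type(pred_dist.dtype).view(1, 1, -1, 1)).sum(2)


torch.manual_seed(0)
reg_max = 16  # assumed number of distribution bins
proj = torch.arange(reg_max, dtype=torch.float)
pred_dist = torch.randn(2, 5, 4 * reg_max)  # batch=2, anchors=5, channels=64

ref = decode_matmul(pred_dist, proj)
same = decode_sum_same_layout(pred_dist, proj)
posted = decode_sum_posted(pred_dist, proj)

print("same layout matches matmul:", torch.allclose(ref, same, atol=1e-5))
print("posted line matches matmul:", torch.allclose(ref, posted, atol=1e-5))
```

On random input the same-layout variant should agree with the original `matmul` line, while the posted line generally should not. If the goal is only to avoid the `cublasSgemv` call without changing the decoded values, the same-layout variant may be the safer substitute.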