Training on my own dataset: error occurs at a middle epoch partway through training

yuxin7 commented 2 years ago

---- [Epoch 12/100] ----
+------------------+--------------------+--------------------+---------------------+---------------------+
| Step: 4563/38900 | loss               | reg_loss           | conf_loss           | cls_loss            |
+------------------+--------------------+--------------------+---------------------+---------------------+
| YoloLayer1       | 0.5429285764694214 | 0.3865797519683838 | 0.11499390006065369 | 0.04135490208864212 |
| YoloLayer2       | 0.7612576484680176 | 0.4948746860027313 | 0.16640125215053558 | 0.09998173266649246 |
| YoloLayer3       | 1.0451996326446533 | 0.6731263399124146 | 0.20551152527332306 | 0.16656182706356049 |
+------------------+--------------------+--------------------+---------------------+---------------------+
Total Loss: 2.349386, Runtime: 6990.246672
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [25,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [26,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [27,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [28,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [29,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [30,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [31,0,0] Assertion input_val >= zero && input_val <= one failed.
Traceback (most recent call last):
  File "D:/WorkSpace/PythonWorkSpace/R-YOLOv4/train.py", line 174, in <module>
    t.train()
  File "D:/WorkSpace/PythonWorkSpace/R-YOLOv4/train.py", line 151, in train
    outputs, loss = self.model(imgs, targets)
  File "D:\Software\Anconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\WorkSpace\PythonWorkSpace\R-YOLOv4\model\yolo.py", line 35, in forward
    y1, loss1 = self.yolo1(x2, target)
  File "D:\Software\Anconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\WorkSpace\PythonWorkSpace\R-YOLOv4\model\yololayer.py", line 202, in forward
    cls_loss += F.binary_cross_entropy(pred_cls[obj_mask], tcls[obj_mask], reduction=self.reduction)
RuntimeError: CUDA error: device-side assert triggered
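
A note on reading this log: the assertion comes from the binary cross-entropy kernel, which requires every input value to satisfy 0 <= x <= 1. A NaN fails that comparison too, so the assert can fire either on out-of-range values or on NaN predictions. The thread does not establish which applies here. The following is only a diagnostic sketch, not the repository's code: `check_unit_interval` and `RangeReport` are hypothetical helpers, and the commented usage assumes the tensors `pred_cls`, `tcls`, and `obj_mask` named in the traceback.

```python
import math
from typing import Iterable, NamedTuple


class RangeReport(NamedTuple):
    """Counts of values that would break the 0 <= x <= 1 requirement."""
    total: int
    nan: int
    below_zero: int
    above_one: int

    @property
    def ok(self) -> bool:
        # True only when every value is a number inside [0, 1].
        return self.nan == 0 and self.below_zero == 0 and self.above_one == 0


def check_unit_interval(values: Iterable[float]) -> RangeReport:
    """Classify each value as NaN, below 0, above 1, or valid."""
    total = nan = below = above = 0
    for v in values:
        total += 1
        if math.isnan(v):
            nan += 1
        elif v < 0.0:
            below += 1
        elif v > 1.0:
            above += 1
    return RangeReport(total, nan, below, above)


# Hypothetical usage just before the failing F.binary_cross_entropy call
# (assumes PyTorch tensors with the names shown in the traceback):
#   for name, t in (("pred_cls", pred_cls[obj_mask]), ("tcls", tcls[obj_mask])):
#       report = check_unit_interval(t.detach().cpu().flatten().tolist())
#       if not report.ok:
#           print(name, report)
```

If the report shows NaN in the predictions, the problem likely started earlier in training (for example a diverging loss or a bad sample); if it shows out-of-range targets, the labels are the place to look. Setting the environment variable `CUDA_LAUNCH_BLOCKING=1`, or running the failing batch on the CPU, usually gives a more precise error location than the asynchronous CUDA report.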

yingkunwu commented 2 years ago

Hello, could I take a look at your dataset? A few examples would be enough. Thanks!

yuxin7 commented 2 years ago

This is the data I used. Could you please take a look? Thank you for your reply. You can also add me on WeChat via the QQ number 2252685386.

yingkunwu commented 2 years ago

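This is the data I used. Could you please take a look? Thank you for your reply. You can also add me on WeChat via the QQ number 2252685386. …

I can't seem to find the data you mentioned. How did you share it?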

yuxin7 commented 2 years ago

It's in the attachment.
