Closed IncubatorShokuhou closed 2 years ago
loss.py is missing import torch.nn.functional as F. After fixing that, there is still a problem (the dataset is coco128):
Traceback (most recent call last):
  File "/mnt/data/yolov5_research/train.py", line 740, in <module>
    main(opt)
  File "/mnt/data/yolov5_research/train.py", line 637, in main
    train(opt.hyp, opt, device, callbacks)
  File "/mnt/data/yolov5_research/train.py", line 421, in train
    loss, loss_items = compute_loss_ota(pred, targets.to(device), imgs) if aux_ota_loss else compute_loss(pred, targets.to(device))  # loss scaled by batch_siz
  File "/mnt/data/yolov5_research/utils/loss.py", line 1002, in __call__
    tobj[b, a, gj, gi] = (1.0 - self.gr) + self.gr * iou.detach().clamp(0).type(tobj.dtype)  # iou ratio
RuntimeError: shape mismatch: value tensor of shape [290, 1] cannot be broadcast to indexing result of shape [290]
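Sorry, that was a bit rushed. The error comes from the dimensions of the IoU tensor. I have since reworked the logic so that it supports both v5 and v7 training: for OTA loss you need to add ota-match, and for aux OTA you need to add aux_ota_loss. Thanks for the feedback, I will fix it as soon as possible. This code also causes problems when training the P6 auxiliary head. The author has only just changed that part too, and the computation fix still has a code bug that I will look at tomorrow.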
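I think it is fixed now. The problem came from some newer v5 coding patterns conflicting with the v7 code, which had been integrated on top of an older v5. I have tested it and it should work. It is now almost fully compatible with v7, and since the v5 code is better optimized, the code style follows the current v5 version. Thanks for the feedback. I will also run training to validate it later.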
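For readers who hit the same RuntimeError, here is a minimal sketch of the shape mismatch from the traceback above and one way around it. It assumes the IoU helper returns a column of shape [n, 1], as the error message suggests. The tensor sizes, the index values, and the fill_objectness helper are made up for illustration, gr is fixed to 1.0, and this is not necessarily the exact change made in the repository.

```python
import torch

# Hypothetical targets: 5 matched anchors in a (batch, anchor, grid_y, grid_x) map.
b = torch.tensor([0, 0, 1, 1, 1])
a = torch.tensor([0, 1, 2, 0, 1])
gj = torch.tensor([0, 1, 2, 3, 0])
gi = torch.tensor([3, 2, 1, 0, 1])


def fill_objectness(iou, gr=1.0):
    """Write IoU-based objectness targets, mirroring the failing line in loss.py."""
    tobj = torch.zeros(2, 3, 4, 4)
    tobj[b, a, gj, gi] = (1.0 - gr) + gr * iou.detach().clamp(0).type(tobj.dtype)
    return tobj


iou_column = torch.linspace(0.1, 0.9, 5).unsqueeze(1)  # shape [5, 1]

try:
    # The indexing result has shape [5], but the value has shape [5, 1].
    fill_objectness(iou_column)
except RuntimeError as err:
    print("fails:", err)

# Dropping the trailing dimension makes the value shape [5], which matches.
tobj = fill_objectness(iou_column.squeeze(-1))
print(tobj[0, 0, 0, 3])
```

squeeze(-1) removes only the trailing dimension, so the number of matched anchors is left untouched.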
@positive666 Thanks a lot, it runs now. A few problems remain, though:
- Using the original yolov7-e6e.yaml as-is gives: RuntimeError: Given groups=1, weight of size [80, 3, 3, 3], expected input[1, 12, 128, 128] to have 3 channels, but got 12 channels instead
- I get a message that AMP cannot be used, although I have confirmed that apex is installed correctly (from apex import amp).
- During training I see: WARNING: NMS time limit 1.060s exceeded
=================Divider====================
About the AMP problem in item 2: switching to check_amp from the latest version of yolov5 fixed it.
q1. Are you training? Could you provide the full command? q2. Updated. q3. That warning comes from a time limit.
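The warning in q3 points to a time budget on NMS. How the limit is computed is not stated in this thread, so the following is only a generic sketch of the pattern with made-up names (run_with_time_limit, work, clock): items are processed one at a time, and the loop stops once the elapsed time exceeds the budget, leaving the remaining items unprocessed.

```python
import time


def run_with_time_limit(items, time_limit, work, clock=time.monotonic):
    """Apply work() to each item, stopping early once time_limit seconds have passed."""
    start = clock()
    results = [None] * len(items)  # slots for unprocessed items stay None
    for i, item in enumerate(items):
        results[i] = work(item)
        if clock() - start > time_limit:
            # Hypothetical message; the real warning text comes from the library.
            print(f"time limit {time_limit:.3f}s exceeded, stopping early")
            break
    return results


if __name__ == "__main__":
    # A fake clock keeps the demo deterministic: each call advances by 0.4 s.
    ticks = iter([0.0, 0.4, 0.8, 1.2, 1.6])
    print(run_with_time_limit([0, 1, 2, 3], 1.06, lambda x: x * x, clock=lambda: next(ticks)))
```

If the pattern applies here, the warning means some images in the batch were skipped, so a larger budget or fewer candidate boxes (for example a higher confidence threshold) would be the things to look at.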
The full command:
python train.py --cfg models/v7_cfg/training/yolov7e6e_原版.yaml --imgsz 640 --weights 'yolov7_training_weights/yolov7-e6e_training.pt' --data data/我自己的数据集.yaml --aux_ota_loss --hyp data/hyps/hyp.scratch-v7-p6.yaml --device 0 --batch-size 16 --epoch 1000 --multi-scale --cos-lr
1. I have not reproduced your error yet. It uses the original v7 yaml, but I removed ReOrg from this project. 2. Multi-scale training currently has a bug.
About ReOrg: I added it back into common.py, and that is when this error appeared. Without it the code does not run at all and reports that ReOrg is missing.
In yolo.py, add a branch that sets the output channel count:
    elif m is ReOrg:
        c2 = ch[f] * 4
With that in place it works.
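To see why the missing branch produces the "expected ... 3 channels, but got 12" error, here is a small sketch of the channel bookkeeping. It is not the real parse_model from yolo.py: reorg_shape, plan_channels, and the two-layer list are hypothetical. It also assumes that ReOrg is a space-to-depth layer (each 2x2 pixel block becomes 4 channels) and that a module without its own branch keeps c2 equal to its input channels.

```python
def reorg_shape(shape):
    """Shape after space-to-depth: channels x4, height and width halved."""
    n, c, h, w = shape
    return (n, c * 4, h // 2, w // 2)


def plan_channels(layers, in_ch=3, reorg_fixed=True):
    """Track channels layer by layer; returns a list of (name, c_in, c_out)."""
    ch = in_ch
    plan = []
    for name, out_ch in layers:
        if name == "ReOrg":
            # Without the dedicated branch, c2 is left equal to the input channels.
            c2 = ch * 4 if reorg_fixed else ch
        else:
            c2 = out_ch
        plan.append((name, ch, c2))
        ch = c2  # the next layer is built with this many input channels
    return plan


layers = [("ReOrg", None), ("Conv", 80)]  # toy stand-in for the start of the model

if __name__ == "__main__":
    print(reorg_shape((1, 3, 256, 256)))              # what the Conv actually receives
    print(plan_channels(layers, reorg_fixed=False))   # Conv built for 3 input channels
    print(plan_channels(layers, reorg_fixed=True))    # Conv built for 12 input channels
```

Under these assumptions the unfixed plan builds the first Conv with weight [80, 3, 3, 3] while the tensor it receives has 12 channels, which matches the reported error.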
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Which part of the code was changed to fix this problem (the shape mismatch RuntimeError at utils/loss.py line 1002 from the first post)?