WongKinYiu / yolor

Implementation of the paper "You Only Learn One Representation: Unified Network for Multiple Tasks" (https://arxiv.org/abs/2105.04206)
GNU General Public License v3.0

I ran into a RuntimeError — how can I fix it? I am using a non-COCO dataset #16

Open Wanghe1997 opened 3 years ago

Wanghe1997 commented 3 years ago

```
Traceback (most recent call last):
  File "train.py", line 540, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 287, in train
    pred = model(imgs)  # forward
  File "D:\Software\Anaconda3\envs\YOLOR\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "G:\datasets_models\obj_models\yolor\models\models.py", line 543, in forward
    return self.forward_once(x)
  File "G:\datasets_models\obj_models\yolor\models\models.py", line 594, in forward_once
    x = module(x, out)  # WeightedFeatureFusion(), FeatureConcat()
  File "D:\Software\Anaconda3\envs\YOLOR\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "G:\datasets_models\obj_models\yolor\utils\layers.py", line 404, in forward
    return a.expand_as(x) * x
RuntimeError: The expanded size of the tensor (39) must match the existing size (255) at non-singleton dimension 1.  Target sizes: [1, 39, 160, 160].  Tensor sizes: [1, 255, 1, 1]
```

Wanghe1997 commented 3 years ago

Could you analyze this error? Thanks.

Wanghe1997 commented 3 years ago

Jumping to the code at the last line of the error:

```python
def forward(self, x, outputs):
    a = outputs[self.layers[0]]
    return a.expand_as(x) * x
```
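The error means the implicit tensor `a` still carries the COCO head's 255 channels while the network output `x` now has 39, so the broadcast fails. As a small numpy sketch of the failing broadcast (shapes taken from the traceback above; `np.broadcast_to` plays the role of `expand_as` here):

```python
import numpy as np

# shapes from the traceback: the implicit tensor keeps the COCO head's
# 255 channels while the model output now has only 39
a = np.zeros((1, 255, 1, 1))
x = np.zeros((1, 39, 160, 160))
try:
    np.broadcast_to(a, x.shape)  # numpy analogue of a.expand_as(x)
except ValueError as e:
    print("shape mismatch:", e)
```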

WongKinYiu commented 3 years ago

https://github.com/WongKinYiu/yolor/blob/main/cfg/yolor_p6.cfg#L1567-L1581 — these need to be changed to 39 as well.

Wanghe1997 commented 3 years ago

> https://github.com/WongKinYiu/yolor/blob/main/cfg/yolor_p6.cfg#L1571-L1581 — these need to be changed to 39 as well.

I changed #207 through #210 to 39 and it runs now. Thanks!
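For context, a hedged sketch of where the 39 comes from: in the usual YOLO head arithmetic, each of the 3 anchors at a scale predicts `x, y, w, h`, objectness, and one score per class, so an 8-class dataset needs 3 × (8 + 5) = 39 output filters, while COCO's 80 classes give the cfg's default 255:

```python
def head_filters(num_classes: int, anchors_per_layer: int = 3) -> int:
    # channels per YOLO output layer: each anchor predicts xywh + objectness
    # plus one score per class
    return anchors_per_layer * (num_classes + 5)

print(head_filters(80))  # 255, the COCO default in the cfg
print(head_filters(8))   # 39, matching the target size in the traceback above
```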

Wanghe1997 commented 3 years ago

I'd also like to ask: in the paper's experiment tables for the different models, how many epochs was each model trained for?

WongKinYiu commented 3 years ago

https://github.com/WongKinYiu/yolor#training

Wanghe1997 commented 3 years ago

Hello, every time training reaches the last epoch, this error appears. Do you know how to fix it?

```
     Epoch   gpu_mem       box       obj       cls     total   targets  img_size
   299/299     19.8G    0.0144  0.007881  0.003497   0.02578        23      1280: 100%|██████████| 1734/1734 [12:41<00:00, 2.28it/s]
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95:   0%|          | 0/97 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "G:/wanghe/yolor/train.py", line 570, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "G:/wanghe/yolor/train.py", line 336, in train
    results, maps, times = test.test(opt.data,
  File "G:\wanghe\yolor\test.py", line 226, in test
    plot_images(img, output_to_target(output, width, height), paths, f, names)  # predictions
  File "G:\wanghe\yolor\utils\plots.py", line 108, in output_to_target
    return np.array(targets)
  File "E:\ProgramData\Anaconda3\envs\wanghe\lib\site-packages\torch\tensor.py", line 621, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Process finished with exit code 1
```

Then, jumping to the code at the last error location:

```python
def __array__(self, dtype=None):
    if has_torch_function_unary(self):
        return handle_torch_function(Tensor.__array__, (self,), self, dtype=dtype)
    if dtype is None:
        return self.numpy()  # <--- line 621
    else:
        return self.numpy().astype(dtype, copy=False)
```

WongKinYiu commented 3 years ago

It feels like an environment or version issue. Try changing `return np.array(targets)` to `return np.array(targets.cpu())`.

JonathanSamelson commented 3 years ago

Hi, I'm also using a non-COCO dataset, with 11 classes. I tried changing multiple things in the configuration, but I can't make it work.

Could you please list all the changes that need to be made in order to train with a different number of classes?

JonathanSamelson commented 3 years ago

For the record, I eventually got it to work with the following config for YOLOR-W6. It combines the changes I wrote about previously.

``` [net] batch=64 subdivisions=32 width=640 height=640 channels=3 momentum=0.949 decay=0.0005 angle=0 saturation = 1.5 exposure = 1.5 hue=.1 learning_rate=0.00261 burn_in=1000 max_batches = 100000 policy=steps steps=800000,900000 scales=.1,.1 mosaic=1 # ============ Backbone ============ # # Stem # P1 # Downsample # 0 [reorg] [convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=silu # P2 # Downsample [convolutional] batch_normalize=1 filters=128 size=3 stride=2 pad=1 activation=silu # Split [convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=silu [route] layers = -2 [convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=silu # Residual Block [convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear # Transition first # #[convolutional] #batch_normalize=1 #filters=64 #size=1 #stride=1 #pad=1 #activation=silu # Merge [-1, -(3k+3)] [route] layers = -1,-12 # Transition last # 16 (previous+6+3k) [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu # P3 # Downsample [convolutional] batch_normalize=1 filters=256 size=3 stride=2 pad=1 activation=silu # Split [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [route] layers = -2 [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu # Residual Block [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 
activation=silu [convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear # Transition first # #[convolutional] #batch_normalize=1 #filters=128 #size=1 #stride=1 #pad=1 #activation=silu # Merge [-1, -(3k+3)] [route] layers = -1,-24 # Transition last # 43 (previous+6+3k) [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu # P4 # Downsample [convolutional] batch_normalize=1 filters=512 size=3 stride=2 pad=1 activation=silu # Split [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [route] layers = -2 [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu # Residual Block [convolutional] batch_normalize=1 
filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear # Transition first # #[convolutional] #batch_normalize=1 #filters=256 #size=1 #stride=1 #pad=1 #activation=silu # Merge [-1, -(3k+3)] [route] layers = -1,-24 # Transition last # 70 (previous+6+3k) [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu # P5 # Downsample [convolutional] batch_normalize=1 filters=768 size=3 stride=2 pad=1 activation=silu # Split [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [route] layers = -2 [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu # Residual Block 
[convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=384 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=384 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=384 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear # Transition first # #[convolutional] #batch_normalize=1 #filters=384 #size=1 #stride=1 #pad=1 #activation=silu # Merge [-1, -(3k+3)] [route] layers = -1,-12 # Transition last # 85 (previous+6+3k) [convolutional] batch_normalize=1 filters=768 size=1 stride=1 pad=1 activation=silu # P6 # Downsample [convolutional] batch_normalize=1 filters=1024 size=3 stride=2 pad=1 activation=silu # Split [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu [route] layers = -2 [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu # Residual Block [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=silu [shortcut] from=-3 activation=linear # Transition first # #[convolutional] #batch_normalize=1 #filters=512 #size=1 #stride=1 #pad=1 #activation=silu # Merge [-1, -(3k+3)] [route] layers = -1,-12 # Transition 
last # 100 (previous+6+3k) [convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=silu # ============ End of Backbone ============ # # ============ Neck ============ # # CSPSPP [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu [route] layers = -2 [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=512 activation=silu [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu ### SPP ### [maxpool] stride=1 size=5 [route] layers=-2 [maxpool] stride=1 size=9 [route] layers=-4 [maxpool] stride=1 size=13 [route] layers=-1,-3,-5,-6 ### End SPP ### [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=512 activation=silu [route] layers = -1, -13 # 115 (previous+6+5+2k) [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu # End of CSPSPP # FPN-5 [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [upsample] stride=2 [route] layers = 85 [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [route] layers = -1, -3 [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu # Split [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [route] layers = -2 # Plain Block [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=384 activation=silu [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=384 activation=silu [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=384 activation=silu 
# Merge [-1, -(2k+2)] [route] layers = -1, -8 # Transition last # 131 (previous+6+4+2k) [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu # FPN-4 [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [upsample] stride=2 [route] layers = 70 [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [route] layers = -1, -3 [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu # Split [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [route] layers = -2 # Plain Block [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=256 activation=silu [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=256 activation=silu [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=256 activation=silu # Merge [-1, -(2k+2)] [route] layers = -1, -8 # Transition last # 147 (previous+6+4+2k) [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu # FPN-3 [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [upsample] stride=2 [route] layers = 43 [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [route] layers = -1, -3 [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu # Split [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [route] layers = -2 # Plain Block [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=128 activation=silu [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu 
[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=128 activation=silu [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=128 activation=silu # Merge [-1, -(2k+2)] [route] layers = -1, -8 # Transition last # 163 (previous+6+4+2k) [convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=silu # PAN-4 [convolutional] batch_normalize=1 size=3 stride=2 pad=1 filters=256 activation=silu [route] layers = -1, 147 [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu # Split [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [route] layers = -2 # Plain Block [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=256 activation=silu [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=256 activation=silu [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=256 activation=silu [route] layers = -1,-8 # Transition last # 176 (previous+3+4+2k) [convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=silu # PAN-5 [convolutional] batch_normalize=1 size=3 stride=2 pad=1 filters=384 activation=silu [route] layers = -1, 131 [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu # Split [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [route] layers = -2 # Plain Block [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=384 activation=silu [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [convolutional] 
batch_normalize=1 size=3 stride=1 pad=1 filters=384 activation=silu [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=384 activation=silu [route] layers = -1,-8 # Transition last # 189 (previous+3+4+2k) [convolutional] batch_normalize=1 filters=384 size=1 stride=1 pad=1 activation=silu # PAN-6 [convolutional] batch_normalize=1 size=3 stride=2 pad=1 filters=512 activation=silu [route] layers = -1, 115 [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu # Split [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu [route] layers = -2 # Plain Block [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=512 activation=silu [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=512 activation=silu [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=512 activation=silu [route] layers = -1,-8 # Transition last # 202 (previous+3+4+2k) [convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=silu # ============ End of Neck ============ # # 203 [implicit_add] filters=256 # 204 [implicit_add] filters=512 # 205 [implicit_add] filters=768 # 206 [implicit_add] filters=1024 # 207 [implicit_mul] filters=48 # 208 [implicit_mul] filters=48 # 209 [implicit_mul] filters=48 # 210 [implicit_mul] filters=48 # ============ Head ============ # # YOLO-3 [route] layers = 163 [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=256 activation=silu [shift_channels] from=203 [convolutional] size=1 stride=1 pad=1 filters=48 activation=linear [control_channels] from=207 [yolo] mask = 0,1,2 anchors = 19,27, 44,40, 38,94, 96,68, 
86,152, 180,137, 140,301, 303,264, 238,542, 436,615, 739,380, 925,792 classes=11 num=12 jitter=.3 ignore_thresh = .7 truth_thresh = 1 random=1 scale_x_y = 1.05 iou_thresh=0.213 cls_normalizer=1.0 iou_normalizer=0.07 iou_loss=ciou nms_kind=greedynms beta_nms=0.6 # YOLO-4 [route] layers = 176 [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=512 activation=silu [shift_channels] from=204 [convolutional] size=1 stride=1 pad=1 filters=48 activation=linear [control_channels] from=208 [yolo] mask = 3,4,5 anchors = 19,27, 44,40, 38,94, 96,68, 86,152, 180,137, 140,301, 303,264, 238,542, 436,615, 739,380, 925,792 classes=11 num=12 jitter=.3 ignore_thresh = .7 truth_thresh = 1 random=1 scale_x_y = 1.05 iou_thresh=0.213 cls_normalizer=1.0 iou_normalizer=0.07 iou_loss=ciou nms_kind=greedynms beta_nms=0.6 # YOLO-5 [route] layers = 189 [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=768 activation=silu [shift_channels] from=205 [convolutional] size=1 stride=1 pad=1 filters=48 activation=linear [control_channels] from=209 [yolo] mask = 6,7,8 anchors = 19,27, 44,40, 38,94, 96,68, 86,152, 180,137, 140,301, 303,264, 238,542, 436,615, 739,380, 925,792 classes=11 num=12 jitter=.3 ignore_thresh = .7 truth_thresh = 1 random=1 scale_x_y = 1.05 iou_thresh=0.213 cls_normalizer=1.0 iou_normalizer=0.07 iou_loss=ciou nms_kind=greedynms beta_nms=0.6 # YOLO-6 [route] layers = 202 [convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=silu [shift_channels] from=206 [convolutional] size=1 stride=1 pad=1 filters=48 activation=linear [control_channels] from=210 [yolo] mask = 9,10,11 anchors = 19,27, 44,40, 38,94, 96,68, 86,152, 180,137, 140,301, 303,264, 238,542, 436,615, 739,380, 925,792 classes=11 num=12 jitter=.3 ignore_thresh = .7 truth_thresh = 1 random=1 scale_x_y = 1.05 iou_thresh=0.213 cls_normalizer=1.0 iou_normalizer=0.07 iou_loss=ciou nms_kind=greedynms beta_nms=0.6 # ============ End of Head ============ # ```

FXY0117 commented 3 years ago

> It feels like an environment or version issue. Try changing `return np.array(targets)` to `return np.array(targets.cpu())`.

Excuse me, I tried it your way, but another error occurs: `AttributeError: 'list' object has no attribute 'cpu'`
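That happens because `targets` can be a plain Python list of CUDA tensors rather than a single tensor, and lists have no `.cpu()`. One way to handle both cases is a small conversion helper; this is a hedged sketch, and the name `to_numpy` is hypothetical, not part of the repo:

```python
import numpy as np

def to_numpy(targets):
    # `targets` may be a single CUDA tensor or a list/tuple of them;
    # anything exposing .cpu() is copied to host memory before conversion
    if isinstance(targets, (list, tuple)):
        return np.array([to_numpy(t) for t in targets])
    return np.asarray(targets.cpu()) if hasattr(targets, "cpu") else np.asarray(targets)
```

In `utils/plots.py`, `return np.array(targets)` would then become `return to_numpy(targets)`.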

wiekern commented 3 years ago

If you read the code snippet in models/models.py below:

```python
# p.view(bs, 255, 13, 13) -> (bs, 3, 13, 13, 85)  # (bs, anchors, grid, grid, classes + xywh)
p = p.view(bs, self.na, self.no, self.ny, self.nx).permute(0, 1, 3, 4, 2).contiguous()  # prediction
```
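As a shape-only numpy sketch of this view/permute (assuming the COCO defaults `bs=1`, `na=3`, `no=85`, and a 13×13 grid):

```python
import numpy as np

# numpy analogue of the torch view/permute above, shapes only
bs, na, no, ny, nx = 1, 3, 85, 13, 13      # no = 80 classes + xywh + objectness
p = np.zeros((bs, na * no, ny, nx))        # raw conv output: (bs, 255, 13, 13)
p = p.reshape(bs, na, no, ny, nx).transpose(0, 1, 3, 4, 2)
print(p.shape)  # (1, 3, 13, 13, 85)
```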

You will find that a tensor of size (bs, 255, 13, 13) is converted to one of size (bs, 3, 13, 13, 85), i.e. (bs, anchors, grid, grid, classes + xywh). Here 255 decomposes as 3 (number of anchors) × 85 (80 classes + 5), so the key point is to change 255 to the number your task needs. In my case I am doing a single-class task, so 18 (3 anchors × (1 class + 5)) fits. I then made some modifications in the cfg file (search for the keyword "by_wiekern"); 8 modifications in total:


```
[implicit_add]
filters=384

# 205
[implicit_add]
filters=512

# 206
[implicit_add]
filters=640

# 207
[implicit_mul]
#filters=255 by_wiekern
filters=18

# 208
[implicit_mul]
#filters=255 by_wiekern
filters=18

# 209
[implicit_mul]
#filters=255 by_wiekern
filters=18

# 210
[implicit_mul]
#filters=255 by_wiekern
filters=18

# ============ Head ============ #

# YOLO-3

[route]
layers = 163

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=silu

[shift_channels]
from=203

[convolutional]
size=1
stride=1
pad=1
#filters=255 by_wiekern
filters=18
activation=linear

[control_channels]
from=207

[yolo]
mask = 0,1,2
anchors = 19,27,  44,40,  38,94,  96,68,  86,152,  180,137,  140,301,  303,264,  238,542,  436,615,  739,380,  925,792
classes=1
num=12
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
scale_x_y = 1.05
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6

# YOLO-4

[route]
layers = 176

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=384
activation=silu

[shift_channels]
from=204

[convolutional]
size=1
stride=1
pad=1
#filters=255 by_wiekern
filters=18
activation=linear

[control_channels]
from=208

[yolo]
mask = 3,4,5
anchors = 19,27,  44,40,  38,94,  96,68,  86,152,  180,137,  140,301,  303,264,  238,542,  436,615,  739,380,  925,792
classes=1
num=12
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
scale_x_y = 1.05
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6

# YOLO-5

[route]
layers = 189

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=silu

[shift_channels]
from=205

[convolutional]
size=1
stride=1
pad=1
#filters=255 by_wiekern
filters=18
activation=linear

[control_channels]
from=209

[yolo]
mask = 6,7,8
anchors = 19,27,  44,40,  38,94,  96,68,  86,152,  180,137,  140,301,  303,264,  238,542,  436,615,  739,380,  925,792
classes=1
num=12
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
scale_x_y = 1.05
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6

# YOLO-6

[route]
layers = 202

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=640
activation=silu

[shift_channels]
from=206

[convolutional]
size=1
stride=1
pad=1
#filters=255 by_wiekern
filters=18
activation=linear

[control_channels]
from=210

[yolo]
mask = 9,10,11
anchors = 19,27,  44,40,  38,94,  96,68,  86,152,  180,137,  140,301,  303,264,  238,542,  436,615,  739,380,  925,792
classes=1
num=12
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
scale_x_y = 1.05
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6

# ============ End of Head ============ #
```

@JonathanSamelson this may also guide you in making it work.

athulvingt commented 3 years ago

> anchors = 19,27, 44,40, 38,94, 96,68, 86,152, 180,137, 140,301, 303,264, 238,542, 436,615, 739,380, 925,792

Your image size is 640, but these anchors are for an image size of 1280, or some image size close to 1000. YOLOR finds anchor boxes automatically, but are you still able to get proper detections when you initialize the anchors with these values?
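If one did want anchors proportional to a 640-px input, a simple rescaling sketch (assuming, as above, that the listed anchors were tuned for 1280 px) might look like:

```python
# hedged sketch: rescale anchors tuned for a 1280-px input down to 640 px;
# whether this helps depends on how the anchors were derived for your dataset
anchors_1280 = [(19, 27), (44, 40), (38, 94), (96, 68), (86, 152), (180, 137),
                (140, 301), (303, 264), (238, 542), (436, 615), (739, 380), (925, 792)]
scale = 640 / 1280
anchors_640 = [(round(w * scale), round(h * scale)) for w, h in anchors_1280]
print(anchors_640[:3])  # [(10, 14), (22, 20), (19, 47)]
```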