meituan / YOLOv6

YOLOv6: a single-stage object detection framework dedicated to industrial applications.
GNU General Public License v3.0

Problem while changing image size input at training (--img 300) #467

Closed jorgeili15 closed 2 years ago

jorgeili15 commented 2 years ago

I keep getting this error when I try to change the default image size (640) for training the model in train.py:


```
ERROR in training steps.
ERROR in training loop or eval/save model.
Traceback (most recent call last):
  File "tools/train.py", line 126, in <module>
    main(args)
  File "tools/train.py", line 116, in main
    trainer.train()
  File "C:\Users\a2588\Desktop\facepig\YOLO-faces\YOLOv6\yolov6\core\engine.py", line 99, in train
    self.train_in_loop(self.epoch)
  File "C:\Users\a2588\Desktop\facepig\YOLO-faces\YOLOv6\yolov6\core\engine.py", line 113, in train_in_loop
    self.train_in_steps(epoch_num)
  File "C:\Users\a2588\Desktop\facepig\YOLO-faces\YOLOv6\yolov6\core\engine.py", line 134, in train_in_steps
    preds, s_featmaps = self.model(images)
  File "C:\Users\a2588\Desktop\facepig\YOLO-faces\YOLOv6\yolo-venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\a2588\Desktop\facepig\YOLO-faces\YOLOv6\yolov6\models\yolo.py", line 40, in forward
    x = self.neck(x)
  File "C:\Users\a2588\Desktop\facepig\YOLO-faces\YOLOv6\yolo-venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\a2588\Desktop\facepig\YOLO-faces\YOLOv6\yolov6\models\reppan.py", line 108, in forward
    f_concat_layer0 = torch.cat([upsample_feat0, x1], 1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 20 but got size 19 for tensor number 1 in the list.
```

Does anyone know what this could be related to?

Chilicyy commented 2 years ago

Hi @jorgeili15, maybe you can try 320 for the image size. It's better to keep the image size a multiple of 32.
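For reference, a small helper that rounds an arbitrary input size up to the next multiple of 32 (which maps 300 to the suggested 320). This is just an illustrative sketch; `make_divisible` is a hypothetical name, not part of the YOLOv6 API:

```python
import math

def make_divisible(size: int, divisor: int = 32) -> int:
    """Round size up to the next multiple of divisor."""
    return int(math.ceil(size / divisor) * divisor)

print(make_divisible(300))  # -> 320
print(make_divisible(640))  # -> 640 (already a multiple of 32)
```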

jorgeili15 commented 2 years ago

Hi @Chilicyy, yeah, that worked! Thank you so much. I'm working with 300x300 images, which is why I chose that size. Why does it only work with multiples of 64 or 32? It would be useful to know in case I want to try other sizes.

Chilicyy commented 2 years ago

@jorgeili15 There are some upsample and downsample operations with stride 2, and the spatial dimensions of the output feature maps depend on them. If the input size is not a multiple of the total stride, the downsampled feature maps no longer line up with the 2x-upsampled ones when they are concatenated in the neck.
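This can be traced by hand with pure arithmetic. Assuming each downsampling stage is a stride-2 convolution with "same"-style padding (so it maps size s to ceil(s/2)), the sketch below follows the spatial sizes through five stages and then checks the `torch.cat` in reppan.py, where the deepest map is 2x-upsampled and concatenated with the previous one. This is an illustration of the size mismatch, not YOLOv6 code:

```python
def feature_sizes(img_size: int, stages: int = 5) -> list[int]:
    """Spatial size after each stride-2 stage, assuming s -> ceil(s/2)."""
    sizes = []
    s = img_size
    for _ in range(stages):
        s = (s + 1) // 2  # ceil division by 2
        sizes.append(s)
    return sizes

for img in (640, 300):
    sizes = feature_sizes(img)
    upsampled = sizes[-1] * 2  # 2x upsample of the deepest feature map
    skip = sizes[-2]           # lateral map it is concatenated with
    ok = "ok" if upsampled == skip else f"MISMATCH ({upsampled} vs {skip})"
    print(f"{img}: pyramid {sizes}, cat check: {ok}")
```

With 640 the pyramid is [320, 160, 80, 40, 20] and the upsampled 20-map matches the 40-map, so the concat works. With 300 the pyramid is [150, 75, 38, 19, 10]: upsampling 10 gives 20, but the lateral map is 19, reproducing the "Expected size 20 but got size 19" error above.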