Why can't yolox_tiny load the pretrained model correctly?

qunyuanchen commented 3 years ago

When I used this repo on MegStudio and tried to train yolox_tiny with the pretrained model, an error occurred. The detailed log is as follows.

2021-09-15 13:11:11 | INFO | yolox.core.trainer:247 - loading checkpoint for fine tuning 2021-09-15 13:11:11 | ERROR | main:93 - An error has been caught in function '', process 'MainProcess' (359), thread 'MainThread' (139974572922688): Traceback (most recent call last):

File "tools/train.py", line 93, in main(exp, args) │ │ └ Namespace(batch_size=16, ckpt='yolox_tiny.pkl', devices=1, exp_file='exps/default/yolox_tiny.py', experiment_name='yolox_tiny... │ └ ╒══════════════════╤═════════════════════════════════════════════════════════════════════════════════════════════════════════... └ <function main at 0x7f4e5d7308c0>

File "tools/train.py", line 73, in main trainer.train() │ └ <function Trainer.train at 0x7f4dec68b680> └ <yolox.core.trainer.Trainer object at 0x7f4d9a68a7d0>

File "/home/megstudio/workspace/YOLOX/yolox/core/trainer.py", line 46, in train self.before_train() │ └ <function Trainer.before_train at 0x7f4d9a6f55f0> └ <yolox.core.trainer.Trainer object at 0x7f4d9a68a7d0>

File "/home/megstudio/workspace/YOLOX/yolox/core/trainer.py", line 107, in before_train model = self.resume_train(model) │ │ └ YOLOX( │ │ (backbone): YOLOPAFPN( │ │ (backbone): CSPDarknet( │ │ (stem): Focus( │ │ (conv): BaseConv( │ │ (conv): ... │ └ <function Trainer.resume_train at 0x7f4d9a70c0e0> └ <yolox.core.trainer.Trainer object at 0x7f4d9a68a7d0>

File "/home/megstudio/workspace/YOLOX/yolox/core/trainer.py", line 249, in resume_train ckpt = mge.load(ckpt_file, map_location="cpu")["model"] │ │ └ 'yolox_tiny.pkl' │ └ <function load at 0x7f4df6c46680> └ <module 'megengine' from '/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/init.py'>

KeyError: 'model'

qunyuanchen commented 3 years ago

I loaded the pkl file and printed it, and found that the pkl OrderedDict has no key called "model". So I tried deleting the ["model"] index on line 249 of trainer.py, but the log then showed that the shapes do not match. For example, in the pkl OrderedDict the shape of backbone.C3_n3.conv1.bn.bias is (1, 96, 1, 1), while in the yolox_tiny model it is (96,).

Detail log: 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_n3.conv1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.C3_n3.conv1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_n3.conv2.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.C3_n3.conv2.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_n3.conv3.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.C3_n3.conv3.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_n3.m.0.conv1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.C3_n3.m.0.conv1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_n3.m.0.conv2.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.C3_n3.m.0.conv2.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_n4.conv1.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.C3_n4.conv1.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_n4.conv2.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.C3_n4.conv2.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_n4.conv3.bn.bias in checkpoint is (1, 384, 1, 1), while shape of backbone.C3_n4.conv3.bn.bias in model is (384,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_n4.m.0.conv1.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.C3_n4.m.0.conv1.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_n4.m.0.conv2.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.C3_n4.m.0.conv2.bn.bias in model is (192,). 
2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_p3.conv1.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.C3_p3.conv1.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_p3.conv2.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.C3_p3.conv2.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_p3.conv3.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.C3_p3.conv3.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_p3.m.0.conv1.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.C3_p3.m.0.conv1.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_p3.m.0.conv2.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.C3_p3.m.0.conv2.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_p4.conv1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.C3_p4.conv1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_p4.conv2.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.C3_p4.conv2.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_p4.conv3.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.C3_p4.conv3.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_p4.m.0.conv1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.C3_p4.m.0.conv1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.C3_p4.m.0.conv2.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.C3_p4.m.0.conv2.bn.bias in model is (96,). 
2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark2.0.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.backbone.dark2.0.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark2.1.conv1.bn.bias in checkpoint is (1, 24, 1, 1), while shape of backbone.backbone.dark2.1.conv1.bn.bias in model is (24,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark2.1.conv2.bn.bias in checkpoint is (1, 24, 1, 1), while shape of backbone.backbone.dark2.1.conv2.bn.bias in model is (24,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark2.1.conv3.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.backbone.dark2.1.conv3.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark2.1.m.0.conv1.bn.bias in checkpoint is (1, 24, 1, 1), while shape of backbone.backbone.dark2.1.m.0.conv1.bn.bias in model is (24,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark2.1.m.0.conv2.bn.bias in checkpoint is (1, 24, 1, 1), while shape of backbone.backbone.dark2.1.m.0.conv2.bn.bias in model is (24,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark3.0.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.backbone.dark3.0.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark3.1.conv1.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.backbone.dark3.1.conv1.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark3.1.conv2.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.backbone.dark3.1.conv2.bn.bias in model is (48,). 
2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark3.1.conv3.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.backbone.dark3.1.conv3.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark3.1.m.0.conv1.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.backbone.dark3.1.m.0.conv1.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark3.1.m.0.conv2.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.backbone.dark3.1.m.0.conv2.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark3.1.m.1.conv1.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.backbone.dark3.1.m.1.conv1.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark3.1.m.1.conv2.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.backbone.dark3.1.m.1.conv2.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark3.1.m.2.conv1.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.backbone.dark3.1.m.2.conv1.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark3.1.m.2.conv2.bn.bias in checkpoint is (1, 48, 1, 1), while shape of backbone.backbone.dark3.1.m.2.conv2.bn.bias in model is (48,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark4.0.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.backbone.dark4.0.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark4.1.conv1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.backbone.dark4.1.conv1.bn.bias in model is (96,). 
2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark4.1.conv2.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.backbone.dark4.1.conv2.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark4.1.conv3.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.backbone.dark4.1.conv3.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark4.1.m.0.conv1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.backbone.dark4.1.m.0.conv1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark4.1.m.0.conv2.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.backbone.dark4.1.m.0.conv2.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark4.1.m.1.conv1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.backbone.dark4.1.m.1.conv1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark4.1.m.1.conv2.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.backbone.dark4.1.m.1.conv2.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark4.1.m.2.conv1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.backbone.dark4.1.m.2.conv1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark4.1.m.2.conv2.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.backbone.dark4.1.m.2.conv2.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark5.0.bn.bias in checkpoint is (1, 384, 1, 1), while shape of backbone.backbone.dark5.0.bn.bias in model is (384,). 
2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark5.1.conv1.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.backbone.dark5.1.conv1.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark5.1.conv2.bn.bias in checkpoint is (1, 384, 1, 1), while shape of backbone.backbone.dark5.1.conv2.bn.bias in model is (384,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark5.2.conv1.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.backbone.dark5.2.conv1.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark5.2.conv2.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.backbone.dark5.2.conv2.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark5.2.conv3.bn.bias in checkpoint is (1, 384, 1, 1), while shape of backbone.backbone.dark5.2.conv3.bn.bias in model is (384,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark5.2.m.0.conv1.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.backbone.dark5.2.m.0.conv1.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.dark5.2.m.0.conv2.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.backbone.dark5.2.m.0.conv2.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.stem.conv.bn.bias in checkpoint is (1, 24, 1, 1), while shape of backbone.backbone.stem.conv.bn.bias in model is (24,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.bu_conv1.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.bu_conv1.bn.bias in model is (192,). 
2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.bu_conv2.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.bu_conv2.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.lateral_conv0.bn.bias in checkpoint is (1, 192, 1, 1), while shape of backbone.lateral_conv0.bn.bias in model is (192,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.reduce_conv1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of backbone.reduce_conv1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:18 - head.grids.0 is not in the ckpt. Please double check and see if this is desired. 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_convs.0.0.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.cls_convs.0.0.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_convs.0.1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.cls_convs.0.1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_convs.1.0.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.cls_convs.1.0.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_convs.1.1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.cls_convs.1.1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_convs.2.0.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.cls_convs.2.0.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_convs.2.1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.cls_convs.2.1.bn.bias in model is (96,). 
2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.0.bias in checkpoint is (1, 80, 1, 1), while shape of head.cls_preds.0.bias in model is (1, 5, 1, 1). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.0.weight in checkpoint is (80, 96, 1, 1), while shape of head.cls_preds.0.weight in model is (5, 96, 1, 1). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.1.bias in checkpoint is (1, 80, 1, 1), while shape of head.cls_preds.1.bias in model is (1, 5, 1, 1). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.1.weight in checkpoint is (80, 96, 1, 1), while shape of head.cls_preds.1.weight in model is (5, 96, 1, 1). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.2.bias in checkpoint is (1, 80, 1, 1), while shape of head.cls_preds.2.bias in model is (1, 5, 1, 1). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.2.weight in checkpoint is (80, 96, 1, 1), while shape of head.cls_preds.2.weight in model is (5, 96, 1, 1). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.reg_convs.0.0.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.reg_convs.0.0.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.reg_convs.0.1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.reg_convs.0.1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.reg_convs.1.0.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.reg_convs.1.0.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.reg_convs.1.1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.reg_convs.1.1.bn.bias in model is (96,). 
2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.reg_convs.2.0.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.reg_convs.2.0.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.reg_convs.2.1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.reg_convs.2.1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.stems.0.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.stems.0.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.stems.1.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.stems.1.bn.bias in model is (96,). 2021-09-15 13:50:42 | WARNING | yolox.utils.checkpoint:26 - Shape of head.stems.2.bn.bias in checkpoint is (1, 96, 1, 1), while shape of head.stems.2.bn.bias in model is (96,). 2021-09-15 13:50:42 | ERROR | main:93 - An error has been caught in function '', process 'MainProcess' (344), thread 'MainThread' (140352338609984): Traceback (most recent call last):

File "tools/train.py", line 93, in main(exp, args) │ │ └ Namespace(batch_size=16, ckpt='yolox_tiny.pkl', devices=1, exp_file='exps/default/yolox_tiny.py', experiment_name='yolox_tiny... │ └ ╒══════════════════╤═════════════════════════════════════════════════════════════════════════════════════════════════════════... └ <function main at 0x7fa65209f8c0>

File "tools/train.py", line 73, in main trainer.train() │ └ <function Trainer.train at 0x7fa5e0f53680> └ <yolox.core.trainer.Trainer object at 0x7fa58efff550>

File "/home/megstudio/workspace/YOLOX/yolox/core/trainer.py", line 46, in train self.before_train() │ └ <function Trainer.before_train at 0x7fa58f0655f0> └ <yolox.core.trainer.Trainer object at 0x7fa58efff550>

File "/home/megstudio/workspace/YOLOX/yolox/core/trainer.py", line 107, in before_train model = self.resume_train(model) │ │ └ YOLOX( │ │ (backbone): YOLOPAFPN( │ │ (backbone): CSPDarknet( │ │ (stem): Focus( │ │ (conv): BaseConv( │ │ (conv): ... │ └ <function Trainer.resume_train at 0x7fa58f07b0e0> └ <yolox.core.trainer.Trainer object at 0x7fa58efff550>

File "/home/megstudio/workspace/YOLOX/yolox/core/trainer.py", line 250, in resume_train model = load_ckpt(model, ckpt) │ │ └ OrderedDict([('backbone.backbone.stem.conv.conv.weight', array([[[[-8.23454990e-04, 9.05852169e-02, 2.45094169e-02], │ │ ... │ └ YOLOX( │ (backbone): YOLOPAFPN( │ (backbone): CSPDarknet( │ (stem): Focus( │ (conv): BaseConv( │ (conv): ... └ <function load_ckpt at 0x7fa5e0fe0680>

File "/home/megstudio/workspace/YOLOX/yolox/utils/checkpoint.py", line 33, in load_ckpt load_dict.pop("head.grids.{}".format(i)) │ │ └ 0 │ └ <method 'pop' of 'dict' objects> └ {'backbone.C3_n3.conv1.bn.running_mean': array([-0.05796095, 0.02735543, -0.09699834, 0.00755394, -0.05644868, -0.14...

KeyError: 'head.grids.0'
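
The traceback above fails inside `load_dict.pop("head.grids.{}".format(i))` because `dict.pop` raises `KeyError` when the key is absent and no default is given. Below is a minimal sketch of a tolerant version. The helper name `drop_grid_buffers` and the default of three head levels are assumptions made for illustration (the log only shows `head.cls_preds.0` through `head.cls_preds.2`); this is not the repository's actual fix.

```python
def drop_grid_buffers(load_dict, num_levels=3):
    """Remove 'head.grids.<i>' entries if present and ignore missing ones.

    `num_levels` is an assumed value for this sketch, not taken from the repo.
    """
    for i in range(num_levels):
        # Passing a default to pop() means an absent key does not raise KeyError.
        load_dict.pop("head.grids.{}".format(i), None)
    return load_dict


# Toy state dict: only one grid entry exists, the others are absent.
state = {"head.stems.0.bn.bias": [0.0], "head.grids.1": [0]}
drop_grid_buffers(state)
print(sorted(state))
```

Whether silently skipping the missing grid entries is the desired behavior depends on how the head rebuilds them at run time, so treat this as a workaround to check rather than a confirmed solution.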

cmFighting commented 3 years ago

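Two points:

The code needs to be changed where the pretrained model is loaded: just load the checkpoint directly; there is no need to extract the model through the ['model'] key.
When transferring the weights, the model seems not to match the checkpoint in a few places, so those tensors need to be adjusted with reshape.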
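
The two points above can be sketched roughly as follows. This is a framework-free illustration using NumPy arrays in place of the real checkpoint tensors. The helper names `unwrap_state_dict` and `reshape_to_model`, and the `model_shapes` mapping, are hypothetical and not part of YOLOX or MegEngine. The shapes in the example are copied from the warnings in the log.

```python
import numpy as np


def unwrap_state_dict(ckpt):
    """Return ckpt["model"] when that key exists, otherwise the object itself."""
    if isinstance(ckpt, dict) and "model" in ckpt:
        return ckpt["model"]
    return ckpt


def reshape_to_model(ckpt_state, model_shapes):
    """Reshape checkpoint arrays to the model's shapes when element counts agree.

    Returns (adapted, skipped): arrays ready to load, and the names that are
    missing from the checkpoint or cannot be reshaped (different sizes).
    """
    adapted, skipped = {}, []
    for name, target_shape in model_shapes.items():
        if name not in ckpt_state:
            skipped.append(name)
            continue
        value = np.asarray(ckpt_state[name])
        if value.shape == tuple(target_shape):
            adapted[name] = value
        elif value.size == int(np.prod(target_shape)):
            # e.g. (1, 96, 1, 1) -> (96,): same data, different layout
            adapted[name] = value.reshape(target_shape)
        else:
            # e.g. 80-class head vs 5-class head: cannot be reused as-is
            skipped.append(name)
    return adapted, skipped


ckpt = {
    "backbone.C3_n3.conv1.bn.bias": np.zeros((1, 96, 1, 1)),
    "head.cls_preds.0.weight": np.zeros((80, 96, 1, 1)),
}
model_shapes = {
    "backbone.C3_n3.conv1.bn.bias": (96,),
    "head.cls_preds.0.weight": (5, 96, 1, 1),
}
adapted, skipped = reshape_to_model(unwrap_state_dict(ckpt), model_shapes)
print(adapted["backbone.C3_n3.conv1.bn.bias"].shape, skipped)
```

The classification head entries are skipped rather than reshaped because the checkpoint was trained with 80 classes and the model here has 5, so those layers would keep their fresh initialization.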

cmFighting commented 3 years ago

However, even after these changes, training from the pretrained weights is not very stable.

File "/home/megstudio/workspace/yolox/YOLOX-main-meg/yolox/models/yolo_head.py", line 318, in get_losses loss_iou = (iou_loss(bbox_preds.reshape(-1, 4)[fg_masks], reg_targets)).sum() / num_fg │ │ │ │ │ └ 1 │ │ │ │ └ │ │ │ └ Tensor([False False False ... False False False], dtype=bool, device=xpux:0) │ │ └ <function ArrayMethodMixin.reshape at 0x7f20d8932c20> │ └ Tensor([[[ 4.1161 15.4116 9.1175 30.3238] │ [ 6.1525 15.1813 11.6859 30.4651] │ [ 19.3735 13.2771 39.6102 26.9062... └ <function iou_loss at 0x7f20600179e0>

File "/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/tensor/array_method.py", line 356, in getitem return _getitem(self, index) │ │ └ Tensor([False False False ... False False False], dtype=bool, device=xpux:0) │ └ Tensor([[ 4.1161 15.4116 9.1175 30.3238] │ [ 6.1525 15.1813 11.6859 30.4651] │ [ 19.3735 13.2771 39.6102 26.9062] │ ... └ <function getitem at 0x7f20d892c8c0> File "/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/tensor/indexing.py", line 217, in getitem if v.shape is None: │ └ <property object at 0x7f20d88d7110> └ Tensor([], dtype=int32, device=xpux:0) File "/home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/tensor.py", line 104, in shape shape = super().shape

RuntimeError: assertion `dev_tensor_valid()' failed at ../../../../../../src/core/impl/graph/var_node.cpp:209: const DeviceTensorND& mgb::cg::VarNode::dev_tensor() const

backtrace: /home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/lib/libmegengine_export.so(_ZN3mgb13MegBrainErrorC1ERKSs+0x4a) [0x7f20e83ac65a] /home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/lib/libmegengine_export.so(_ZN3mgb15__assert_fail__EPKciS1_S1_S1_z+0x10f) [0x7f20e83abbdf] /home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/lib/libmegengine_export.so(_ZNK3mgb2cg7VarNode10dev_tensorEv+0x8c) [0x7f20e84486ac] /home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/lib/libmegengine_export.so(_ZN3mgb3opr6Concat14scn_do_executeEv+0x32) [0x7f20e868f4d2] /home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/lib/libmegengine_export.so(+0x286a720) [0x7f20e841f720] /home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/lib/libmegengine_export.so(_ZN3mgb2cg5mixin20SingleCNOperatorNode16mixin_do_executeERNS0_16OperatorNodeBaseERNS0_15GraphExecutable7ExecEnvE+0xc3) [0x7f20e841eae3] /home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/lib/libmegengine_export.so(_ZN3mgb2cg16OperatorNodeBase7executeERNS0_15GraphExecutable7ExecEnvE+0x30) [0x7f20e8423bd0] /home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x210d60) [0x7f213c1d6d60] /home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x225cf5) [0x7f213c1ebcf5] /home/megstudio/.miniconda/envs/xuan/lib/python3.7/site-packages/megengine/core/_imperative_rt.cpython-37m-x86_64-linux-gnu.so(+0x22611e) [0x7f213c1ec11e]

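After repeated testing, it turns out that with this version of MegEngine the batch size cannot be set to 2; with a batch size of 4 this error has not appeared so far. It is mostly my own problem, since I only have a single 1080 /(ㄒoㄒ)/~~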

MegEngine / YOLOX

Why can't yolox_tiny load the pretrained model correctly? #5