Status: Closed (nicolasugrinovic closed this 4 years ago)
Can you show me the full error traceback? Also did you install our version of mmcv? We have made some modifications from the original in mmdetection.
Thanks for the quick reply!
I installed mmcv exactly as described in your README. Here is the full traceback:
unexpected key in source state_dict: fc.weight, fc.bias
missing keys in source state_dict: layer4.0.bn2.num_batches_tracked, layer3.4.bn1.num_batches_tracked, layer4.0.downsample.1.num_batches_tracked, layer1.0.downsample.1.num_batches_tracked, layer3.2.bn1.num_batches_tracked, layer1.2.bn3.num_batches_tracked, layer2.3.bn1.num_batches_tracked, layer2.0.bn2.num_batches_tracked, layer3.5.bn3.num_batches_tracked, layer3.0.downsample.1.num_batches_tracked, layer2.2.bn1.num_batches_tracked, layer1.0.bn1.num_batches_tracked, layer4.1.bn3.num_batches_tracked, layer1.1.bn2.num_batches_tracked, layer1.1.bn3.num_batches_tracked, layer3.0.bn1.num_batches_tracked, layer2.1.bn2.num_batches_tracked, layer2.1.bn3.num_batches_tracked, layer4.1.bn2.num_batches_tracked, layer2.0.bn1.num_batches_tracked, layer3.3.bn3.num_batches_tracked, layer4.0.bn3.num_batches_tracked, layer3.4.bn3.num_batches_tracked, layer2.2.bn2.num_batches_tracked, layer2.0.downsample.1.num_batches_tracked, layer1.2.bn2.num_batches_tracked, layer2.3.bn2.num_batches_tracked, layer4.1.bn1.num_batches_tracked, layer3.3.bn2.num_batches_tracked, layer3.3.bn1.num_batches_tracked, layer3.1.bn3.num_batches_tracked, layer3.0.bn3.num_batches_tracked, layer3.1.bn2.num_batches_tracked, layer3.4.bn2.num_batches_tracked, layer2.1.bn1.num_batches_tracked, layer4.2.bn1.num_batches_tracked, layer1.2.bn1.num_batches_tracked, layer3.2.bn2.num_batches_tracked, layer1.1.bn1.num_batches_tracked, layer4.2.bn3.num_batches_tracked, layer1.0.bn3.num_batches_tracked, layer3.1.bn1.num_batches_tracked, layer3.0.bn2.num_batches_tracked, layer1.0.bn2.num_batches_tracked, layer2.3.bn3.num_batches_tracked, layer2.0.bn3.num_batches_tracked, layer3.5.bn2.num_batches_tracked, layer4.0.bn1.num_batches_tracked, layer2.2.bn3.num_batches_tracked, layer3.5.bn1.num_batches_tracked, layer3.2.bn3.num_batches_tracked, bn1.num_batches_tracked, layer4.2.bn2.num_batches_tracked
2020-06-16 13:45:31,001 - INFO - load checkpoint from data/checkpoint.pt
2020-06-16 13:45:31,247 - WARNING - missing keys in source state_dict: smpl_head.smpl.v_template, smpl_head.loss.smpl.J_regressor_extra, smpl_head.loss.smpl.parents, smpl_head.smpl.posedirs, smpl_head.smpl.vertex_joint_selector.extra_joints_idxs, smpl_head.smpl.lbs_weights, smpl_head.smpl.J_regressor_extra, smpl_head.loss.smpl.vertex_joint_selector.extra_joints_idxs, smpl_head.loss.smpl.J_regressor, smpl_head.smpl.parents, smpl_head.smpl.J_regressor, smpl_head.loss.smpl.shapedirs, smpl_head.loss.smpl.posedirs, smpl_head.smpl.shapedirs, smpl_head.loss.smpl.faces_tensor, smpl_head.loss.smpl.v_template, smpl_head.loss.smpl.lbs_weights, smpl_head.smpl.faces_tensor
Traceback (most recent call last):
File "tools/demo.py", line 190, in <module>
main()
File "tools/demo.py", line 149, in main
runner.resume(cfg.resume_from)
File "/home/nugrinovic/miniconda3/envs/multiperson/lib/python3.7/site-packages/mmcv/runner/runner.py", line 313, in resume
self.optimizer.load_state_dict(checkpoint['optimizer'])
File "/home/nugrinovic/miniconda3/envs/multiperson/lib/python3.7/site-packages/torch/optim/optimizer.py", line 115, in load_state_dict
raise ValueError("loaded state dict contains a parameter group "
ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group
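For context on the ValueError above: PyTorch raises it when a parameter group saved in the checkpoint holds a different number of parameters than the matching group in the live optimizer. The sketch below reproduces that comparison on plain dictionaries so it runs without PyTorch. The helper name `mismatched_groups` and the sample data are hypothetical; with a real setup you would presumably pass `optimizer.param_groups` and `checkpoint['optimizer']['param_groups']` instead.

```python
def mismatched_groups(current_groups, saved_groups):
    """Return the indices of parameter groups whose sizes differ."""
    return [
        i
        for i, (cur, saved) in enumerate(zip(current_groups, saved_groups))
        if len(cur["params"]) != len(saved["params"])
    ]


# Stand-in data (assumption): the live optimizer tracks three tensors,
# while the checkpoint only saved two for the same group.
current = [{"params": [0, 1, 2]}]
saved = [{"params": [0, 1]}]

print(mismatched_groups(current, saved))
```

A non-empty result means the optimizer state in the checkpoint was saved for a different set of trainable parameters than the one being resumed.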
Your traceback shows that line 313 in mmcv/runner/runner.py differs from the version in our repo.
In our version, the optimizer state is loaded at line 350: https://github.com/JiangWenPL/multiperson/blob/c79c0f82f5273dbbf7bf6612c10527323cdab07b/mmcv/mmcv/runner/runner.py#L350
I suspect that a different version of mmcv was installed at some point. We ran into this issue before and explicitly wrapped the optimizer state loading in a try/except block.
A quick solution would be to run
rm -rf /home/nugrinovic/miniconda3/envs/multiperson/lib/python3.7/site-packages/mmcv*
and then reinstall mmcv. You might have to reinstall mmdetection as well if you do that.
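Before deleting anything, it may help to confirm which copy of mmcv Python actually picks up. A small sketch using only the standard library follows; the helper name `module_origin` is made up for this example.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Prints the path of the mmcv package that would be imported,
# or None if no mmcv is visible in the current environment.
print(module_origin("mmcv"))
```

If the printed path points into site-packages rather than the mmcv directory shipped with the repo, a stale install is likely shadowing the modified version.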
That was exactly it! Somehow, I had another version installed. Now it is working, thanks!
Hi, kudos for the great work! Thank you for sharing the code.
When trying to run demo.py with the command
python3 tools/demo.py --config=configs/smpl/tune.py --image_folder=demo_images/ --output_folder=results/ --ckpt data/checkpoint.pt
I get the following error:
It seems to be triggered by something with the checkpoint and the model architecture. I also get the following message: