I trained bts (PyTorch) on NYU for a few epochs, but during testing I'm getting the following error while loading the model:
Traceback (most recent call last):
File "bts_test.py", line 221, in <module>
test(args)
File "bts_test.py", line 94, in test
model.load_state_dict(checkpoint['model'])
File "/nfs/interns/kharshit/miniconda3/envs/pylatest/lib/python3.7/site-packages/torch/nn/modules/module.py", line 845, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
Missing key(s) in state_dict: "module.encoder.base_model.denseblock3.denselayer25.norm1.weight", "module.encoder.base_model.denseblock3.denselayer25.norm1.bias", "module.encoder.base_model.denseblock3.denselayer25.norm1.running_mean", "module.encoder.base_model.denseblock3.denselayer25.norm1.running_var", "module.encoder.base_model.denseblock3.denselayer25.conv1.weight", "module.encoder.base_model.denseblock3.denselayer25.norm2.weight", "module.encoder.base_model.denseblock3.denselayer25.norm2.bias",
...
size mismatch for module.encoder.base_model.conv0.weight: copying a param with shape torch.Size([64, 3, 7, 7]) from checkpoint, the shape in current model is torch.Size([96, 3, 7, 7]).
size mismatch for module.encoder.base_model.norm0.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for module.encoder.base_model.norm0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([96]).
...
The only thing I changed was to remove --multiprocessing_distributed from the arguments_train_nyu.txt file, add --gpu 0, and run the script with CUDA_VISIBLE_DEVICES=0 so it trains on a single GPU.
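For reference, this is roughly how I inspected the saved checkpoint to see which parameter shapes it actually contains (a minimal sketch; the checkpoint path is a placeholder for my real file):

```python
import torch

# Load the checkpoint on CPU and pull out the saved state_dict
# ('model_checkpoint' is a placeholder path).
checkpoint = torch.load('model_checkpoint', map_location='cpu')
state_dict = checkpoint['model']

# Print the shapes of the first encoder layers, which is where the
# size mismatch (64 vs 96 channels) is reported.
for name, param in state_dict.items():
    if 'conv0' in name or 'norm0' in name:
        print(name, tuple(param.shape))
```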