Closed by un-knight 6 years ago
@un-knight @yjxiong When I run the training code, I get the following error:
"/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 721, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BNInception:
While copying the parameter named "conv1_7x7_s2_bn.running_var",
whose dimensions in the model are torch.Size([64]) and
whose dimensions in the checkpoint are torch.Size([1, 64]).
My PyTorch environment is:
>>> torch.__version__
'0.4.0'
>>> torchvision.__version__
'0.2.1'
Any suggestions on solving this? Thanks.
@RyanCV Downgrading PyTorch from 0.4.0 to 0.3.1 solves the issue; it worked for me!
Reference: https://github.com/Cadene/tensorflow-model-zoo.torch/issues/8
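If downgrading is not an option, one possible workaround is to reshape the mismatched checkpoint tensors before loading them. The error above says the checkpoint stores `conv1_7x7_s2_bn.running_var` as `torch.Size([1, 64])` while the model expects `torch.Size([64])`, so the data is the same and only a singleton dimension differs. The following is a minimal sketch, not something confirmed in this thread: the helper name `reshape_checkpoint_to_model` is hypothetical, and it assumes every mismatched entry has the same number of elements as its counterpart in the model.

```python
def reshape_checkpoint_to_model(model_state, checkpoint_state):
    """Return a copy of checkpoint_state with tensors reshaped to the model's shapes.

    Hypothetical helper: only entries whose element count matches the model's
    entry but whose shape differs are touched, e.g. [1, 64] in the checkpoint
    versus [64] in the model. Everything else is passed through unchanged.
    """
    fixed = {}
    for name, tensor in checkpoint_state.items():
        target = model_state.get(name)
        if (target is not None
                and tuple(tensor.shape) != tuple(target.shape)
                and tensor.numel() == target.numel()):
            # Same data, different layout: drop (or add) singleton dimensions.
            tensor = tensor.reshape(tuple(target.shape))
        fixed[name] = tensor
    return fixed


# Assumed usage; `model` and `checkpoint_path` are placeholders:
#   import torch
#   state = torch.load(checkpoint_path)
#   model.load_state_dict(reshape_checkpoint_to_model(model.state_dict(), state))
```

Entries with a different number of elements are left alone on purpose, so a real architecture mismatch would still raise the original error instead of being hidden.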
Thanks for your great work! But when I train the TSN flow model on my own dataset (about 25,000 training examples), the training and test losses stop decreasing once they reach about 1.8. After that, both stay at about 1.8, even though I have tried lowering the learning rate and training for longer.
My training settings are the same as those described in "readme.md".
I don't know why the training loss gets stuck at 1.8, and top-1 accuracy on the training set is only about 60%.
Are there any other methods I can try to fix the problem? Would Adam be more efficient than SGD?