model.load_state_dict(saved_state_dict) error

isht7 / pytorch-deeplab-resnet

DeepLab resnet v2 model in pytorch

MIT License

602 stars, 117 forks

model.load_state_dict(saved_state_dict) error #13

Closed lianshushu closed 7 years ago

lianshushu commented 7 years ago

Excuse me, when I run 'python train.py', the following error occurs:

  File "train.py", line 222, in <module>
    model.load_state_dict(saved_state_dict)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 331, in load_state_dict
    .format(name))
KeyError: 'unexpected key "Scale.conv1.weight" in state_dict'

I am using the COCO-pretrained model 'MS_DeepLab_resnet_pretrained_COCO_init.pth' to fine-tune on VOC. I hope for a response, thank you!

isht7 commented 7 years ago

Hi, when did you clone the repo and when did you download the file 'MS_DeepLab_resnet_pretrained_COCO_init.pth'?

isht7 commented 7 years ago

I suspect that your repo is not updated and/or the .pth file you are using is old. Both the repo and the .pth files were changed some time ago. I recommend that you pull the repo and download the .pth file again. Get back to me if the error persists after doing the above.
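A quick way to confirm this kind of mismatch is to compare the key names in the checkpoint with the key names the model expects. The snippet below is only a minimal sketch: `diff_state_dict_keys` is a hypothetical helper, and every key name except "Scale.conv1.weight" (taken from the traceback above) is made up for illustration. With PyTorch, the two inputs would typically be `model.state_dict().keys()` and `torch.load(path).keys()`, assuming the .pth file stores a plain state dict.

```python
def diff_state_dict_keys(model_keys, saved_keys):
    """Return (unexpected, missing) key names.

    unexpected: keys present in the checkpoint but not in the model
    missing:    keys the model expects but the checkpoint lacks
    """
    model_keys = set(model_keys)
    saved_keys = set(saved_keys)
    unexpected = sorted(saved_keys - model_keys)
    missing = sorted(model_keys - saved_keys)
    return unexpected, missing


# Hypothetical key names, for illustration only. Only
# "Scale.conv1.weight" comes from the error message in this thread.
model_keys = ["Scale.hypothetical_conv1.weight", "Scale.bn1.weight"]
saved_keys = ["Scale.conv1.weight", "Scale.bn1.weight"]

unexpected, missing = diff_state_dict_keys(model_keys, saved_keys)
print("unexpected:", unexpected)
print("missing:", missing)
```

If both lists are non-empty, the code and the checkpoint most likely come from different versions, which matches the suggestion to pull the repo and download the file again.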

lianshushu commented 7 years ago

Thank you so much. I found that my code was an old version, and I have solved that problem. Now it seems that the label size does not match the size of the last feature map produced by the model:

RuntimeError: input and target batch or spatial sizes don't match: target [1 x 19 x 19], input [1 x 21 x 20 x 20] at /b/wheel/pytorch-src/torch/lib/THCUNN/generic/SpatialClassNLLCriterion.cu:24

I am working on this problem now. Have you ever run into this kind of problem?

lianshushu commented 7 years ago

Hi, it seems that your input image and label are resized to a random size:

a = outS(321*scale)#41
b = outS(321*0.5*scale)#21

This might cause the width and height of the last feature map and of the label to differ.
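To see how a one-pixel mismatch such as 19 vs. 20 can arise, it helps to compute the expected output size for a given input size and compare it with the label size before the loss is called. The sketch below is an assumption, not the repo's code: `out_size` is a hypothetical stand-in for `outS`, written so that it reproduces the 41 and 21 values noted in the comments of the snippet above, and `check_sizes` is a hypothetical helper.

```python
import math


def out_size(i):
    """Hypothetical stand-in for outS: three successive roughly-halving
    steps with the rounding chosen to reproduce 321 -> 41 and 160 -> 21.
    The repo's actual definition may differ."""
    j = int(i)                    # truncate, e.g. 160.5 -> 160
    j = (j + 1) // 2              # first halving, integer division
    j = math.ceil((j + 1) / 2)    # second halving, rounded up
    j = (j + 1) // 2              # third halving, integer division
    return j


def check_sizes(logits_hw, label_hw):
    """Raise early, with a readable message, if spatial sizes differ."""
    if tuple(logits_hw) != tuple(label_hw):
        raise ValueError(
            "label size %s does not match output size %s"
            % (tuple(label_hw), tuple(logits_hw))
        )


print(out_size(321))        # 41
print(out_size(321 * 0.5))  # 21

# The sizes from the RuntimeError in this thread would be rejected here:
try:
    check_sizes((20, 20), (19, 19))
except ValueError as err:
    print(err)
```

Because each step rounds, resizing the label with a different rounding rule than the one the network effectively applies can leave the two sizes off by one for some random scales.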

isht7 commented 7 years ago

Hi, I have tested this, and my repo works properly even with the random scale. How many labels does your data have? By default there are 21 labels, but you can change that using the argument --NoLabels. Also, what is the size of your input images? Try resizing them to the shape (321,321,3) and then check again. If nothing else works, you could disable scale augmentation by setting scale = 1.

lianshushu commented 7 years ago

I disabled scale augmentation by setting scale = 1 and it works, but I did not reach your performance when training on the PASCAL VOC train data. I will check my training method and get back to you. Thank you so much for your help!

isht7 commented 7 years ago

How much are you getting? Some drop (about 1%) will be due to disabling scale augmentation.

lianshushu commented 7 years ago

Hi, I got 67.5% with random scale augmentation. I have trained 3 times already. There is no CRF processing in your code, right?

isht7 commented 7 years ago

No, there is no CRF, but you should be able to get ~72.4% accuracy on the validation set when using evalpyt2.py. You first said that you were not able to use the random scale augmentation. What did you change so that it worked for you? That might give me a hint about the reason for your lower performance.