xinntao / ESRGAN

ECCV18 Workshops - Enhanced SRGAN. Champion PIRM Challenge on Perceptual Super-Resolution. The training codes are in BasicSR.
https://github.com/xinntao/BasicSR
Apache License 2.0

Errors in loading state_dict for RRDBNet #54

Open shiqi1994 opened 5 years ago

shiqi1994 commented 5 years ago

Here is part of the error message:

19-06-06 16:34:38.242 - INFO: Loading pretrained model for G [/home/sqtang/.config/spyder-py3/ESRGAN/experiments/pretrained_models/RRDB_ESRGAN_x4.pth] ...
Traceback (most recent call last):
  File "train.py", line 173, in <module>
    main()
  File "train.py", line 83, in main
    model = create_model(opt)
  File "/home/sqtang/.config/spyder-py3/ESRGAN/codes/models/__init__.py", line 18, in create_model
    m = M(opt)
  File "/home/sqtang/.config/spyder-py3/ESRGAN/codes/models/SRRaGAN_model.py", line 26, in __init__
    self.load() # load G and D if needed
  File "/home/sqtang/.config/spyder-py3/ESRGAN/codes/models/SRRaGAN_model.py", line 243, in load
    self.load_network(load_path_G, self.netG)
  File "/home/sqtang/.config/spyder-py3/ESRGAN/codes/models/base_model.py", line 63, in load_network
    network.load_state_dict(torch.load(load_path), strict=strict)
  File "/home/sqtang/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for RRDBNet:
Missing key(s) in state_dict: "model.0.weight", "model.0.bias", "model.1.sub.0.RDB1.conv1.0.weight", "model.1.sub.0.RDB1.conv1.0.bias", "model.1.sub.0.RDB1.conv2.0.weight", "model.1.sub.0.RDB1.conv2.0.bias", "model.1.sub.0.RDB1.conv3.0.weight", "model.1.sub.0.RDB1.conv3.0.bias", "model.1.sub.0.RDB1.conv4.0.weight", "model.1.sub.0.RDB1.conv4.0.bias", "model.1.sub.0.RDB1.conv5.0.weight", "model.1.sub.0.RDB1.conv5.0.bias", "model.1.sub.0.RDB2.conv1.0.weight", "model.1.sub.0.RDB2.conv1.0.bias", "model.1.sub.0.RDB2.conv2.0.weight", "model.1.sub.0.RDB2.conv2.0.bias", "model.1.sub.0.RDB2.conv3.0.weight", "model.1.sub.0.RDB2.conv3.0.bias", "model.1.sub.0.RDB2.conv4.0.weight", "model.1.sub.0.RDB2.conv4.0.bias", "model.1.sub.0.RDB2.conv5.0.weight", "model.1.sub.0.RDB2.conv5.0.bias", "model.1.sub.0.RDB3.conv1.0.weight", "model.1.sub.0.RDB3.conv1.0.bias", "model.1.sub.0.RDB3.conv2.0.weight", "model.1.sub.0.RDB3.conv2.0.bias",
"model.1.sub.0.RDB3.conv3.0.weight", "model.1.sub.0.RDB3.conv3.0.bias", "model.1.sub.0.RDB3.conv4.0.weight", "model.1.sub.0.RDB3.conv4.0.bias", "model.1.sub.0.RDB3.conv5.0.weight", "model.1.sub.0.RDB3.conv5.0.bias", "model.1.sub.1.RDB1.conv1.0.weight", "model.1.sub.1.RDB1.conv1.0.bias", "model.1.sub.1.RDB1.conv2.0.weight", "model.1.sub.1.RDB1.conv2.0.bias", "model.1.sub.1.RDB1.conv3.0.weight", "model.1.sub.1.RDB1.conv3.0.bias", "model.1.sub.1.RDB1.conv4.0.weight", "model.1.sub.1.RDB1.conv4.0.bias", "model.1.sub.1.RDB1.conv5.0.weight", "model.1.sub.1.RDB1.conv5.0.bias", "model.1.sub.1.RDB2.conv1.0.weight", "model.1.sub.1.RDB2.conv1.0.bias", "model.1.sub.1.RDB2.conv2.0.weight", "model.1.sub.1.RDB2.conv2.0.bias", "model.1.sub.1.RDB2.conv3.0.weight", "model.1.sub.1.RDB2.conv3.0.bias", "model.1.sub.1.RDB2.conv4.0.weight", "model.1.sub.1.RDB2.conv4.0.bias", "model.1.sub.1.RDB2.conv5.0.weight", "model.1.sub.1.RDB2.conv5.0.bias", "model.1.sub.1.RDB3.conv1.0.weight", "model.1.sub.1.RDB3.conv1.0.bias", "model.1.sub.1.RDB3.conv2.0.weight", "model.1.sub.1.RDB3.conv2.0.bias", "model.1.sub.1.RDB3.conv3.0.weight", "model.1.sub.1.RDB3.conv3.0.bias", "model.1.sub.1.RDB3.conv4.0.weight", "model.1.sub.1.RDB3.conv4.0.bias", "model.1.sub.1.RDB3.conv5.0.weight", "model.1.sub.1.RDB3.conv5.0.bias", "model.1.sub.2.RDB1.conv1.0.weight", "model.1.sub.2.RDB1.conv1.0.bias", "model.1.sub.2.RDB1.conv2.0.weight", "model.1.sub.2.RDB1.conv2.0.bias", "model.1.sub.2.RDB1.conv3.0.weight", "model.1.sub.2.RDB1.conv3.0.bias", "model.1.sub.2.RDB1.conv4.0.weight", "model.1.sub.2.RDB1.conv4.0.bias", "model.1.sub.2.RDB1.conv5.0.weight", "model.1.sub.2.RDB1.conv5.0.bias", "model.1.sub.2.RDB2.conv1.0.weight", "model.1.sub.2.RDB2.conv1.0.bias", "model.1.sub.2.RDB2.conv2.0.weight", "model.1.sub.2.RDB2.conv2.0.bias", "model.1.sub.2.RDB2.conv3.0.weight", "model.1.sub.2.RDB2.conv3.0.bias", "model.1.sub.2.RDB2.conv4.0.weight", "model.1.sub.2.RDB2.conv4.0.bias", "model.1.sub.2.RDB2.conv5.0.weight", 
"model.1.sub.2.RDB2.conv5.0.bias", "model.1.sub.2.RDB3.conv1.0.weight", "model.1.sub.2.RDB3.conv1.0.bias", "model.1.sub.2.RDB3.conv2.0.weight", "model.1.sub.2.RDB3.conv2.0.bias", "model.1.sub.2.RDB3.conv3.0.weight", "model.1.sub.2.RDB3.conv3.0.bias", "model.1.sub.2.RDB3.conv4.0.weight", "model.1.sub.2.RDB3.conv4.0.bias", "model.1.sub.2.RDB3.conv5.0.weight", "model.1.sub.2.RDB3.conv5.0.bias", "model.1.sub.3.RDB1.conv1.0.weight", "model.1.sub.3.RDB1.conv1.0.bias", "model.1.sub.3.RDB1.conv2.0.weight", "model.1.sub.3.RDB1.conv2.0.bias", "model.1.sub.3.RDB1.conv3.0.weight", "model.1.sub.3.RDB1.conv3.0.bias", "model.1.sub.3.RDB1.conv4.0.weight", "model.1.sub.3.RDB1.conv4.0.bias", "model.1.sub.3.RDB1.conv5.0.weight", "model.1.sub.3.RDB1.conv5.0.bias",

I am trying to train the x4 network, and I followed the instruction: 'Prepare the PSNR-oriented pretrained model. You can use the RRDB_PSNR_x4.pth as the pretrained model.' But something seems to go wrong when loading the pretrained model G. Your early feedback will be appreciated! Thank you very much!

xinntao commented 5 years ago

Did you use the code from the current ESRGAN repo? We have updated the network structures in this ESRGAN repo, but the BasicSR repo still uses the old ones.

You may need to use the model with the suffix _old_arch.pth
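To check which architecture a checkpoint matches before training, it can help to compare the parameter names stored in the file with the names the network expects. The following is only a minimal sketch: `diff_state_dict_keys` is a hypothetical helper, not part of ESRGAN or BasicSR, and the checkpoint key names shown are illustrative stand-ins rather than names read from a real file.

```python
def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Return (missing, unexpected) key lists.

    missing    -- names the network expects but the checkpoint lacks
    unexpected -- names the checkpoint holds but the network does not define
    """
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Names the old-architecture network expects (taken from the error message).
model_keys = [
    "model.0.weight",
    "model.0.bias",
    "model.1.sub.0.RDB1.conv1.0.weight",
]

# Hypothetical names from a checkpoint saved with a different architecture.
# With PyTorch installed, one would pass torch.load(path).keys() and
# network.state_dict().keys() instead of these hand-written lists.
checkpoint_keys = ["conv_first.weight", "conv_first.bias"]

missing, unexpected = diff_state_dict_keys(checkpoint_keys, model_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If every expected name is reported as missing, as in the traceback above, the checkpoint and the network code most likely come from different architecture versions, and the file with the _old_arch.pth suffix (or matching code) is the thing to try.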

shiqi1994 commented 5 years ago

Thank you very much! The problem was solved by loading the model with the suffix '_old_arch.pth'. May I ask how long it takes to train the model? It seems to take at least 4 minutes to run 200 iterations.

xinntao commented 5 years ago

Yes, it depends on your GPU type, but training is very slow. It may take about 3 to 4 days or even longer.
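As a rough cross-check, the rate reported above (about 4 minutes per 200 iterations) can be converted into iterations per day. This is only a back-of-the-envelope sketch: it assumes the rate stays constant, the function names are hypothetical, and the thread does not state the total number of training iterations, so the code only shows how many iterations would fit into 3 or 4 days at that speed.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def iterations_per_day(iters_measured, minutes_measured):
    """Extrapolate a measured rate (iterations in some minutes) to one day."""
    return iters_measured * SECONDS_PER_DAY / (minutes_measured * 60)


def days_for(total_iters, iters_measured, minutes_measured):
    """Days needed for total_iters at the measured rate."""
    return total_iters / iterations_per_day(iters_measured, minutes_measured)


# Rate from the thread: 200 iterations in about 4 minutes.
rate = iterations_per_day(200, 4)
print("iterations per day:", rate)
print("iterations in 3 days:", 3 * rate)
print("iterations in 4 days:", 4 * rate)
```

At this rate one iteration takes about 1.2 seconds, so 3 to 4 days corresponds to roughly 216,000 to 288,000 iterations. A faster or slower GPU scales the estimate proportionally.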

shiqi1994 commented 5 years ago

Thank you very much!

AbdelsalamHaa commented 4 years ago

Hi @xinntao, I'm facing the same error even though I used the _old_arch.pth model. Any idea?