How to train a 2X model

xinntao / EDVR

Winning Solution in NTIRE19 Challenges on Video Restoration and Enhancement (CVPR19 Workshops) - Video Restoration with Enhanced Deformable Convolutional Networks. EDVR has been merged into BasicSR and this repo is a mirror of BasicSR.

https://github.com/xinntao/BasicSR

How to train a 2X model #130

Open leonselina opened 4 years ago

leonselina commented 4 years ago

I tried to train a 2x model by changing the "scale" factor to 2, but it did not work. What's wrong? Thanks :)

TouqeerAhmad commented 4 years ago

You have to edit the EDVR_arch.py file towards the end -- if you have a look at code lines 303 onwards, you can see there are two PixelShuffle layers being used -- each of which enhances the resolution by 2x in each dimension. You have to comment out one of these, specifically the second one.

You probably need to adjust the number of features for the subsequent conv layers too and the x_center should be interpolated by scale 2 instead of 4.

branimir29 commented 4 years ago

This doesn't seem to work. If I comment out the second PixelShuffle and interpolate x_center by scale 2, I get this error: Given groups=1, weight of size 64 64 3 3, expected input[1, 128, 816, 1024] to have 64 channels, but got 128 channels instead

Any idea?

ryul99 commented 4 years ago

@branimir29 PixelShuffle maps a tensor of shape (B, scale * scale * C, H, W) to (B, C, scale * H, scale * W). So if you just comment out the PixelShuffle, the tensor still has scale * scale * C channels instead of C. You need to reduce the channel count, for example with a Conv layer. This is my code: https://github.com/ryul99/mmsr/commit/91b2912268f13fb9d8ce7cae2a9e037f3e06f669 https://github.com/ryul99/mmsr/commit/add1da64b28b989fc74d57f50addd11879f6e6bb https://github.com/ryul99/mmsr/commit/d77cbd94a29185a61018d2a771a30bcd8e22087b
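
The channel mismatch reported above can be reproduced with plain shape bookkeeping. The following is a minimal sketch, not the code from the linked commits: it assumes nf=128 and a 64-channel HRconv (both inferred from the error message in this thread), and the channel-reducing conv in `head_2x_fixed` is a hypothetical stand-in for whatever layer the commits actually add.

```python
# Shape bookkeeping only: no tensors are created and no framework is needed.
# ASSUMPTIONS: nf=128 and the 64-channel HRconv are inferred from the error
# message quoted in this thread; the "reduce" conv is a hypothetical new layer.

def conv_shape(shape, in_ch, out_ch):
    """Shape after a 3x3, stride-1, padding-1 conv (spatial size unchanged)."""
    b, c, h, w = shape
    if c != in_ch:
        raise ValueError(f"expected {in_ch} channels, but got {c} channels instead")
    return (b, out_ch, h, w)


def pixel_shuffle_shape(shape, r):
    """(B, C*r*r, H, W) -> (B, C, H*r, W*r)."""
    b, c, h, w = shape
    if c % (r * r) != 0:
        raise ValueError(f"{c} channels not divisible by {r * r}")
    return (b, c // (r * r), h * r, w * r)


def head_4x(shape, nf=128):
    shape = pixel_shuffle_shape(conv_shape(shape, nf, nf * 4), 2)  # upconv1 + shuffle
    shape = pixel_shuffle_shape(conv_shape(shape, nf, 64 * 4), 2)  # upconv2 + shuffle
    shape = conv_shape(shape, 64, 64)                              # HRconv
    return conv_shape(shape, 64, 3)                                # conv_last


def head_2x_broken(shape, nf=128):
    shape = pixel_shuffle_shape(conv_shape(shape, nf, nf * 4), 2)  # upconv1 + shuffle
    # Second upconv + shuffle removed: the tensor still has nf channels here.
    shape = conv_shape(shape, 64, 64)                              # HRconv raises
    return conv_shape(shape, 64, 3)


def head_2x_fixed(shape, nf=128):
    shape = pixel_shuffle_shape(conv_shape(shape, nf, nf * 4), 2)  # upconv1 + shuffle
    shape = conv_shape(shape, nf, 64)                              # hypothetical "reduce" conv
    shape = conv_shape(shape, 64, 64)                              # HRconv
    return conv_shape(shape, 64, 3)                                # conv_last


lr = (1, 128, 408, 512)   # feature map entering the upsampling head
print(head_4x(lr))        # (1, 3, 1632, 2048)
print(head_2x_fixed(lr))  # (1, 3, 816, 1024)
try:
    head_2x_broken(lr)
except ValueError as err:
    print("broken 2x head:", err)
```

With one shuffle removed, the output is 816 × 1024, so the interpolated x_center that gets added to it must also be upsampled by 2 rather than 4 for the shapes to agree.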

besit commented 4 years ago

When I do this, I get:

Traceback (most recent call last):
  File "_my_test_Vid4_REDS4_woGT3.py", line 216, in <module>
    main()
  File "_my_test_Vid4_REDS4_woGT3.py", line 112, in main
    model.load_state_dict(torch.load(model_path), strict=True)
  File "/home/besit/anaconda3/envs/mmsr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 829, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for EDVR:
    Missing key(s) in state_dict: "upconv3.weight", "upconv3.bias".

I use EDVR from the mmsr repository. The only difference is that it has no .yml settings file; everything goes through test.py and EDVR_arch.py.

Do you have any idea how I can fix this? Thanks.

besit commented 4 years ago

Now it seems to me that I need to train a new model for 2x upscaling, and I can't just use the pretrained model provided.

Is that right?

ryul99 commented 4 years ago

@besit Yes, you are right. This is because the pretrained checkpoint has no weights for the newly added Conv layer.
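
To see why strict loading fails, it can help to compare the parameter names the modified model expects with the names stored in the checkpoint. This is a rough sketch in plain Python: the key lists are illustrative assumptions (only `upconv3.weight` and `upconv3.bias` come from the traceback above), and `diff_state_dict_keys` is a hypothetical helper, not part of PyTorch.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring what strict loading checks."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # model needs them, checkpoint lacks them
    unexpected = sorted(checkpoint_keys - model_keys)   # checkpoint has them, model does not
    return missing, unexpected


# Illustrative subset of parameter names in a pretrained 4x checkpoint (assumed).
checkpoint_4x = [
    "upconv1.weight", "upconv1.bias",
    "upconv2.weight", "upconv2.bias",
    "HRconv.weight", "HRconv.bias",
    "conv_last.weight", "conv_last.bias",
]

# The modified 2x model declares one extra layer, as in the traceback.
model_2x = checkpoint_4x + ["upconv3.weight", "upconv3.bias"]

missing, unexpected = diff_state_dict_keys(model_2x, checkpoint_4x)
print(missing)     # ['upconv3.bias', 'upconv3.weight']
print(unexpected)  # []
```

In PyTorch, `load_state_dict(..., strict=False)` skips the missing keys and reports them, but the new layer would then keep its random initialization, so the 2x model still has to be trained or fine-tuned before its output is meaningful.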

besit commented 4 years ago

@ryul99 Thanks, that helped. I set up the training and realized it would take about a month on my two GTX 1080 GPUs, so I might as well stick with the pretrained 4x models after all. It was a nice experience though.

LiAdReam commented 4 years ago

Hello, do you know how to train a deblur model in two stages? Xintao said that during the second-stage training, the inputs of the second stage are the outputs of the first stage, but I don't know what that means.

I have my own dataset and the REDS dataset. How can I mix these two datasets? I have two models: one trained on my dataset and the other on the deblur part of REDS. Passing my images through the first model and then the second gives the result I need, but I would like the two models to be one. Can two-stage training do this?

ckkelvinchan commented 4 years ago

@LiAdReam I do not get your question. Could you explain it a bit?

About the second stage, you can think of it as appending another (smaller) network to the original EDVR and freezing the weights of the original EDVR. If you prefer to generate the images with the first-stage model before training the second-stage model, you can modify the dataset code to load those images (but this takes up storage space).
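
For the offline variant (saving the first-stage outputs to disk and loading them as second-stage inputs), the dataset change amounts to pairing each saved output with its ground-truth frame. A minimal sketch follows; the function name `build_stage2_pairs` and the mirrored directory layout are assumptions for illustration, not the repository's actual dataset code.

```python
from pathlib import Path


def build_stage2_pairs(stage1_dir, gt_dir, suffix=".png"):
    """Pair each first-stage output with the ground-truth frame at the same relative path.

    ASSUMPTION: both folders share one layout, e.g. <clip>/<frame>.png.
    """
    stage1_dir, gt_dir = Path(stage1_dir), Path(gt_dir)
    pairs = []
    for out_path in sorted(stage1_dir.rglob(f"*{suffix}")):
        gt_path = gt_dir / out_path.relative_to(stage1_dir)
        if not gt_path.is_file():
            raise FileNotFoundError(f"no ground truth for {out_path}")
        # (second-stage input, second-stage target)
        pairs.append((out_path, gt_path))
    return pairs


# Hypothetical usage:
# pairs = build_stage2_pairs("results/stage1", "datasets/REDS/train_sharp")
```

The second-stage dataset would then read the first element of each pair as its input and the second as its target. The alternative described above avoids the extra storage by running the frozen first-stage model on the fly.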

LiAdReam commented 4 years ago

1. I tested images of size 1920×1080, but something went wrong. Xintao said the width and height of the input image should be multiples of 4, and 1920 and 1080 are both multiples of 4, so I don't understand the failure.

2. Do you know how to train a two-stage model? Xintao said that during the second-stage training, the inputs of the second stage are the outputs of the first stage. Does that mean I train once to get a model, run the low-quality images of my dataset through it to get new images, and then train a second time using those new images as the low-quality inputs and the original high-quality images as the targets? I am confused.

3. The reason I want a two-stage model is that I have two datasets: my own and REDS. I trained on my dataset to get model A and on REDS to get model B. Running my test images through model A improves their quality, but some blur remains. Passing the result through model B then gives the image I want. I hope to combine model A and model B into one, which is why I am trying two-stage training. I don't know whether this works. Can you suggest an approach?

ryul99 commented 4 years ago

@LiAdReam 1: For the deblur model, the input image size should be a multiple of 16. Please see https://github.com/xinntao/EDVR/issues/48#issuecomment-531836951. 2, 3: As far as I know, a two-stage model is just two EDVR models concatenated and trained on one dataset. But I'm not sure, because I haven't used a two-stage model.
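
Applied to the 1920×1080 case, the multiple-of-16 rule explains the failure: 1920 is divisible by 16, but 1080 is not. A small sketch of how much padding would be needed (the helper name `padding_for_multiple` is made up for illustration):

```python
def padding_for_multiple(height, width, multiple=16):
    """Rows and columns to add so that both sides become multiples of `multiple`."""
    pad_h = (-height) % multiple
    pad_w = (-width) % multiple
    return pad_h, pad_w


pad_h, pad_w = padding_for_multiple(1080, 1920)
print(pad_h, pad_w)                # 8 0
print(1080 + pad_h, 1920 + pad_w)  # 1088 1920
```

One option, assuming the input is padded rather than cropped, is to pad the frame to 1920×1088 before inference and crop the extra 8 rows from the output afterwards.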

LiAdReam commented 4 years ago

Thank you for your reply! So nice!