xinntao / EDVR

Winning Solution in NTIRE19 Challenges on Video Restoration and Enhancement (CVPR19 Workshops) - Video Restoration with Enhanced Deformable Convolutional Networks. EDVR has been merged into BasicSR and this repo is a mirror of BasicSR.
https://github.com/xinntao/BasicSR

How about this warning information? #62

Closed CaiQiuYu closed 5 years ago

CaiQiuYu commented 5 years ago

19-07-14 21:35:36.666 - WARNING: Offset mean is 9624.6220703125, larger than 100.
19-07-14 21:35:36.669 - WARNING: Offset mean is 65085.6953125, larger than 100.
19-07-14 21:35:36.674 - WARNING: Offset mean is 189.6161651611328, larger than 100.
19-07-14 21:35:36.675 - WARNING: Offset mean is 9624.8037109375, larger than 100.
19-07-14 21:35:36.678 - WARNING: Offset mean is 65086.70703125, larger than 100.
19-07-14 21:35:39.024 - WARNING: Offset mean is 168.29945373535156, larger than 100.
19-07-14 21:35:39.026 - WARNING: Offset mean is 8571.626953125, larger than 100.
19-07-14 21:35:39.029 - WARNING: Offset mean is 57954.5703125, larger than 100.
19-07-14 21:35:39.034 - WARNING: Offset mean is 168.32696533203125, larger than 100.
19-07-14 21:35:39.036 - WARNING: Offset mean is 8572.4208984375, larger than 100.
19-07-14 21:35:39.039 - WARNING: Offset mean is 57959.84375, larger than 100.
19-07-14 21:35:39.044 - WARNING: Offset mean is 168.1299285888672, larger than 100.
19-07-14 21:35:39.045 - WARNING: Offset mean is 8562.78125, larger than 100.
19-07-14 21:35:39.048 - WARNING: Offset mean is 57894.6640625, larger than 100.
19-07-14 21:35:39.053 - WARNING: Offset mean is 167.73036193847656, larger than 100.
19-07-14 21:35:39.054 - WARNING: Offset mean is 8543.248046875, larger than 100.
19-07-14 21:35:39.057 - WARNING: Offset mean is 57762.5, larger than 100.
19-07-14 21:35:39.062 - WARNING: Offset mean is 167.62863159179688, larger than 100.
19-07-14 21:35:39.063 - WARNING: Offset mean is 8538.310546875, larger than 100.
19-07-14 21:35:39.066 - WARNING: Offset mean is 57729.140625, larger than 100.
19-07-14 21:35:39.377 - WARNING: Offset mean is 177.97457885742188, larger than 100.
19-07-14 21:35:39.378 - WARNING: Offset mean is 9058.482421875, larger than 100.
19-07-14 21:35:39.381 - WARNING: Offset mean is 61246.58984375, larger than 100.
19-07-14 21:35:39.386 - WARNING: Offset mean is 178.12855529785156, larger than 100.
19-07-14 21:35:39.387 - WARNING: Offset mean is 9065.642578125, larger than 100.
19-07-14 21:35:39.390 - WARNING: Offset mean is 61294.6796875, larger than 100.
19-07-14 21:35:39.395 - WARNING: Offset mean is 177.98928833007812, larger than 100.
19-07-14 21:35:39.396 - WARNING: Offset mean is 9058.904296875, larger than 100.
19-07-14 21:35:39.399 - WARNING: Offset mean is 61249.17578125, larger than 100.
19-07-14 21:35:39.404 - WARNING: Offset mean is 178.05380249023438, larger than 100.
19-07-14 21:35:39.405 - WARNING: Offset mean is 9062.0673828125, larger than 100.
19-07-14 21:35:39.408 - WARNING: Offset mean is 61270.58203125, larger than 100.
19-07-14 21:35:39.412 - WARNING: Offset mean is 178.0281524658203, larger than 100.
19-07-14 21:35:39.413 - WARNING: Offset mean is 9060.7783203125, larger than 100.
19-07-14 21:35:39.416 - WARNING: Offset mean is 61261.84765625, larger than 100.

Does anyone else have the same problem? It seems like the training is still running.

liuchongwei commented 5 years ago

You can refer to #22 and #16.

CaiQiuYu commented 5 years ago

@liuchongwei Thanks for your help!

xinntao commented 5 years ago

When the offset mean is larger than 100, the learned offsets are meaningless. You can stop the training and resume from the latest checkpoint that did not produce this warning.

We have found that EDVR training is sometimes unstable and produces unreasonable offsets.
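
To make the warning concrete, here is a minimal sketch of the kind of check that could produce it. It is not the EDVR code itself: the helper names `mean_abs_offset` and `check_offsets` are hypothetical, the offsets are passed as a plain flat sequence rather than a tensor, and taking the mean of absolute values is an assumption. Only the threshold of 100 and the message wording come from the log above.

```python
import logging

logger = logging.getLogger("base")  # logger name is an assumption

OFFSET_MEAN_LIMIT = 100.0  # threshold quoted in the warning text


def mean_abs_offset(offsets):
    """Return the mean absolute value of a flat sequence of offsets."""
    values = [abs(float(v)) for v in offsets]
    if not values:
        raise ValueError("offsets must not be empty")
    return sum(values) / len(values)


def check_offsets(offsets, limit=OFFSET_MEAN_LIMIT):
    """Log a warning and return False when the mean offset exceeds the limit."""
    mean = mean_abs_offset(offsets)
    if mean > limit:
        logger.warning("Offset mean is %s, larger than %s.", mean, limit)
        return False
    return True
```

A training loop could call `check_offsets` after each forward pass and stop early on the first `False`, instead of continuing with offsets that are no longer meaningful.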

CaiQiuYu commented 5 years ago

@xinntao Thanks for your reply! I will try your solution.

taily-khucnaykhongquantrong commented 5 years ago

@xinntao I only started getting this warning at iteration 50k. I tried resuming from the 48k checkpoint, but the warnings come back at iteration 50k. Is there any way to solve this? Thanks in advance :D

@CaiQiuYu Did you solve this by that solution?

xinntao commented 5 years ago

@young666 Make sure that the 48k model does not produce any warnings.

For me, I have trained several models (EDVR-M) with eight GPUs, and I rarely encounter this error with the provided training strategy. What's your configuration?
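
One way to check whether a run was clean up to a given checkpoint is to scan the training log for the warning. The following is a rough sketch that assumes the log format shown earlier in this thread (timestamp, ` - WARNING: `, then the message). The function names are hypothetical, and the log does not record iteration numbers in these lines, so the timestamp still has to be matched against the checkpoint save times by hand.

```python
import re

# Matches lines such as:
# 19-07-14 21:35:36.666 - WARNING: Offset mean is 9624.6220703125, larger than 100.
WARNING_RE = re.compile(
    r"(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) - WARNING: "
    r"Offset mean is ([0-9.]+), larger than 100\."
)


def find_offset_warnings(log_text):
    """Return (timestamp, offset_mean) pairs for every offset warning in the text."""
    return [(ts, float(value)) for ts, value in WARNING_RE.findall(log_text)]


def first_warning_time(log_text):
    """Return the timestamp of the first offset warning in file order, or None."""
    warnings = find_offset_warnings(log_text)
    return warnings[0][0] if warnings else None
```

If `first_warning_time` returns `None` for the portion of the log written before a checkpoint was saved, that checkpoint is a reasonable candidate to resume from.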

taily-khucnaykhongquantrong commented 5 years ago

@xinntao Here is my configuration:

#### network structures
network_G:
  which_model_G: EDVR
  nf: 128
  nframes: 5
  groups: 8
  front_RBs: 5
  back_RBs: 40
  predeblur: true
  HR_in: false
  w_TSA: true

#### path
path:
  pretrain_model_G: ../experiments/pretrained_models/EDVR_REDS_SRblur_L.pth

Yes, the 48k model does not produce any warnings. I am using a single GPU, a 1080 Ti.

xinntao commented 5 years ago

I see. The large model is more likely to hit this error.

From the config file, you are finetuning the model from EDVR_REDS_SRblur_L. It is common practice to use a smaller learning rate when finetuning, such as 2e-4 or 1e-4.