Open yxiao009 opened 5 years ago
Did you retrain your model? In that case, download the newest version; the previous version of my code had some errors. Also check whether the pre-trained model exists or not.
Yes, I tried both with and without the pre-trained model. With the pre-trained model, it gives me nan loss at the very beginning; when retraining from scratch, it gives me nan loss after a few iterations, before the first epoch finishes. I also printed the outputs of model[0]: the first few iterations return values, but after about 5 iterations it starts to return nan. Please help~
Hi, did you solve the problem? I have the same issue.
@wl082013 Nope... But it seems nan occurs from model[0] and model[3] after several iterations, so I'm going to look into the details in model[0] (same structure as model[3]).
Hi! Did you solve this problem? I don't get nan loss, but every time I start training I get a very large loss, e.g. 1.3e14. How can I fix this? Could you please give some advice? Thank you!
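A huge early loss like 1.3e14 often blows up the first few gradient steps; besides lowering the learning rate, clipping the global gradient norm is a common fix. A minimal pure-Python sketch of the idea (in PyTorch the equivalent is `torch.nn.utils.clip_grad_norm_`; the function and names below are illustrative, not from this repo):

```python
import math

def clip_grad_norm(grads, max_norm):
    """Rescale flat gradient lists so their global L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [[g * scale for g in vec] for vec in grads]
    return grads, total_norm

# a gradient with global norm 5 gets rescaled to norm 1
clipped, norm = clip_grad_norm([[3.0, 4.0]], max_norm=1.0)
```

This keeps a single enormous loss from producing an update large enough to push the weights into nan territory on the next step.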
@huaixu16 Were you able to solve the problem?
I figured it out: change `args.res_scale` to 0.1; by default it is 4.0, which causes nan values to appear after going through the ResNet part of EDSR.
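To see why a large `args.res_scale` destabilizes training: each EDSR residual block computes roughly `x + res_scale * body(x)`, so any gain in the conv branch compounds across the 32 blocks. A toy sketch, where the conv branch is modeled as a fixed gain of 2 (purely illustrative, not the real network):

```python
def run_residual_blocks(res_scale, n_blocks=32, gain=2.0):
    """Toy stand-in for EDSR's stack of residual blocks.

    The conv branch is modeled as multiplying its input by `gain`, so each
    block scales activations by a factor of (1 + res_scale * gain).
    """
    x = 1.0
    for _ in range(n_blocks):
        x = x + res_scale * (gain * x)  # block output: x + res_scale * branch(x)
    return x
```

With `res_scale = 4.0` the toy activations grow by ~9x per block, reaching ~1e30 after 32 blocks; with realistic activation magnitudes that easily overflows float32, and the resulting inf values turn into nan in the loss. With `res_scale = 0.1` the per-block growth is only 1.2x and activations stay bounded.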
> I figured it out: change `args.res_scale` to 0.1; by default it is 4.0, which causes nan values to appear after going through the ResNet part of EDSR.

4.0 is the super-resolution scale. In the original paper, "Enhanced Deep Residual Networks for Single Image Super-Resolution", this is described as follows: "we found that increasing the number of feature maps above a certain level would make the training procedure numerically unstable. A similar phenomenon was reported by Szegedy et al. [24]. We resolve this issue by adopting the residual scaling [24] with factor 0.1." I changed the factor to 0.1, but the problem hasn't been solved. Could you describe more details? I'll be very grateful to you.
@shivaang12
Hi,
I'm getting nan for all losses, and I found that in main.py, around line 200, `dn_ = model[0](input_v)` returns all-nan outputs. I printed input_v and it looks correct. Would you please let me know what I should do?
Thanks!
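If input_v is finite but model[0]'s output is entirely nan, it is worth checking whether the weights themselves have already diverged, since a single bad optimizer step poisons every later forward pass. A sketch of that check, with `named_params` standing in for something like PyTorch's `model[0].named_parameters()` (names below are hypothetical):

```python
import math

def nonfinite_params(named_params):
    """Return names of parameters that already contain nan/inf values."""
    bad = []
    for name, values in named_params:  # values: flat list of floats
        if any(not math.isfinite(v) for v in values):
            bad.append(name)
    return bad

# hypothetical parameter dump from a diverged model
params = [
    ("head.weight", [0.1, -0.2]),
    ("body.0.weight", [float("inf"), 0.3]),
    ("tail.weight", [0.5]),
]
```

If some parameters are already non-finite, the damage happened in an earlier update and lowering the learning rate or clipping gradients is the place to look; if all parameters are still finite, the nan is produced inside the forward computation itself.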