CoderUnkn0wn opened this issue 3 months ago
How did you break it? @CoderUnkn0wn
You can see all of the settings I used in the image. Then I just let it run, and suddenly everything was NaN.
coooll
This is not an issue; it is normal behavior.
"Breaking" the model by causing the weights to blow up to infinity isn't difficult to do. Setting the learning rate high at the start of a complex model causes the weights to explode to high values, evaluate to Infinity, and then the next epoch reports them all as "NaN."
I think it is normal behavior; it's just the way the math works out. Setting the learning rate that high is bound to make training diverge rather than produce anything accurate.
This kind of playground is the perfect place to play with settings like that and experience how the models behave when given diverse kinds of inputs. A student or researcher will quickly learn to be more intentional about using a high learning rate!
I was able to reproduce the previously reported behavior by doing the following:
[screenshot: every weight reported as NaN]