karpathy / char-rnn

Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch

Add a check to see if training loss has gone to infinity #123

Open wrapperband opened 8 years ago

wrapperband commented 8 years ago

I've had training loss go to infinity a few times now.

I thought it might have been due to bad characters, but it has now happened on tested data, so I assume the net sometimes goes unstable. There is a slight correlation with trying different training/validation proportions. (It would be interesting to know which proportions work in which circumstances, and why.)

I have tried to recover those brains, and they seem to remain unstable even if "tightly trained".

So whatever the reason, there are possible improvements for handling infinity, e.g. a drop-neurons option, or a net-scan tool that re-initialises each value randomly until a training round is OK.
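For the re-initialisation idea, here is a minimal sketch in Torch. The helper name, the `params` argument (the flattened parameter tensor), and the uniform range are assumptions, not existing code in train.lua:

```lua
-- Hypothetical recovery helper: if the loss has blown up, re-draw
-- every weight from a small uniform range and let the caller retry
-- the training round.
local function reinit_if_unstable(params, loss)
    if loss ~= loss or loss == math.huge then  -- NaN (NaN ~= NaN) or +inf
        params:uniform(-0.08, 0.08)            -- random re-initialisation
        return true                            -- caller should retry the round
    end
    return false
end
```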

I am also interested in whether those brains might be "creative" and could theoretically be trained back, or not, so I don't waste processing time.

An option to stop processing if the training loss goes to infinity would be a good quick first fix.

e.g. -max_training_loss 100000
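A rough sketch of what that guard could look like. The `-max_training_loss` flag and `opt.max_training_loss` field are the proposal here, not existing options, and `train_loss` stands for whatever variable the loop uses for the current batch loss:

```lua
-- Hypothetical option, registered alongside the other cmd:option calls:
cmd:option('-max_training_loss', math.huge,
           'abort training if the batch loss reaches this value')

-- Inside the training loop, after the batch loss is computed:
if train_loss ~= train_loss then        -- NaN ~= NaN, so this catches NaN
    print('training loss is NaN, aborting')
    break
end
-- >= rather than >, so a +inf loss trips even the math.huge default
if train_loss >= opt.max_training_loss then
    print(string.format('training loss %.2f reached -max_training_loss, aborting',
                        train_loss))
    break
end
```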