karpathy / char-rnn

Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch

Using more than 3 layers results in NaN #183

Open wrapperband opened 7 years ago

wrapperband commented 7 years ago

I have made multiple attempts over the past months to increase the depth of the net's layers (8-10). If you watch the run, the process seems fine, with the individual training loss of each batch decreasing. But at the end of the epoch it shows NaN, i.e. infinity?
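For reference, the training loop can catch this the moment it happens: NaN is the only value that compares unequal to itself, so a one-line guard after the optimizer step halts training before a bad checkpoint is written. Here is a minimal sketch in Lua, assuming the `feval` / `params` / `optim_state` names from char-rnn's `train.lua` (and a `max_iterations` bound), all of which are stand-ins rather than verbatim code from the repo:

```lua
require 'optim'

-- Assumed to exist, as in train.lua: feval (closure returning the loss and
-- gradients), params (flattened parameters), optim_state (rmsprop state),
-- and max_iterations (loop bound).
for i = 1, max_iterations do
    local _, loss = optim.rmsprop(feval, params, optim_state)
    if loss[1] ~= loss[1] then  -- true only when the loss is NaN
        print('loss is NaN at iteration ' .. i .. '; halting training')
        break
    end
end
```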

Although in one case I was able to "power through" the NaN, that hasn't happened since, despite trying up to 150 epochs.
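"Powering through" suggests the updates were only on the edge of exploding. The repo already exposes the two knobs that matter for this: `-grad_clip` (default 5) and `-learning_rate`. A more conservative run for a deep stack might look like the following; the specific values are illustrative, not a verified fix:

```
th train.lua -data_dir data/tinyshakespeare -num_layers 8 -rnn_size 256 -grad_clip 1 -learning_rate 1e-3
```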

I am looking at this blog post, which seems to address the problem:

http://torch.ch/blog/2016/02/04/resnets.html
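The core idea in that post, identity skip connections, translates directly to a stacked network: add each layer's input to its output so gradients have a clean path through the depth. A minimal sketch with plain `nn` modules follows; the `residual` helper is hypothetical, not part of char-rnn, and it requires the layer's input and output sizes to match, which holds when every layer uses the same `rnn_size`:

```lua
require 'nn'

-- Hypothetical helper: wrap a layer so that y = layer(x) + x.
local function residual(layer)
    local paths = nn.ConcatTable()
    paths:add(layer)          -- transformed path
    paths:add(nn.Identity())  -- identity skip path
    return nn.Sequential()
        :add(paths)
        :add(nn.CAddTable())  -- sum the two paths element-wise
end

-- Example: a residual fully-connected block of width rnn_size.
local rnn_size = 256
local block = residual(nn.Linear(rnn_size, rnn_size))
local y = block:forward(torch.randn(rnn_size))
```

Wiring this into char-rnn proper would mean editing `model/LSTM.lua` so each layer's output is summed with its input before being passed up the stack.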

danindiana commented 7 years ago

What rnn_size values have you been using? What hardware are you running on?