karpathy / char-rnn

Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch

added log probability of sample to sample output #151

Open FragLegs opened 8 years ago

FragLegs commented 8 years ago

I am brand new to both Torch and RNNs, but I think my code is correct. I would love feedback on it! This PR calculates the log probability of the sampled text using the chain rule of probability:

P(x1, x2, x3, ..., xn) = P(x1) * P(x2 | x1) * P(x3 | x1, x2) * ... * P(xn | x1, x2, ..., xn-1)
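For anyone who wants the gist without reading the diff, here is a minimal sketch (not the exact change) of how that running total can be kept inside the sampling loop of sample.lua. The loop structure and variables (protos.rnn, current_state, prediction, ivocab) come from the existing script; sample_log_prob is an illustrative name, and temperature handling is omitted here:

```lua
-- sketch: accumulate the chain-rule log probability while sampling.
-- `prediction` is the network's LogSoftMax output, so indexing it by the
-- sampled character id yields log P(x_i | x_1, ..., x_{i-1}).
local sample_log_prob = 0
for i = 1, opt.length do
    local lst = protos.rnn:forward{prev_char, unpack(current_state)}
    current_state = {}
    for j = 1, state_size do table.insert(current_state, lst[j]) end
    prediction = lst[#lst]                         -- log probs over the vocab
    local probs = torch.exp(prediction):squeeze()
    probs:div(torch.sum(probs))                    -- renormalize
    prev_char = torch.multinomial(probs:float(), 1):resize(1):float()
    -- chain rule: add the log probability of the character we just drew
    sample_log_prob = sample_log_prob + prediction:squeeze()[prev_char[1]]
    io.write(ivocab[prev_char[1]])
end
print('Sample log probability: ' .. sample_log_prob)
```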

You have to train with the new code too, since it adds the empirical log probability of each character to the checkpoints (that information is calculated in CharSplitLMMinibatchLoader.text_to_tensor()).
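Those empirical log probabilities cover the first factor P(x1) and the primed text, which the network is never asked to predict. A hypothetical sketch of that computation, assuming `data` is the 1D tensor of character ids built by text_to_tensor() and `vocab_size` is the vocabulary size (the PR's actual code may differ):

```lua
-- hypothetical sketch: unigram counts over the training corpus, turned
-- into empirical log probabilities to be saved with the checkpoint
local counts = torch.zeros(vocab_size)
for i = 1, data:size(1) do
    local c = data[i]
    counts[c] = counts[c] + 1
end
-- log P(char) = log(count / total)
local char_log_probs = torch.log(counts / data:size(1))
```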

Using the scripts is no different than before, but the output of sample.lua includes some additional information:

(The following is from the default model trained on the tinyshakespeare data)

$ th sample.lua cv/lm_lstm_epoch26.00_1.3939.t7 -primetext "First" -length 5
package cunn not found! 
package cutorch not found!  
Falling back on CPU mode    
creating an lstm... 
seeding with First  
--------------------------  
First,
He 

Sample log probability: -23.988819562995
$ th sample.lua cv/lm_lstm_epoch26.00_1.3939.t7 -primetext "First" -length 0
package cunn not found! 
package cutorch not found!  
Falling back on CPU mode    
creating an lstm... 
seeding with First  
--------------------------  
First

Sample log probability: -17.354374113096
$ th sample.lua cv/lm_lstm_epoch26.00_1.3939.t7 -primetext "Fxrst" -length 0
package cunn not found! 
package cutorch not found!  
Falling back on CPU mode    
creating an lstm... 
seeding with Fxrst  
--------------------------  
Fxrst

Sample log probability: -36.432944912158
FragLegs commented 8 years ago

An additional note: if temperature != 1, the output still works (and takes temperature into account), but I'm not sure it is correct to call it the sample probability any more.
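Concretely, sample.lua divides the log probabilities by the temperature before exponentiating, so for T ~= 1 the characters are drawn from the tempered distribution q(x) ∝ p(x)^(1/T) rather than from the model's own p(x). Roughly, the existing sampling code does:

```lua
prediction:div(opt.temperature)           -- scale log-probs by 1/T
local probs = torch.exp(prediction):squeeze()
probs:div(torch.sum(probs))               -- renormalize: this is q, not p
```

So the accumulated number is the log probability of the sample under q, which is why "sample probability" is arguably a misnomer when T ~= 1.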