I am brand new to both Torch and RNNs, but I think my code is correct. I would love feedback on it! What this PR does is calculate the log probability of the sampled text using the following equation:

P(x1, x2, x3, ..., xn) = P(x1) * P(x2 | x1) * P(x3 | x1, x2) * ... * P(xn | x1, x2, ..., xn-1)
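In practice this just means summing, at every step of the sampling loop, the predicted log probability of the character that was actually emitted (or primed). Here is a minimal sketch of the idea; it assumes the network's final layer is a LogSoftMax so that prediction already holds log probabilities, and names like sample_log_prob and char_idx are illustrative rather than taken verbatim from the patch:

-- Sketch: accumulate the chain-rule log probability while sampling.
-- Assumes `prediction` is the LogSoftMax output of the forward pass
-- (a 1 x vocab_size tensor of log probabilities).
local sample_log_prob = 0

-- inside the sampling loop, after choosing the next character, e.g.
--   prediction = protos.rnn:forward{prev_char, unpack(current_state)}
--   prev_char  = torch.multinomial(torch.exp(prediction):squeeze():double(), 1)
local char_idx = prev_char[1]                          -- index of the emitted character
sample_log_prob = sample_log_prob + prediction[1][char_idx]

-- after the loop:
print('Sample log probability: ' .. sample_log_prob)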
You have to train with the new code too, since it adds the empirical log probability of each character to the checkpoints (that info is calculated in CharSplitLMMinibatchLoader.text_to_tensor()).
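If I understand the data flow correctly, this empirical distribution is presumably what supplies P(x1) for the very first character, which has no preceding context to condition on. A rough sketch of computing such per-character log probabilities from raw counts (the names here are illustrative, not the actual fields written to the checkpoint):

-- Sketch: empirical log probability of each character in the training text.
-- `data` is assumed to be the raw training string; in the real loader the
-- counts would come from the same pass that builds the vocabulary mapping.
local counts, total = {}, 0
for char in data:gmatch('.') do
    counts[char] = (counts[char] or 0) + 1
    total = total + 1
end
local char_log_prob = {}
for char, n in pairs(counts) do
    char_log_prob[char] = math.log(n / total)   -- log P(char) under the corpus
end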
Using the scripts is no different than before, but the output of sample.lua includes a little additional information:
(The following is from the default model trained on the tinyshakespeare data)
$ th sample.lua cv/lm_lstm_epoch26.00_1.3939.t7 -primetext "First" -length 5
package cunn not found!
package cutorch not found!
Falling back on CPU mode
creating an lstm...
seeding with First
--------------------------
First,
He
Sample log probability: -23.988819562995
$ th sample.lua cv/lm_lstm_epoch26.00_1.3939.t7 -primetext "First" -length 0
package cunn not found!
package cutorch not found!
Falling back on CPU mode
creating an lstm...
seeding with First
--------------------------
First
Sample log probability: -17.354374113096
$ th sample.lua cv/lm_lstm_epoch26.00_1.3939.t7 -primetext "Fxrst" -length 0
package cunn not found!
package cutorch not found!
Falling back on CPU mode
creating an lstm...
seeding with Fxrst
--------------------------
Fxrst
Sample log probability: -36.432944912158
An additional note: if temperature != 1, the calculation still works (and takes the temperature into account), but I'm not sure it is correct to call the result the sample log probability any more.
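For context, my understanding of why this stops being the true sample probability: sample.lua scales the log probabilities by the temperature and renormalizes before drawing, so the accumulated value is the log probability under that tempered distribution rather than under the raw model. Roughly (paraphrasing the existing sampling code, not quoting it exactly):

-- Dividing the LogSoftMax output by the temperature and renormalizing
-- defines a different distribution; summing its log probabilities gives
-- the probability of the sample under the tempered model, not the original.
prediction:div(opt.temperature)                 -- sharpen or flatten the distribution
local probs = torch.exp(prediction):squeeze()
probs:div(torch.sum(probs))                     -- renormalize to sum to 1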