facebookarchive / SCRNNs

This is a self contained software accompanying the paper titled: Learning Longer Memory in Recurrent Neural Networks: http://arxiv.org/abs/1412.7753.

Questions #2

Closed joohongyoo closed 9 years ago

joohongyoo commented 9 years ago

Hi, I have a question.

Can anybody tell me the full set of option values used in the paper's experiments?

The paper reports the options used in the experiments as follows:

- alpha: 0.95
- BPTT steps: 50 (SCRNN) / 10 (SRNN)
- BPTT freq: 5
- batch size: 32
- learning rate: 0.05
- learning rate shrink: 1.5

I set these options to the values above and left the remaining options at their defaults, but I got different results. That's why I'm asking for the full set of option values used in the paper.

Or could the changes I made have affected the results? When I installed this toolkit, I compiled the Torch GPU libraries with "sm_30" instead of the default (sm_20) to match my GPU's compute capability (3.0). For the same reason, before running the experiment I replaced nn.LookupTableGPU() with nn.LookupTable() in the "mfactory.lua" file.
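For reference, the CPU fallback described above amounts to a one-line module swap. This is only a hedged sketch of what such a change looks like; the variable names and table sizes are placeholders, not the actual code in mfactory.lua:

```lua
-- Hypothetical sketch of the swap described above; vocab_size and
-- hidden_size are placeholder names, not identifiers from mfactory.lua.
local vocab_size, hidden_size = 10000, 100

-- GPU version (requires the CUDA-enabled Torch extensions):
-- local lut = nn.LookupTableGPU(vocab_size, hidden_size)

-- CPU-compatible replacement from the stock nn package:
local lut = nn.LookupTable(vocab_size, hidden_size)
```

Both modules expose the same forward interface (token indices in, embedding rows out), so the swap should be behavior-preserving up to numerical precision and is an unlikely source of the result gap on its own.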

Thanks.

suchop commented 9 years ago

I believe the problem is in the hyper-parameter settings. After a grid search over the hyper-parameters, the sets of values that gave the best validation performance for the various models are as follows.

srnn_sm: -batchsz 16 -blen 10 -bfreq 5 -eta 0.03 -etashrink 1.5 -gradinputclip 20 -gradclip 0 -cliptype scale

lstm_sm: -batchsz 16 -blen 20 -bfreq 5 -eta 0.08 -etashrink 1.5 -gradinputclip 80 -gradclip 0 -cliptype scale

scrnn_sm: -batchsz 1 -blen 50 -bfreq 5 -eta 1 -etashrink 2 -gradinputclip 10 -cliptype hard