Closed joohongyoo closed 9 years ago
I believe the problem is in the hyper-parameter settings. After doing a grid search over the hyper-parameters, the values that gave the best validation performance for the various models are as follows:
srnn_sm: -batchsz 16 -blen 10 -bfreq 5 -eta 0.03 -etashrink 1.5 -gradinputclip 20 -gradclip 0 -cliptype scale
lstm_sm: -batchsz 16 -blen 20 -bfreq 5 -eta 0.08 -etashrink 1.5 -gradinputclip 80 -gradclip 0 -cliptype scale
scrnn_sm: -batchsz 1 -blen 50 -bfreq 5 -eta 1 -etashrink 2 -gradinputclip 10 -cliptype hard
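A grid search like the one above can be scripted by enumerating flag combinations and emitting one command line per setting. This is only an illustrative sketch; the training-script name `main.lua` and the chosen flag values are assumptions, not something confirmed in this thread.

```python
# Sketch: enumerate command lines for a grid search over the toolkit's
# CLI flags. The script name "main.lua" is a guess for illustration.
import itertools

grid = {
    "-eta": [0.03, 0.08, 1],
    "-etashrink": [1.5, 2],
    "-blen": [10, 20, 50],
}

def command_lines(base="th main.lua"):
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        flags = " ".join(f"{k} {v}" for k, v in zip(keys, values))
        yield f"{base} {flags}"

cmds = list(command_lines())
# 3 * 2 * 3 = 18 candidate settings to train and compare on validation data
```

Each emitted command would then be run and scored on the validation set, keeping the flags of the best run.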
Hi, I have a question.
Can anybody tell me the full set of option values used for the paper's experiments?
The paper lists the following experimental settings:
alpha: 0.95, BPTT steps: 50 (SCRNN) / 10 (SRNN), BPTT freq: 5, batch size: 32, learning rate: 0.05, learning rate shrink: 1.5
So I set those options to the values above and left the remaining options at their defaults, but I got different results. That's why I'm curious about the complete set of option values used in the paper.
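The "learning rate shrink" above is, in Mikolov-style RNN toolkits, usually applied by dividing the learning rate by the shrink factor once validation loss stops improving. Whether this toolkit follows exactly that rule is an assumption; the sketch below only illustrates the common convention.

```python
# Sketch of the conventional learning-rate-shrink schedule (an assumption
# about this toolkit's behavior): keep eta fixed until validation loss
# stops improving, then divide eta by the shrink factor every epoch.
def shrink_schedule(val_losses, eta=0.05, shrink=1.5):
    """Return the learning rate used at each epoch."""
    etas = []
    best = float("inf")
    shrinking = False
    for loss in val_losses:
        if shrinking:
            eta /= shrink
        etas.append(eta)
        if loss >= best:   # no improvement -> start shrinking from next epoch
            shrinking = True
        best = min(best, loss)
    return etas

# With the paper's values (eta=0.05, shrink=1.5), eta stays at 0.05 until
# validation loss plateaus, then decays geometrically by 1.5x per epoch.
etas = shrink_schedule([5.0, 4.0, 4.1, 4.05], eta=0.05, shrink=1.5)
```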
Or could something I did have caused the difference? When I installed this toolkit, I compiled the Torch GPU libraries with the option "sm_30" instead of the default (sm_20) because of my GPU card's compute capability (3.0). Also, for the same reason, before running the experiment I replaced nn.LookupTableGPU() with nn.LookupTable() in the "mfactory.lua" file.
Thanks.