Open VeliBaba opened 9 years ago
Hi!
OK. Is the output of the last hidden layer used as the input of the next neural network?
What is the 'next neural network'? If you mean the next time step (the next word), then the answer is yes.
Yes, that's what I mean. OK, thanks.
Does using several hidden layers instead of a single hidden layer improve performance? Which is better: a single hidden layer of size 400, or 4 hidden layers of size 100?
First, when you increase the layer size 4 times, training/evaluation time increases (in theory) 16 times (4 squared). So it's more reasonable to compare 1 layer of size 400 with 4 layers of size 200. However, I would recommend training a shallow network with a single layer first.
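To make the arithmetic in that estimate concrete, here is a rough sketch in Python. It assumes the cost per word is dominated by one square hidden-to-hidden matrix multiplication per layer, and it ignores the input embeddings, the output layer, and any inter-layer connections. The function name `recurrent_cost` is made up for this illustration and is not part of faster-rnnlm.

```python
def recurrent_cost(num_layers: int, layer_size: int) -> int:
    """Rough per-word cost: one (size x size) matrix product per hidden layer.

    Assumption: everything except the hidden-to-hidden matrices is ignored.
    """
    return num_layers * layer_size * layer_size


if __name__ == "__main__":
    # Configurations discussed in the thread: (number of layers, layer size).
    for layers, size in [(1, 400), (4, 100), (4, 200)]:
        print(f"{layers} x {size}: ~{recurrent_cost(layers, size)} multiply-adds per word")
```

Under these assumptions, 1 layer of size 400 and 4 layers of size 200 come out at the same cost, while 4 layers of size 100 are about 4 times cheaper, which is why the 400 vs. 4x200 comparison is the fairer one.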
Hi! I have two different toolkits for training an rnnlm: the first one is rnnlm-hs-0.1b (Ilya-multithreading), and the second one is faster-rnnlm. With the same options, faster-rnnlm is about 3 times faster than rnnlm-hs-0.1b. Is it expected that the validation entropy at the end of training may be worse with faster-rnnlm than with rnnlm-hs-0.1b?
It's expected that the entropy will be more or less the same.
Hi! I have some questions about faster-rnnlm. It allows using several hidden layers during training. My questions are: