rnn_cell_demo.py names the parameters of the built-in RNN "LSTM_bias", which are initialized to all zeros. This significantly slows convergence. After I changed the initializer, the convergence speed was similar to that of the other implementations.
Thank you @mz24cn! I do not know how to initialize the bias for sym.RNN. If you have sample code that initializes the bias, could you share it?
I will commit my code in the next several days.
Great! It's a big help. I am looking forward to your commit.
I have submitted a PR: https://github.com/dmlc/mxnet/pull/4819
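To illustrate the idea discussed above, here is a minimal, framework-free sketch of a non-zero LSTM bias initialization. It is not the code from the PR. The helper name `make_lstm_bias`, the flat bias layout, and the gate order (input, forget, cell candidate, output) are assumptions made for illustration only; check the layout your framework actually uses. The motivation is that a forget-gate bias of zero gives a gate activation of sigmoid(0) = 0.5, so the cell state is roughly halved at every step early in training, while a positive bias keeps more of it.

```python
import math


def make_lstm_bias(num_hidden, forget_bias=1.0):
    """Build a flat LSTM bias vector of length 4 * num_hidden.

    ASSUMPTION: gates are stored contiguously in the order
    input, forget, cell candidate, output. This is a hypothetical
    layout for illustration, not a statement about sym.RNN.
    """
    bias = [0.0] * (4 * num_hidden)
    # Only the forget-gate slice (second quarter) gets a non-zero value.
    for k in range(num_hidden, 2 * num_hidden):
        bias[k] = forget_bias
    return bias


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


bias = make_lstm_bias(num_hidden=3)
print(bias)

# Initial forget-gate activation with zero bias vs. a bias of 1.0
# (ignoring the input-dependent terms).
print(sigmoid(0.0), sigmoid(1.0))
```

With a forget bias of 1.0 the initial forget-gate activation is about 0.73 instead of 0.5, which is the usual reason given for this kind of initialization.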
This issue is closed due to lack of activity in the last 90 days. Feel free to reopen if this is still an active issue. Thanks!
Environment info
Operating System: Ubuntu 14.04.03 (running on docker. docker host is Ubuntu 16.04)
Compiler: gcc 4.8.4
Package used (Python/R/Scala/Julia): Python
MXNet commit hash (git rev-parse HEAD): b6e8eec8b94c70d9e116b3a4443ce75ce3e07aa2
If you are using the Python package, please provide
Python version and distribution: Python 2.7.6
Question
I think the following three implementations serve the same purpose in different ways.
However, their perplexity converges differently.
rnn_cell_demo_batch_major
rnn_cell_demo_time_major
The same parameters are used in all implementations.
What is the difference between these three implementations?
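For readers unfamiliar with the naming, the sketch below assumes that "batch_major" and "time_major" refer only to the axis order of the input: (batch, time) versus (time, batch). That reading is an assumption, and `batch_to_time_major` is a hypothetical helper, not part of the demo scripts. Under that assumption the two layouts carry the same data, so the layout alone would not be expected to change convergence.

```python
def batch_to_time_major(batch_major):
    """Transpose a (batch, time) nested list into (time, batch)."""
    return [list(step) for step in zip(*batch_major)]


# Two sequences (batch size 2), each with four time steps.
batch_major = [
    [11, 12, 13, 14],  # sequence 0
    [21, 22, 23, 24],  # sequence 1
]

time_major = batch_to_time_major(batch_major)

# Each row of time_major now holds one time step for every sequence.
for step in time_major:
    print(step)
```

Transposing back with the same helper recovers the original batch-major data, which shows that no information is lost in either layout.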