antinucleon closed this issue 8 years ago.
What is cusolver referring to?
SVD stuff
I see. Thanks!
I am interested in the "more visualization" part. What is the progress now? Which methods are you planning to implement? Maybe I could also contribute some code.
@loveisp Thanks a lot, Dai! I haven't done anything yet. You are more than welcome to send a PR to https://github.com/dmlc/mxnet/blob/master/python/mxnet/visualization.py . Currently I think we need functions to visualize filters, feature maps, and RNN structure. If you have more great ideas, feel free to PR!
@antinucleon OK, I will try to PR:)
What example are we going to use to showcase LSTM? I vote for char-lstm, a character-level language model: https://github.com/karpathy/char-rnn
I have made a PTB example https://github.com/antinucleon/mxnet/blob/master/example/LSTM/PennTree.ipynb , however its behavior is abnormal compared to Torch.
For the first 0.8 of the first epoch it is almost the same as the Torch one; then Torch brings perplexity from 700 down to 300 over the last 0.2 of the epoch, but ours only goes from 700 to 670.
I spent a whole day on it and had no luck finding the reason.
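For reference, the perplexity numbers quoted above are just the exponential of the average per-token cross-entropy, which can be sketched as (the function name and sample values are illustrative, not part of the PTB notebook):

```python
import math

def perplexity(neg_log_likelihoods):
    """Perplexity = exp of the mean per-token negative log-likelihood."""
    return math.exp(sum(neg_log_likelihoods) / len(neg_log_likelihoods))

# A model that assigns every token probability 1/700 has perplexity 700.
nlls = [math.log(700)] * 10
print(round(perplexity(nlls)))  # 700
```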
have you tried gradient clipping? it often helps a lot for rnn
I tested various clip/norm settings, with no effect.
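The clipping being discussed is clipping by global gradient norm: rescale all gradients jointly so their combined L2 norm stays under a threshold. A minimal NumPy sketch (the function name is hypothetical, not MXNet's API):

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Rescale every gradient array so their combined L2 norm is at most max_norm."""
    total = float(np.sqrt(sum((g ** 2).sum() for g in grads)))
    if total > max_norm:
        grads = [g * (max_norm / total) for g in grads]
    return grads

# A [3, 4] gradient has norm 5; clipped to norm 1 it becomes [0.6, 0.8].
clipped = clip_by_global_norm([np.array([3.0, 4.0])], 1.0)
print(clipped[0])  # [0.6 0.8]
```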
Would you please provide an example similar to @Russell91's ApolloCaffe-based LSTMSummation, which is more of a "hello world" for LSTM? It would be much easier to debug and would get merged into the master branch sooner. Thanks a lot!
@futurely Thanks for your suggestion. I think I have located the bug and will fix it today or tomorrow.
Great! May I further request an implementation of bidirectional LSTM? Many papers have shown that it outperforms unidirectional LSTM in language modeling, speech recognition, OCR, and other applications. There are implementations in imperative frameworks but almost none in symbolic ones.
@futurely Definitely. There will be bidirectional, IRNN, GRU, and LSTM examples.
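The bidirectional idea itself is simple: run one recurrence forward over the sequence, another over the reversed sequence, realign the backward states, and concatenate. A toy NumPy sketch using plain tanh-RNN cells as a stand-in for LSTM (all function names here are illustrative, not the eventual MXNet example):

```python
import numpy as np

def rnn_pass(xs, W, U, b):
    """One direction of a vanilla tanh RNN; returns the hidden state per step."""
    h = np.zeros_like(b)
    hs = []
    for x in xs:
        h = np.tanh(W @ x + U @ h + b)
        hs.append(h)
    return hs

def bidirectional(xs, params_fwd, params_bwd):
    """Concatenate forward states with backward states (re-reversed to align)."""
    fwd = rnn_pass(xs, *params_fwd)
    bwd = rnn_pass(xs[::-1], *params_bwd)[::-1]
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
```

Each output step then sees context from both the past and the future, which is what gives bidirectional models their edge on the tasks mentioned above.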
I was wondering where I could find the GRU examples?
@futurely Do you have Bidirectional LSTM example?
No MXNet version. https://github.com/search?utf8=%E2%9C%93&q=bidirectional+lstm