apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Recent RoadMap #206

Closed by antinucleon 8 years ago

antinucleon commented 9 years ago
pluskid commented 9 years ago

What is cusolver referring to?

antinucleon commented 9 years ago

SVD stuff

pluskid commented 9 years ago

I see. Thanks!

loveisp commented 9 years ago

I am interested in the part of "more visualization". How is the progress now? What methods are you ready to implement? Maybe I also could contribute some code.

antinucleon commented 9 years ago

@loveisp Thanks, Dai! I haven't done anything on it yet. You are more than welcome to open a PR against https://github.com/dmlc/mxnet/blob/master/python/mxnet/visualization.py . Currently I think we need functions to visualize filters, feature maps, and RNN structure. If you have more good ideas, feel free to PR!

loveisp commented 9 years ago

@antinucleon OK, I will try to PR:)

mli commented 9 years ago

What example are we going to use to show off LSTM? I vote for char-lstm, a char-level language model. https://github.com/karpathy/char-rnn

antinucleon commented 9 years ago

I have made a PTB example https://github.com/antinucleon/mxnet/blob/master/example/LSTM/PennTree.ipynb , however its behavior is abnormal compared to Torch.

For the first 0.8 of the first epoch it is almost the same as the Torch version; then Torch reduces perplexity from 700 to 300 over the last 0.2 of that epoch, while ours only goes from 700 to 670.

I spent a whole day on it with no luck finding the reason.
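For reference, the perplexity numbers being compared above are just the exponential of the mean per-token negative log-likelihood; a minimal NumPy sketch (illustrative only, not tied to the notebook's code):

```python
import numpy as np

def perplexity(nll_per_token):
    """Perplexity is exp of the mean per-token negative log-likelihood."""
    return float(np.exp(np.mean(nll_per_token)))

# A model that is uniform over a 700-word vocabulary has perplexity 700:
nll = [np.log(700.0)] * 10
print(perplexity(nll))  # → 700.0 (up to float rounding)
```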

mli commented 9 years ago

Have you tried gradient clipping? It often helps a lot for RNNs.
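For context, the clipping suggested here is typically global-norm gradient clipping: rescale all gradients whenever their combined L2 norm exceeds a threshold. A minimal NumPy sketch (the function name and threshold are illustrative, not MXNet's API):

```python
import numpy as np

def clip_global_norm(grads, max_norm):
    """Rescale all gradient arrays if their combined L2 norm exceeds max_norm."""
    total_norm = np.sqrt(sum(float((g ** 2).sum()) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads, total_norm

# Combined norm of these gradients is sqrt(9 + 16 + 144) = 13:
grads = [np.array([3.0, 4.0]), np.array([12.0])]
clipped, norm = clip_global_norm(grads, 5.0)
# after clipping, the combined norm of `clipped` is 5.0
```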

antinucleon commented 9 years ago

I tested various clip/norm settings, with no effect.

futurely commented 9 years ago

Would you please provide an example similar to @Russell91's apollocaffe-based LSTMSummation, which is more like a "hello world" of LSTM? It would be much easier to debug and would get merged into the master branch sooner. Thanks a lot!

antinucleon commented 9 years ago

@futurely Thanks for your suggestion. I think I have located the bug and will fix it today or tomorrow.

futurely commented 9 years ago

Great! May I further request an implementation of Bidirectional LSTM? Many papers have shown that it outperformed LSTM in language modelling, speech recognition, OCR and some other applications. There are implementations in imperative programs but almost none in symbolic programs.
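For context, the core idea of a bidirectional RNN is to run one pass over the sequence forward and one over it reversed, then concatenate the two hidden states at each time step. A toy NumPy sketch, using a plain tanh-RNN cell as a stand-in for an LSTM to keep it short (all names here are illustrative, not MXNet symbols):

```python
import numpy as np

def rnn_pass(xs, W, U, h0):
    """Run a toy tanh-RNN over a sequence, returning the hidden state per step."""
    h, hs = h0, []
    for x in xs:
        h = np.tanh(W @ x + U @ h)
        hs.append(h)
    return hs

def bidirectional(xs, W_f, U_f, W_b, U_b, h0):
    """Concatenate forward states with backward states, aligned per time step."""
    fwd = rnn_pass(xs, W_f, U_f, h0)
    bwd = rnn_pass(xs[::-1], W_b, U_b, h0)[::-1]  # reverse back to align
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

# 4 time steps, input dim 3, hidden dim 2 → output dim 4 per step
rng = np.random.default_rng(0)
xs = [rng.standard_normal(3) for _ in range(4)]
W, U = rng.standard_normal((2, 3)), rng.standard_normal((2, 2))
out = bidirectional(xs, W, U, W, U, np.zeros(2))
```

In a symbolic framework the same structure is expressed by unrolling both directions and concatenating the per-step outputs, which is why it is harder to find symbolic implementations than imperative ones.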

antinucleon commented 9 years ago

@futurely Definitely. There will be bidirectional, IRNN, GRU, and LSTM examples.

zachmayer commented 8 years ago

I was wondering where I could find the GRU examples.

xlvector commented 8 years ago

@futurely Do you have Bidirectional LSTM example?

futurely commented 8 years ago

No MXNet version. https://github.com/search?utf8=%E2%9C%93&q=bidirectional+lstm

futurely commented 8 years ago

Your PR: https://github.com/dmlc/mxnet/pull/2096