apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

[Keras] Support MXNET backend for Keras. #4173

Closed shivarajugowda closed 7 years ago

shivarajugowda commented 7 years ago

I have been working on supporting MXNET as a backend for Keras (a popular neural-network Python library which currently supports TensorFlow or Theano). I am hopeful the endeavor is a win-win for both projects: Keras will benefit from MXNET's multi-device/multi-node support, and MXNET will get broader exposure. The task will also exercise, and most probably enhance, MXNET's API capabilities for a broader audience.

To this end, I have started the process and have been able to check off the low-hanging APIs; I would say about 25% of the work is done. The amount of changes required in Keras is not huge: we just need to add one more backend file along the same lines as tensorflow_backend.py (a rough sketch is shown below). I think most of the work will be in figuring out how to map the functionalities to MXNET APIs and implementing the missing ones.
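To give a flavour of what such a file could look like, here is a minimal, illustrative skeleton (the module name mxnet_backend.py and the function subset are my own sketch, not the actual patch):

```python
# mxnet_backend.py -- illustrative skeleton only, mirroring the structure
# of tensorflow_backend.py; not the actual patch.
import numpy as np
import mxnet as mx


def variable(value, dtype='float32', name=None):
    """Wrap a numpy array or scalar as an MXNet NDArray (name is unused here)."""
    return mx.nd.array(np.asarray(value, dtype=dtype))


def eval(x):
    """Materialize an NDArray back into a numpy array (K.eval)."""
    return x.asnumpy()


def dot(x, y):
    """Matrix multiplication -- one of the 'low-hanging' ops to map."""
    return mx.nd.dot(x, y)
```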

The backend tests are a good way to go about implementing it and tracking our progress.

Here is my rough current status as measured in terms of those tests.

APIs converted:

APIs I am currently working on:

APIs I think might need changes/updates to MXNET:

Things that are currently missing but are nice to have:

First things first, I want to know if this is in line with the MXNET community's needs and something you agree is worthwhile and should be pursued. If we agree, I can use this issue as a high-level task to track and update my progress, and also to request more info/features and help with implementing them.

piiswrong commented 7 years ago

@shivarajugowda Thanks for the wonderful effort! Yes, this is something we want. In fact, we have been talking about doing this for a while but didn't have the manpower.

Could you post your code somewhere so we can track progress and other people interested in this can pitch in?

piiswrong commented 7 years ago

Also please work with the nnvm branch (instead of master) since it will be released soon.

With the nnvm branch, all functions are available in both symbol and ndarray.
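For example (a small illustration using operator names from the nnvm-based releases), the same op can be used imperatively or symbolically:

```python
import mxnet as mx

# Imperative (ndarray) form: executes eagerly.
a = mx.nd.ones((2, 3))
b = mx.nd.relu(a - 0.5)

# Declarative (symbol) form of the same operator: builds a graph node.
x = mx.sym.Variable('x')
y = mx.sym.relu(x - 0.5)
out = y.eval(ctx=mx.cpu(), x=mx.nd.ones((2, 3)))[0]
```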

shivarajugowda commented 7 years ago

@piiswrong Good to know this would be of value. Yes, I will start checking in whatever I have, but all the changes as of now are in a Keras branch; I will point to it here once I check in (by Monday). As and when I need changes to MXNET, I will create a separate issue and summarize it here. I also appreciate the pointer to use the NNVM branch; I was using the master branch until now. I will check out the new symbol API in the NNVM branch.

ivenzor commented 7 years ago

Good work!

anjishnu commented 7 years ago

+1, great work. Would love to see this happen.

jspisak commented 7 years ago

This is really great! Thanks @shivarajugowda for jumping in on this..

shivarajugowda commented 7 years ago

Here is where you can monitor the progress on the Keras end.

Branch: https://github.com/shivarajugowda/keras
Issue: https://github.com/fchollet/keras/issues/1313

shivarajugowda commented 7 years ago

Adding support for element-wise !=, >, >=, <, <= comparison operators as part of this: #4182
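For reference, a quick sketch of how these look on the ndarray side once the operators land (the broadcast_* names below are as exposed in later MXNet releases and are used here only for illustration):

```python
import mxnet as mx

a = mx.nd.array([[1, 2, 3]])
b = mx.nd.array([[2, 2, 2]])

print(mx.nd.broadcast_not_equal(a, b).asnumpy())     # [[1. 0. 1.]]
print(mx.nd.broadcast_greater(a, b).asnumpy())       # [[0. 0. 1.]]
print(mx.nd.broadcast_lesser_equal(a, b).asnumpy())  # [[1. 1. 0.]]
```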

shivarajugowda commented 7 years ago

For Keras, we need support for sparse matrices; I see a proposal in #1524. Not sure how far off we are in terms of progress. For the time being, I am using dense matrices underneath.

shivarajugowda commented 7 years ago

Filed #4248 and #4249 to support mean and standard deviation operators.
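In the meantime, std can be composed from existing ops; a small workaround sketch (my own illustration, not part of the patch):

```python
import mxnet as mx

def std_all(x):
    """Standard deviation over all elements, built from mean/square/sqrt
    until a dedicated std operator (#4249) is available."""
    m = mx.nd.mean(x).asscalar()
    return mx.nd.sqrt(mx.nd.mean(mx.nd.square(x - m)))

x = mx.nd.array([1.0, 2.0, 3.0, 4.0])
print(mx.nd.mean(x).asnumpy())  # [2.5]
print(std_all(x).asnumpy())     # [~1.118]
```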

fchollet commented 7 years ago

For Keras, we need support for Sparse Matrices

If you leave out sparse tensor support, very little functionality would be lost. As long as you raise appropriate, helpful exceptions in the backend, it would be fine.

Do you foresee any issues with K.rnn, K.gradients? These two would be the tricky ones.

shivarajugowda commented 7 years ago

@fchollet thanks for the input. I am figuring out the loop and if constructs in the context of RNNs. Apart from MXNET, I am also looking at the TensorFlow and Theano code. I will get more time during the Christmas break and will keep the updates posted here.

piiswrong commented 7 years ago

Gradient computation is in there (the gradient pass) but it is not exposed through the C API. @tqchen @jermainewang Is it possible to expose it?

tqchen commented 7 years ago

There are essentially two possible approaches. Keras takes a purely declarative symbolic approach for both network definition and parameter update, because the existing frameworks Keras works on are declarative.

However, it does not have to be so in MXNet, where the parameter-update part is handled automatically with imperative code. The two need not be incompatible with the Keras API; most of the Keras API is about network definition.

So I would suggest the following approach:

I know this may take a bit of additional effort, but it also benefits from the multi-GPU API available in the mxnet module.

As a second approach, we can reuse the gradient pass in mxnet and take a purely declarative approach, which I expect will take a bit more effort and may not directly come with multi-GPU support.

tqchen commented 7 years ago

My comment does not block any of the existing issues; it instead breaks the goal into two parts (and two layers of compatibility).

I am all for both directions, but breaking it into two parts will make the milestones easier. I am in favor of quickly achieving 1, so everything is functioning, and possibly tackling 2 later.

shivarajugowda commented 7 years ago

@tqchen I am probably not familiar with the terminology used in the MXNET context; I can follow the broader idea, but I couldn't follow all of the details. @tqchen, @piiswrong, how about a WebEx/Hangout to go over this and also validate that I am headed in the right direction? Let me know, and I can set one up if and when you have some time.

piiswrong commented 7 years ago

@shivarajugowda Yes we should have a meeting. What time zone are you in?

shivarajugowda commented 7 years ago

@piiswrong I am in the Pacific Time Zone (CA, Bay Area). I am available anytime tomorrow.

piiswrong commented 7 years ago

my email is eric.jy.xie@gmail.com


shivarajugowda commented 7 years ago

I have integrated conv2d and pool2d. MXNET is missing support for 3D convolution (#4301) and pooling with a 3D kernel.
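As a rough sketch of how the 2D cases map (layer names and hyperparameters below are illustrative only): Keras commonly uses channels-last (NHWC) input, while MXNet's Convolution/Pooling operators expect NCHW, so the backend has to transpose or require channels-first ordering.

```python
import mxnet as mx

# Illustrative mapping of a Keras Conv2D + MaxPooling2D pair onto MXNet symbols.
data = mx.sym.Variable('data')                 # NCHW layout: (batch, C, H, W)
conv = mx.sym.Convolution(data, kernel=(3, 3), num_filter=32, name='conv1')
act  = mx.sym.Activation(conv, act_type='relu', name='relu1')
pool = mx.sym.Pooling(act, kernel=(2, 2), stride=(2, 2),
                      pool_type='max', name='pool1')
```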

shivarajugowda commented 7 years ago

NDArray.onehot_encode() only supports 1D indices. We need support for multiple dimensions.
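The usual workaround for a 1D-only encoder is to flatten the indices, encode, and reshape back; a sketch below (it uses mx.nd.one_hot, an op name from later MXNet releases, purely as a stand-in for the encoder):

```python
import mxnet as mx

def one_hot_nd(indices, depth):
    """One-hot encode an N-D index array with a 1D-only encoder:
    flatten, encode, then restore the leading dimensions."""
    flat = indices.reshape((-1,))                     # collapse to 1D
    encoded = mx.nd.one_hot(flat, depth)              # shape (N, depth)
    return encoded.reshape(indices.shape + (depth,))  # (*indices.shape, depth)

labels = mx.nd.array([[0, 2], [1, 3]])
print(one_hot_nd(labels, depth=4).shape)  # (2, 2, 4)
```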

shivarajugowda commented 7 years ago

Update: I have mapped a few more operators (2D convolution/pooling, map_fn, foldl, foldr, etc.) and I am pursuing @tqchen's and @piiswrong's suggestion of "compatibility of the network-graph-generation API (without requiring gradient and scan), with the model-fitting logic swapped out for the module API in mxnet, with API compatibility" for the simple Keras/keras/examples/mnist_mlp.py example. The operators for this example are mapped, and I am working on using the MXNET module API underneath Keras' Model.fit() (see the sketch below).
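Roughly, the idea is that the Keras model graph is lowered to an mx.sym.Symbol and Model.fit() delegates to Module.fit(); a hand-written sketch for the mnist_mlp case (the symbol below is written out manually here and is not the generated graph):

```python
import mxnet as mx

# Hand-built stand-in for the symbol the backend would generate for mnist_mlp.
net = mx.sym.Variable('data')
net = mx.sym.FullyConnected(net, num_hidden=512, name='fc1')
net = mx.sym.Activation(net, act_type='relu', name='relu1')
net = mx.sym.FullyConnected(net, num_hidden=10, name='fc2')
net = mx.sym.SoftmaxOutput(net, name='softmax')

mod = mx.mod.Module(symbol=net,
                    data_names=['data'],
                    label_names=['softmax_label'],
                    context=mx.cpu())  # or mx.gpu(0), or a list for multi-GPU

# Keras' Model.fit() would then hand its data iterator to something like:
# mod.fit(train_iter, optimizer='sgd',
#         optimizer_params={'learning_rate': 0.01},
#         num_epoch=10)
```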

imranshaikmuma commented 7 years ago

Is this still open? What is the progress? Does Keras have an API for MXNET now? I don't see it in the Keras documentation. I like the MXNET context feature! Please let me know if it is available in Keras through the MXNET backend.

shivarajugowda commented 7 years ago

This issue can be closed now; the dmlc folks have a fork of Keras working with MXNET:
https://github.com/dmlc/keras
https://medium.com/@julsimon/apache-mxnet-support-in-keras-83de7dec46e5
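For anyone trying it out: assuming the fork follows stock Keras' backend-selection mechanism, the backend can be switched via the KERAS_BACKEND environment variable (or "backend": "mxnet" in ~/.keras/keras.json):

```python
import os
os.environ['KERAS_BACKEND'] = 'mxnet'  # must be set before importing keras

import keras                           # the dmlc/keras fork
print(keras.backend.backend())         # expected to print 'mxnet'
```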

imranshaikmuma commented 7 years ago

I am getting the following error: [screenshot attachment]