dmlc / MXNet.jl

MXNet Julia Package - flexible and efficient deep learning in Julia

Julia-based definition of new layer #39

Open nowozin opened 8 years ago

nowozin commented 8 years ago

Hi,

I would like to define a new layer similar to dropout, using random number generation. According to https://mxnet.readthedocs.org/en/latest/tutorial/new_op_howto.html it is currently possible to define layers using Python.

Is it possible to define new layers using Julia, and if so, could you provide a minimal example of a Julia-defined layer? (For example, something as simple as scaling the input.)

Thanks, Sebastian
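For context, the contract that such a custom layer must satisfy (a forward pass plus a matching backward pass) can be sketched in plain Python. This is purely illustrative and is not the MXNet custom-op API; the class and method names here are made up:

```python
class ScaleLayer:
    """Toy custom layer: scales its input elementwise by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def forward(self, x):
        # y_i = factor * x_i
        return [self.factor * v for v in x]

    def backward(self, grad_output):
        # Chain rule for y = factor * x gives dL/dx_i = factor * dL/dy_i
        return [self.factor * g for g in grad_output]

layer = ScaleLayer(3.0)
print(layer.forward([1.0, 2.0]))   # -> [3.0, 6.0]
print(layer.backward([1.0, 1.0]))  # -> [3.0, 3.0]
```

Any custom-op mechanism, in Python or Julia, ultimately has to let user code supply these two functions and have the engine call back into them.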

pluskid commented 8 years ago

@nowozin Unfortunately it is not currently possible. The Julia GC is not thread safe, which makes things a lot more complicated than in the Python case. We are still trying to come up with a solution.

nowozin commented 8 years ago

Thank you for the quick reply and clarification.

A related question regarding the mx.randn functionality: I would like to generate random noise within a neural network, ideally so that the output of a SymbolicNode is random each time it is evaluated. Would I have to write a new layer for this, or is there already functionality to have such random nodes? It seems that mx.randn is a low-level interface that cannot be used when setting up a symbolic computation graph.

(This would be useful for variational Bayes neural network training objectives, e.g. (Blundell et al., 2015) and also described here.)
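As a sketch of why fresh noise is needed per evaluation in that setting: in (Blundell et al., 2015) each weight is sampled by reparameterization, w = mu + softplus(rho) * eps with eps ~ N(0, 1). A minimal pure-Python illustration (the function name is mine, not MXNet's):

```python
import math
import random

def sample_weight(mu, rho, eps=None):
    """One reparameterized weight sample: w = mu + softplus(rho) * eps.

    softplus keeps the standard deviation positive; eps ~ N(0, 1) is the
    external noise that a random symbolic node (or noise fed in as input
    data) would have to supply anew on every evaluation."""
    if eps is None:
        eps = random.gauss(0.0, 1.0)
    sigma = math.log1p(math.exp(rho))  # softplus(rho), always > 0
    return mu + sigma * eps
```

The gradient then flows through mu and rho while the randomness stays in eps, which is exactly why eps must come from outside the symbolic graph.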

vchuravy commented 8 years ago

@nowozin I am working on this; the (not yet functional) PR is at #16.

pluskid commented 8 years ago

There is currently no random-number-generator symbolic node. A workaround is to treat the random numbers as input data and generate them from a data provider. For example, the LSTM char-rnn example uses a customized data provider to generate all-zero matrices.
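That workaround can be sketched as a plain-Python generator standing in for a data provider (illustrative only; the names are not MXNet.jl API). Each drawn batch is a fresh noise sample, so the network sees new randomness on every iteration:

```python
import random

def noise_batches(batch_size, dim, num_batches, seed=0):
    """Yield `num_batches` batches of N(0, 1) noise, each of shape
    (batch_size, dim), as nested lists.

    Instead of a random symbolic node inside the graph, the noise is
    treated as an extra named input and supplied by the data provider."""
    rng = random.Random(seed)
    for _ in range(num_batches):
        yield [[rng.gauss(0.0, 1.0) for _ in range(dim)]
               for _ in range(batch_size)]
```

Seeding the generator makes runs reproducible while still giving a different sample per batch.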

nowozin commented 8 years ago

Thank you so much for the quick and informative replies, much appreciated.

Andy-P commented 8 years ago

@pluskid I would like to implement a custom layer: an MDN (mixture density network) loss layer. There are two examples using Python to define layers: one in pure Python, the other using NDArray (MXRtc). Is it possible to do the second of those (i.e. using NDArray) from Julia?
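For reference, the MDN loss itself is the negative log-likelihood of a Gaussian mixture whose parameters the network predicts. A minimal 1-D pure-Python sketch, not tied to any MXNet API:

```python
import math

def mdn_nll(pi, mu, sigma, y):
    """Negative log-likelihood of a 1-D Gaussian mixture (MDN loss sketch).

    pi, mu, sigma are per-component lists (pi sums to 1, sigma > 0);
    y is the observed target value."""
    likelihood = sum(
        p * math.exp(-0.5 * ((y - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for p, m, s in zip(pi, mu, sigma))
    return -math.log(likelihood)
```

In practice this is computed in log-space with a log-sum-exp over components for numerical stability; the direct form above is only for clarity.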

vchuravy commented 8 years ago

Yes, with NDArray it should be possible. I have just been quite busy the last two weeks and haven't gotten any coding done. Getting this to work is at the top of my priority list for the weekend.

Andy-P commented 8 years ago

Any progress on using NDArray + Julia to define layers?

vchuravy commented 8 years ago

I made progress today, but it still has some way to go; help is always welcome.

Andy-P commented 8 years ago

@vchuravy Three questions:

  1. Is this the branch you're using to work on this? https://github.com/vchuravy/MXNet.jl/tree/vc/nativeops/src
  2. How can I help push this along?
  3. Is there anything external holding it up, e.g. does the MXNet team need to make some change to the main library?
vchuravy commented 8 years ago
  1. Yes that is the branch I have been working on
  2. The branch has a small example in it; the best thing to do is to execute that, see it crash, and then look into where the issues are coming from.
  3. There is one immediate issue that comes to mind, and that is the lifetime of objects on the MXNet side of things. Everything that is passed to Julia needs to be allocated on the heap and then freed later on; otherwise we will access data that is no longer valid. Python gets away with this because everything is synchronous.

I will see if I can devote some more time to this, but work gets in the way right now.

vchuravy commented 8 years ago

I hope that I can make another push for this during the coming week. There are a few issues I have a handle on and one that I am not sure about. It requires another serious push, so for me it is a matter of time. But I am in a similar position to you.

Keep in mind that this only gets us a CPU implementation. I haven't looked at how the Python interface handles the GPU case (I know that it does), so that might be something worthwhile to check out.

On Mon, 15 Feb 2016, 00:01 Andre Pemmelaar notifications@github.com wrote:

@vchuravy @pluskid

I spent some time this past week trying to debug this. Unfortunately, my lack of experience with C++ has made progress very slow.

I have spent a fair amount of time testing MXNet.jl and it is easily the best solution for Julia + Deep Learning if you can get away with using the current native loss functions. Unfortunately for me, this is not the case. So this issue of a custom layer has become THE issue for me. It is basically the only thing standing in the way of my using MXNet.jl for serious work.

So I need to ask: is this something that will likely take a long time to solve, either because it is actually quite difficult or because no one with the skills has the time? Or is it one of those things that just needs a little extra love? I really can't tell.

If it seems unlikely to be solved soon, that is fine. I will look for another solution and come back to MXNet later. On the other hand, if it is just around the corner, I will hold off investing time in a different framework and instead spend my time contributing where I can.


CarloLucibello commented 8 years ago

+1 for this, I'm really looking forward to it.