EderSantana / seya

Bringing up some extra Cosmo to Keras.

Fix DRAW example #29

Open brunodoamaral opened 8 years ago

brunodoamaral commented 8 years ago

The DRAW example seems to use an old Keras API. I tried to update it (I was planning a PR), but I was unable to understand how the code below works:

model = Graph()
model.add_input(name='input', input_shape=(1, 28, 28))
model.add_input(name='noise', input_shape=(n_steps, z_dim))
model.add_node(draw, name='draw', inputs=['input', 'noise'], merge_mode='join')

I couldn't figure out what the parameter merge_mode='join' does (the only valid modes here are 'sum', 'mul', 'concat', 'ave', 'cos' or 'dot').

I tried 'concat', but got the error Exception: "concat" mode can only merge layers with matching output shapes except for the concat axis. Layer shapes: [(None, 1, 28, 28), (None, 64, 100)]

Any plans to update this example? Or if you have any instructions on how to fix I'll be glad to help. Thanks!
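For context, the shape mismatch in the error above can be illustrated in plain NumPy (hypothetical batch size of 2; the real Keras batch dimension is `None`):

```python
import numpy as np

# Shapes from the error message: image input (None, 1, 28, 28) and
# noise input (None, 64, 100).
img = np.zeros((2, 1, 28, 28))
noise = np.zeros((2, 64, 100))

# "concat" requires every axis except the concat axis to match; here the
# two tensors do not even have the same number of dimensions, so it fails.
try:
    merged = np.concatenate([img, noise], axis=1)
except ValueError:
    merged = None

print(merged)  # None: the raw shapes cannot be concatenated directly
```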

brunodoamaral commented 8 years ago

I forgot to mention the library versions: Keras 1.0.2, Theano 0.9.0dev0.dev-beefa9396a0e089f6b35b43f71c05557b1083515.

EderSantana commented 8 years ago

Hi @brunodoamaral, thanks for helping Seya. I am updating Seya in the keras1 branch; you can contribute your changes there: https://github.com/EderSantana/seya/tree/keras1

"join" was an old merge mode; it simply put the two elements together in a list. You can try to flatten both vectors and "concat" them, but then you would have to slice and reshape inside the DRAW layer. If that doesn't seem too bad to you, it is probably the easiest way to do it.
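The flatten-then-concat workaround can be sketched in NumPy (hypothetical batch size; the slicing/reshaping would live inside the DRAW layer's code):

```python
import numpy as np

batch, n_steps, z_dim = 2, 64, 100
img = np.arange(batch * 1 * 28 * 28, dtype="float32").reshape(batch, 1, 28, 28)
noise = np.arange(batch * n_steps * z_dim, dtype="float32").reshape(batch, n_steps, z_dim)

# Flatten each input to (batch, features); "concat" then works on axis 1.
merged = np.concatenate([img.reshape(batch, -1),
                         noise.reshape(batch, -1)], axis=1)
print(merged.shape)  # (2, 7184): 1*28*28 + 64*100 features per sample

# Inside the layer, slice the merged vector apart and restore the shapes.
img_size = 1 * 28 * 28
img_part = merged[:, :img_size].reshape(batch, 1, 28, 28)
noise_part = merged[:, img_size:].reshape(batch, n_steps, z_dim)
```

Both inputs round-trip exactly, so the layer sees the same tensors it would have received via "join".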

The original motivation for "join" was that Keras did not let us update the random number generators that go inside Theano scan loops. I don't know if that is still the case. If we can pass the Theano scan updates to the compile function, there is no need to pass the random numbers as the "noise" input.

brunodoamaral commented 8 years ago

Hi @EderSantana! I'm tackling this issue via your second option (I'm using theano.tensor.shared_randomstreams.RandomStreams to generate the normal samples). This way, the input of the DRAW layer would be a simple image batch of shape (channels, width, height).
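RandomStreams is Theano-specific, but the idea can be sketched in plain NumPy (hypothetical names; the sampling step is the standard DRAW/VAE reparameterization): the layer draws its own normal samples, so "noise" no longer needs to be a graph input.

```python
import numpy as np

rng = np.random.RandomState(0)  # stand-in for Theano's RandomStreams

def sample_z(mean, log_sigma):
    """Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1).

    In the Theano version, eps would come from RandomStreams.normal(...)
    inside the scan loop instead of from a separate 'noise' input.
    """
    eps = rng.normal(size=mean.shape)
    return mean + np.exp(log_sigma) * eps

batch, z_dim = 2, 100
z = sample_z(np.zeros((batch, z_dim)), np.zeros((batch, z_dim)))
print(z.shape)  # (2, 100)
```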

But the most difficult part is fitting it into the new Keras 1 API. There are a lot of changes (I have already made some of them, but I'm stuck in a Theano "shape hell"). Once I get out of it, I'll push the changes to my fork so you can get an idea of the modifications.

brunodoamaral commented 8 years ago

Hi @EderSantana! Just an update on the issue I opened: I made some progress on the DRAW layer, though it is not yet functional. I tried to refactor it to avoid accessing the internal states of the encoder/decoder (via methods like _get_rnn_input and _get_rnn_state), but was unable to. I read a few discussions on the Keras GitHub, some of them with your participation, and noticed that the API still cannot express more complex operations on RNNs. We probably do need to access the internals of GRU/LSTM to make the DRAW layer work.

EderSantana commented 8 years ago

Back when I first wrote Neural Turing Machines, I kept a copy of the LSTM in the file. Maybe we can just do that here.
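Keeping a local copy of the LSTM exposes the step function directly to the DRAW code. A minimal NumPy sketch of one LSTM step (hypothetical weight layout with gates packed as [i, f, g, o]; not Seya's actual implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (in_dim, 4*dim), U: (dim, 4*dim), b: (4*dim,)."""
    dim = h_prev.shape[-1]
    z = x.dot(W) + h_prev.dot(U) + b
    i = sigmoid(z[:, :dim])              # input gate
    f = sigmoid(z[:, dim:2 * dim])       # forget gate
    g = np.tanh(z[:, 2 * dim:3 * dim])   # candidate cell state
    o = sigmoid(z[:, 3 * dim:])          # output gate
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

batch, in_dim, dim = 2, 5, 4
rng = np.random.RandomState(0)
h, c = lstm_step(rng.rand(batch, in_dim),
                 np.zeros((batch, dim)), np.zeros((batch, dim)),
                 rng.rand(in_dim, 4 * dim), rng.rand(dim, 4 * dim),
                 np.zeros(4 * dim))
print(h.shape, c.shape)  # (2, 4) (2, 4)
```

With the step exposed like this, a DRAW-style layer can call it from its own scan loop without reaching into Keras internals.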