maxhodak / keras-molecules

Autoencoder network for learning a continuous representation of molecular structures.
MIT License

Try Gomez-Bombarelli's TerminalGRU #42

Open: pechersky opened this issue 7 years ago

pechersky commented 7 years ago

I was looking to see whether the paper's authors had any draft code for the model they describe, to check whether they did anything differently. I found that RGB had implemented some new RNN layers they might have used for the decoder. I don't know whether they ended up using a conventional GRU or this new TerminalGRU instead. Might be worth trying out:

https://github.com/fchollet/keras/compare/master...rgbombarelli:master#diff-3118e4e28157032506f771f279a551c3R639
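In case it helps anyone who wants to try it, here is a rough sketch of where such a layer could slot into this repo's decoder. This is my own guess, not the authors' code: the layer sizes follow keras-molecules' current defaults, and the TerminalGRU import path and constructor arguments are assumptions about the fork, so check them against the diff above.

```python
# Hypothetical sketch of where the swap would go in this repo's decoder
# (sizes follow keras-molecules' defaults; the TerminalGRU import path and
# constructor arguments are guesses about the fork, not verified).
from keras.layers import Input, Dense, RepeatVector
from keras.layers.recurrent import GRU
# from keras.layers.recurrent import TerminalGRU   # only exists in rgbombarelli's fork
from keras.layers.wrappers import TimeDistributed
from keras.models import Model

MAX_LEN, CHARSET_LEN, LATENT_DIM = 120, 35, 292

latent = Input(shape=(LATENT_DIM,))
h = RepeatVector(MAX_LEN)(latent)
h = GRU(501, return_sequences=True)(h)
h = GRU(501, return_sequences=True)(h)

# Current decoder head: a third GRU plus a per-timestep softmax.
h = GRU(501, return_sequences=True)(h)
decoded = TimeDistributed(Dense(CHARSET_LEN, activation='softmax'))(h)

# Hypothetical replacement of that head with the fork's layer (args are a guess):
# decoded = TerminalGRU(CHARSET_LEN, return_sequences=True, activation='softmax')(h)

decoder = Model(latent, decoded)
decoder.summary()
```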

maxhodak commented 7 years ago

Hey, I'm finally back in CA and have fixed my DL machine. This is an interesting idea... I'll get to this in the next couple of days!

hsiaoyi0504 commented 7 years ago

Maybe this is what we're after: fchollet/keras#694 and fchollet/keras#3947.

rgbombarelli commented 7 years ago

That layer is quite helpful: each neuron gets to see not only the probability distribution but also which character actually got sampled. As a perverse case, say the model gives 51-49 and 49-51 odds of opening a bracket "(" at two successive characters; without teacher forcing, sampling each position independently, you end up with "((" about 25% of the time.
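To make the arithmetic concrete (my own illustration, not part of the thread): sampling the two positions independently yields "((" about 0.51 × 0.49 ≈ 25% of the time, while a decoder that sees what the first step actually emitted can suppress a second "(". The conditional probabilities in the snippet below are invented for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)
n = 100000

# Without feedback: both positions are sampled from their own marginals.
first = rng.rand(n) < 0.51   # P("(") at character 1
second = rng.rand(n) < 0.49  # P("(") at character 2
print("P('((') with independent sampling:", (first & second).mean())  # ~0.25

# With feedback (teacher forcing at train time, sampled feedback at test time),
# step 2 conditions on what step 1 emitted; the 0.0 / 0.98 values are made up.
p_second = np.where(first, 0.0, 0.98)
second_fb = rng.rand(n) < p_second
print("P('((') with feedback:", (first & second_fb).mean())  # ~0.0
```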

It's essentially teacher forcing (I didn't know it was called that when I wrote the TerminalGRU code). Here is a modern way to do it. seq2seq also has many useful tools around RNNs and might be a good place to start from. We also released the VAE from the paper here, in case it's useful.
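For anyone wanting to try the same idea without the custom layer, a minimal teacher-forcing decoder along standard Keras seq2seq lines might look like the sketch below. This is my own illustration, assuming a Keras 2-style API; the layer sizes are placeholders, and it is not the paper's released code.

```python
# Illustrative teacher-forcing decoder (not the paper's code; assumes Keras 2-style API).
# During training the decoder also receives the target sequence shifted right by one
# character, so each timestep conditions on the true previous character rather than
# on whatever it happened to sample itself.
from keras.layers import Input, Dense, GRU, RepeatVector, Concatenate, TimeDistributed
from keras.models import Model

MAX_LEN, CHARSET_LEN, LATENT_DIM = 120, 35, 292  # placeholder sizes

latent = Input(shape=(LATENT_DIM,), name='latent')
shifted_targets = Input(shape=(MAX_LEN, CHARSET_LEN), name='shifted_targets')

# Broadcast the latent code across time and append the true previous character.
z_seq = RepeatVector(MAX_LEN)(latent)
decoder_in = Concatenate(axis=-1)([z_seq, shifted_targets])

h = GRU(256, return_sequences=True)(decoder_in)
probs = TimeDistributed(Dense(CHARSET_LEN, activation='softmax'))(h)

decoder = Model([latent, shifted_targets], probs)
decoder.compile(optimizer='adam', loss='categorical_crossentropy')

# Training: shift the one-hot targets right by one step (start token in slot 0), e.g.
#   shifted = np.concatenate([np.zeros_like(y[:, :1]), y[:, :-1]], axis=1)
#   decoder.fit([z_batch, shifted], y)
```

At sampling time there is no ground truth to feed back, so generation proceeds one character at a time, feeding each sampled character into the next step; building that feedback into the layer itself is, as I read the description above, what TerminalGRU does.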

hsiaoyi0504 commented 7 years ago

Wow, @rgbombarelli, is that the official release of the implementation from your original paper?