maxhodak / keras-molecules

Autoencoder network for learning a continuous representation of molecular structures.

Incorporate Grammar VAE #62

Open pechersky opened 7 years ago

pechersky commented 7 years ago

A new paper came out: https://arxiv.org/abs/1703.01925

They implemented a VAE that encodes and decodes SMILES strings as sequences of production rules from a context-free grammar, which is a pretty good approximation of SMILES syntax. They report improved results over the character-based (GB) SMILES VAE. This would address several of the issues that have been filed here, such as #31 and #54.
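For anyone unfamiliar with the approach, here is a minimal sketch of how a SMILES string can be parsed into the sequence of production rules that the Grammar VAE actually autoencodes, using nltk (which I believe their repo also uses). The toy grammar below is hypothetical and far smaller than their zinc_grammar:

```python
import nltk

# Hypothetical toy grammar covering only a few atoms; the real zinc_grammar
# in mkusner/grammarVAE has many more nonterminals and rules.
toy_grammar = nltk.CFG.fromstring("""
smiles -> chain
chain -> atom chain | atom
atom -> 'C' | 'N' | 'O'
""")

parser = nltk.ChartParser(toy_grammar)
tokens = list("CCO")  # ethanol, tokenized one character per atom

# Parse the string and flatten the parse tree into its production-rule
# sequence; the Grammar VAE one-hot encodes this sequence instead of the
# raw character sequence.
tree = next(parser.parse(tokens))
for production in tree.productions():
    print(production)
```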

The Grammar VAE code is at https://github.com/mkusner/grammarVAE. In terms of the actual model code, it looks a lot like the VAE code we already have. Incorporating the zinc_grammar and the masking shouldn't be too difficult -- just some work to implement the masking in Theano (rough sketch below).
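To make the masking idea concrete, here is a rough Theano sketch of the constrained-decoding step, assuming a precomputed binary matrix that maps each nonterminal to the production rules that can expand it. The names (`masks`, `logits`, `lhs`) and the tiny example matrix are mine, not theirs:

```python
import numpy as np
import theano
import theano.tensor as T

# Hypothetical binary matrix of shape (n_nonterminals, n_rules):
# masks[i, j] == 1 iff production rule j expands nonterminal i.
masks_np = np.array([[1, 1, 0, 0],    # nonterminal 0 can be expanded by rules 0, 1
                     [0, 0, 1, 1]],   # nonterminal 1 can be expanded by rules 2, 3
                    dtype=theano.config.floatX)
masks = theano.shared(masks_np)

logits = T.matrix('logits')  # (batch, n_rules) unnormalized decoder outputs
lhs = T.ivector('lhs')       # (batch,) index of the nonterminal currently being expanded

# Zero out rules whose left-hand side doesn't match the current nonterminal,
# then renormalize, so invalid productions can never be sampled.
mask = masks[lhs]                     # (batch, n_rules)
exp_logits = T.exp(logits) * mask
probs = exp_logits / (exp_logits.sum(axis=-1, keepdims=True) + 1e-8)

masked_softmax = theano.function([logits, lhs], probs)
```

The same trick is applied at sampling time: the decoder keeps a stack of pending nonterminals and only ever draws from the masked distribution, which is what guarantees syntactically valid output.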

In fact, their code also uses mean rather than sum in the KL loss term, which is relevant to #59.
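For reference, a KL term averaged over the latent dimensions looks roughly like this in Keras backend notation (a sketch, not copied from either codebase; whether mean or sum is the right scaling is exactly what #59 is about):

```python
from keras import backend as K

def kl_loss(z_mean, z_log_var):
    # KL divergence between q(z|x) = N(z_mean, exp(z_log_var)) and N(0, 1),
    # averaged (not summed) over the latent dimensions.
    return -0.5 * K.mean(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
```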