Closed: xuzhang5788 closed this issue 6 years ago
Hmmm, you're referring to z1 on line 48 of model_zinc.py right?:
vae_loss, z1 = self._buildEncoder(x1, latent_rep_size, max_length)
The second argument returned by _buildEncoder should be a sample... Could you share your code?
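(For context: the sample in _buildEncoder comes from the usual VAE reparameterisation trick. A rough Keras-style sketch of that step, with illustrative layer names and sizes rather than the repo's exact code:)

```python
from keras import backend as K
from keras.layers import Dense, Input, Lambda

latent_rep_size = 56  # the "L56" in the pretrained weights filename suggests this size

h = Input(shape=(100,))                # stand-in for the encoder's last hidden layer
z_mean = Dense(latent_rep_size)(h)     # deterministic given the input and the weights
z_log_var = Dense(latent_rep_size)(h)  # likewise deterministic

def sampling(args):
    z_mean_, z_log_var_ = args
    eps = K.random_normal(shape=K.shape(z_mean_))   # fresh noise on every pass
    return z_mean_ + K.exp(z_log_var_ / 2.0) * eps  # z = mu + sigma * eps

# The sampled z below is the kind of tensor that z1 corresponds to.
z = Lambda(sampling)([z_mean, z_log_var])
```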
I used your grammarVAE/encode_decode_zinc.py to try it out.
Ah right, that's because line 84 of grammarVAE/molecule_vae.py returns the mean of the encoder:
return self.vae.encoderMV.predict(one_hot)[0]
If you change that line to:
return self.vae.encoder.predict(one_hot)
Then you will have stochastic encoding that the model is trained with.
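A quick way to check it, assuming the ZincGrammarModel setup from encode_decode_zinc.py (adjust the weights path to your checkout):

```python
import numpy as np
import molecule_vae

# Setup as in encode_decode_zinc.py; the weights path may differ in your checkout.
grammar_weights = "pretrained/zinc_vae_grammar_L56_E100_val.hdf5"
model = molecule_vae.ZincGrammarModel(grammar_weights)

smiles = ["CC(C)(C)c1ccc2occ(CC(=O)Nc3ccccc3F)c2c1"]  # one example ZINC molecule

# Encode the same molecule twice.
z_a = model.encode(smiles)
z_b = model.encode(smiles)

# With self.vae.encoder the two draws should differ;
# with encoderMV.predict(...)[0] (the mean) they were identical.
print(np.allclose(z_a, z_b))  # expect False once encoding is stochastic
```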
Thank you very much.
Because the algorithm is stochastic, I would expect the mean of the encoder to be stochastic, with small variance. But I got a constant encoding. Can you explain this? Thanks a lot.
The encoder mean is not stochastic: for a fixed set of weights, it is fixed for each input. If, however, you sample using the code I showed above, you will sample from a Gaussian with that mean, and your samples will be stochastic.
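To put numbers on it, here is a toy numpy illustration (the latent size and variance are made up, not taken from the model):

```python
import numpy as np

# For one fixed input and fixed weights, the encoder outputs a fixed mean
# and (small) standard deviation; only the noise changes between encodings.
mu = np.linspace(-1.0, 1.0, 56)  # pretend encoder mean for one molecule
sigma = 0.1 * np.ones(56)        # pretend encoder standard deviation

samples = mu + sigma * np.random.randn(10, 56)  # 10 stochastic encodings

print(samples.std(axis=0).mean())               # roughly 0.1: encodings vary...
print(np.abs(samples.mean(axis=0) - mu).max())  # ...but stay near the fixed mean
```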
Thank you so much. Do you have any plans to create a tutorial for people like me who want to use your pretrained models to generate new molecules around an existing molecule (with some added noise), or between two molecules, or among several molecules? I had some difficulty using your code. Many thanks.
Ah sorry I don't have a plan at the moment. But I would be happy to have a look over a blog post about it if you wrote one! Sorry for the confusion. Thanks for your interest in the project!!
In your paper, you said: "For each molecule we encode it 10 times, and we decode each encoding 100 times (as encoding and decoding are stochastic). This results in 1000 decoded molecules for each of the 5000 input molecules."
I understand that decoding is stochastic, but when I encoded a molecule several times, I got the same code z1 every time. With the pretrained weights, I think encoding is not stochastic. Am I right?
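For reference, the protocol quoted above would look roughly like this, assuming the encode/decode API from encode_decode_zinc.py and the stochastic-encoder change discussed earlier in this thread:

```python
import molecule_vae

# Assumed setup, as in encode_decode_zinc.py; adjust the weights path as needed.
model = molecule_vae.ZincGrammarModel("pretrained/zinc_vae_grammar_L56_E100_val.hdf5")
smiles = ["CC(C)(C)c1ccc2occ(CC(=O)Nc3ccccc3F)c2c1"]

decodings = []
for _ in range(10):                        # encode the molecule 10 times
    z = model.encode(smiles)               # stochastic with self.vae.encoder
    for _ in range(100):                   # decode each encoding 100 times
        decodings.extend(model.decode(z))  # decoding is itself stochastic
# 10 * 100 = 1000 decoded molecules for this one input molecule
```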