how to sample from the generative model

Hello,

I'd like to sample a batch of molecules from a pretrained GrammarVAE. Using encode_decode_zinc.py as inspiration, I first load grammar_weights and grammar_model. I then sample from a standard Normal and call the decode function of the grammar model on each sample.

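import numpy as np
import molecule_vae

grammar_weights = "pretrained/zinc_vae_grammar_L56_E100_val.hdf5"
grammar_model = molecule_vae.ZincGrammarModel(grammar_weights)
latent_rep_size = 56  # latent dimension (the L56 in the weights filename)
epsilon_std = 1.0
batch_size = 1000
# Draw latent vectors from the standard Normal prior
prior_z_samples = np.random.normal(loc=0.0, scale=epsilon_std, size=(batch_size, latent_rep_size))
decoded_samples = []
for i in range(batch_size):
    # Add a leading batch dimension of 1 before decoding each latent vector
    decoded_samples.append(grammar_model.decode(prior_z_samples[i][None, :])[0])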

Does this seem like a reasonable way to get samples from the generative model? The code above runs, but the output molecules seem off. For example, when I compare the empirical distribution of QED scores of the sampled molecules with that of the ZINC dataset, the GrammarVAE distribution is highly overdispersed and has a lower average QED score.
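
To make the comparison concrete, here is a rough sketch of how the two score distributions could be summarized. It assumes the QED scores have already been computed elsewhere (for example with RDKit), and that a decoded string that could not be parsed or scored is recorded as None. The function names and the numbers are hypothetical placeholders, not results from the model.

```python
import statistics


def summarize_scores(scores):
    """Return (count, mean, population std) of the scores that are not None."""
    valid = [s for s in scores if s is not None]
    if not valid:
        return 0, float("nan"), float("nan")
    return len(valid), statistics.mean(valid), statistics.pstdev(valid)


def compare_score_distributions(sampled_scores, reference_scores):
    """Summarize how the sampled scores differ from the reference scores."""
    n_s, mean_s, std_s = summarize_scores(sampled_scores)
    _, mean_r, std_r = summarize_scores(reference_scores)
    return {
        # Share of samples that produced a score at all
        "valid_fraction": n_s / len(sampled_scores) if sampled_scores else float("nan"),
        # Negative when the samples score lower on average
        "mean_shift": mean_s - mean_r,
        # Greater than 1 when the samples are more spread out
        "std_ratio": std_s / std_r if std_r else float("nan"),
    }


# Placeholder numbers for illustration only (not real QED values)
sampled = [0.2, None, 0.6, 0.4, None]
reference = [0.7, 0.8, 0.75, 0.65]
report = compare_score_distributions(sampled, reference)
print(report)
```

Reporting the valid fraction separately matters here: if invalid decodes are silently dropped or scored as zero, the mean and spread of the sampled distribution can shift for reasons unrelated to the quality of the valid molecules.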
