Rose-STL-Lab / LIMO

generative model for drug discovery

About Inference #19

Closed by tszslovewanpu 4 months ago

tszslovewanpu commented 4 months ago

The first stage is the training of the VAE, and the second stage is the training of the property predictor.

  1. Should the third stage be viewed as the inference process?
  2. Does LIMO generate molecules in this manner?
     a. Take one input molecule from ZINC250k or MOSES and have LIMO compute its latent z.
     b. LIMO optimizes z for 10 steps.
     c. LIMO decodes the optimized z to generate an optimized molecule.

Thank you very much!

PeterEckmann1 commented 4 months ago

Thanks for your interest!

  1. Yes, I would call the third stage the inference process, because we are no longer adjusting any model weights during that stage. However, we are still performing backpropagation to optimize z, so it might make a bit more sense to call this stage "generation" instead.

  2. That is almost accurate, except for (a) we initialize the latent z from the normal distribution (instead of getting an input molecule from a dataset). z is initialized here: https://github.com/Rose-STL-Lab/LIMO/blob/main/generate_molecules.py#L25

Let me know if you have any more questions.
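
The generation stage described above (sample z from a normal distribution, update only z by gradient steps while every model weight stays frozen, then decode) can be sketched in a few lines. This is a toy illustration, not LIMO's code: `predict_property`, `property_gradient`, `decode`, `generate`, `LATENT_DIM`, and the step size are all hypothetical stand-ins, and a hand-written gradient replaces backpropagation through the real property predictor. The 10 steps mirror the number mentioned in the question.

```python
import random

LATENT_DIM = 4  # hypothetical toy size, not LIMO's latent dimension


def predict_property(z, weights):
    """Toy frozen property predictor: higher is better, peaks at z == weights."""
    return -sum((zi - wi) ** 2 for zi, wi in zip(z, weights))


def property_gradient(z, weights):
    """Analytic gradient of predict_property w.r.t. z (stand-in for backprop)."""
    return [-2.0 * (zi - wi) for zi, wi in zip(z, weights)]


def decode(z):
    """Toy decoder stand-in: maps each latent coordinate to one token."""
    vocab = ["C", "N", "O", "F"]
    return "".join(vocab[int(abs(zi) * 10) % len(vocab)] for zi in z)


def generate(weights, steps=10, lr=0.1, seed=0):
    rng = random.Random(seed)
    # (a) z starts from the normal distribution, not from a dataset molecule.
    z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    start_score = predict_property(z, weights)
    # (b) Gradient ascent on z only; `weights` is read but never modified.
    for _ in range(steps):
        grad = property_gradient(z, weights)
        z = [zi + lr * gi for zi, gi in zip(z, grad)]
    # (c) Decode the optimized z.
    return decode(z), start_score, predict_property(z, weights)


if __name__ == "__main__":
    molecule, before, after = generate([1.0, -0.5, 0.25, 2.0])
    print(molecule, before, after)
```

Because only z changes inside the loop, the predicted property of the decoded output improves while the "model" (here just the `weights` list) is left untouched, which is why the stage can be called either inference or generation.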

tszslovewanpu commented 4 months ago

Thank you, I still have some doubts.

  i. For each optimization task, LIMO generates 10k molecules in the above manner (starting with z sampled from the normal distribution). Does each task have its own encoder (not used during generation, only for training the decoder), decoder, and property prediction model?
  ii. For the random generation task, does LIMO just sample z and pass it to the decoder to generate molecules?

Sorry to bother you with such a naive question, thanks again!

PeterEckmann1 commented 4 months ago

No problem!

  1. That's right, we have to train the encoder, decoder, and property predictor (although we can reuse the encoder and decoder across different properties). And yes, the encoder is only useful for training the decoder, we don't use it for anything else.
  2. Yes, we just sample z and then decode it.
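
As a rough sketch of the two answers above: random generation only needs a sampled z and the decoder, and a single decoder can be shared while each property gets its own predictor. Everything here is a hypothetical toy (`decode`, `sample_molecules`, `property_predictors`, and the property names are made up for illustration), not LIMO's actual models.

```python
import random

LATENT_DIM = 4  # hypothetical toy size, not LIMO's latent dimension


def decode(z):
    """Toy decoder stand-in, shared by every task."""
    vocab = ["C", "N", "O", "F"]
    return "".join(vocab[int(abs(zi) * 10) % len(vocab)] for zi in z)


def sample_molecules(n, seed=0):
    """Random generation: sample z from a normal distribution, then decode."""
    rng = random.Random(seed)
    return [
        decode([rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)])
        for _ in range(n)
    ]


# One predictor per property (hypothetical names); the decoder above is reused.
property_predictors = {
    "property_a": lambda z: sum(z),
    "property_b": lambda z: -sum(zi * zi for zi in z),
}

if __name__ == "__main__":
    print(sample_molecules(5))
```

No encoder appears anywhere in this sketch, matching the point that it is only needed while training the decoder.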

tszslovewanpu commented 4 months ago

Got it, thank you!