Rose-STL-Lab / LIMO

generative model for drug discovery

About the sampling process #23

Closed tszslovewanpu closed 3 months ago

tszslovewanpu commented 3 months ago

Hello, Peter! Me again~ Here you mentioned:

A. Experiment description and baselines
A.1. Tasks
Random generation of molecules: Generate random molecules by sampling from the latent space

How do you sample from the latent space? Is it like sampling random noise from a normal distribution? Thanks!

PeterEckmann1 commented 3 months ago

Hi, sampling from the latent space just means sampling random Gaussian noise. For example, the optimization process starts from random samples:

https://github.com/Rose-STL-Lab/LIMO/blob/dc55c299010c62a9a8a3b5acdfc86ba50500256d/generate_molecules.py#L25

So if you want to sample random molecules from the latent space, you can skip the optimization step and just use the Gaussian noise.
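As a rough illustration of that idea, here is a minimal sketch using only the Python standard library. It is not LIMO's code: `sample_latents`, `generate_random`, `toy_decode`, and the latent size of 8 are all hypothetical names and values chosen for the example, and the toy decoder only stands in for a trained VAE decoder. In PyTorch the same draw would be `torch.randn(n, latent_dim)`.

```python
import random
from typing import Callable, List, Optional


def sample_latents(n: int, latent_dim: int, seed: Optional[int] = None) -> List[List[float]]:
    """Draw n latent vectors whose entries are i.i.d. standard normal."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)] for _ in range(n)]


def generate_random(
    n: int,
    latent_dim: int,
    decode: Callable[[List[float]], str],
    seed: Optional[int] = None,
) -> List[str]:
    """Random generation: sample z ~ N(0, I) and decode it, with no optimization step."""
    return [decode(z) for z in sample_latents(n, latent_dim, seed)]


def toy_decode(z: List[float]) -> str:
    """Hypothetical stand-in for a trained decoder (NOT LIMO's decoder).

    Maps each latent coordinate to a character so the example runs end to end.
    """
    return "".join("C" if x >= 0 else "N" for x in z)


# Assumed values for illustration only: 4 samples, latent size 8.
molecules = generate_random(n=4, latent_dim=8, decode=toy_decode, seed=0)
print(molecules)
```

To use this with a real model, you would pass the trained decoder in place of `toy_decode` and set `latent_dim` to the model's actual latent size.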

tszslovewanpu commented 3 months ago

Got it~ Thanks bro! ^v^ ❤️

PeterEckmann1 commented 3 months ago

No problem!

tszslovewanpu commented 3 months ago

Hi~ I have another question. I found that the sample size (at inference) for the multi-objective binding affinity maximization task is 100K. Could you please tell me the sample sizes for the logP targeting task and the substructure-constrained logP extremization task? Thank you!

Best wishes, Fan

PeterEckmann1 commented 3 months ago

I believe the sample size for logP targeting is also 100K. For substructure-constrained logP extremization, there aren't really any samples; we just take two ZINC molecules and perform extremization on those two compounds.