microsoft / molecule-generation

Implementation of MoLeR: a generative model of molecular graphs which supports scaffold-constrained generation
MIT License

Optimising latent vectors for objective #64

Closed linminhtoo closed 10 months ago

linminhtoo commented 11 months ago

Hello,

I was wondering if you have any examples of running the swarm optimisation (MSO) over latent vectors for an arbitrary multi-objective optimisation task.

I can't seem to find this in either the README or the code.

Best, Min Htoo

kmaziarz commented 11 months ago

No, sorry, we never got around to fully releasing the MSO component. That being said, we used the original code with minimal modification, mostly just replacing CDDD with MoLeR.

For best results, we had to make one adjustment in what range of the latent space is explored: for CDDD, the original MSO authors used a hypercube (i.e. clipping the latent vector to [-1, 1] on each dimension separately), while for MoLeR we found that using a hyperball (i.e. clipping the norm of the latent vector) worked better, which makes sense as MoLeR was trained with a VAE prior. This is described in Appendix C.3 in our paper. However, results should already be non-trivial even without this modification. I hope this helps!
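To make the distinction concrete, here is a minimal NumPy sketch of the two clipping schemes. The function names are illustrative (not part of MoLeR or the MSO code), and the radius of 10 matches the value discussed later in this thread:

```python
import numpy as np

def clip_hypercube(z, bound=1.0):
    """CDDD-style clipping: each dimension independently clipped to [-bound, bound]."""
    return np.clip(z, -bound, bound)

def clip_hyperball(z, radius=10.0):
    """MoLeR-style clipping: vectors whose norm exceeds `radius` are scaled down;
    vectors already inside the ball are left unchanged."""
    norms = np.linalg.norm(z, axis=-1, keepdims=True)
    scale = np.minimum(norms, radius) / np.maximum(norms, 1e-12)  # guard against 0/0
    return z * scale

z = np.array([[3.0, 4.0], [30.0, 40.0]])  # norms 5 and 50
out = clip_hyperball(z, radius=10.0)
# first row (norm 5) is unchanged; second row (norm 50) is scaled down to norm 10
```

The hyperball variant is the one matched to a VAE prior: a standard Gaussian prior concentrates mass in a ball around the origin rather than in the corners of a hypercube.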

linminhtoo commented 11 months ago

> No, sorry, we never got around to fully releasing the MSO component. That being said, we used the original code with minimal modification, mostly just replacing CDDD with MoLeR.
>
> For best results, we had to make one adjustment in what range of the latent space is explored: for CDDD, the original MSO authors used a hypercube (i.e. clipping the latent vector to [-1, 1] on each dimension separately), while for MoLeR we found that using a hyperball (i.e. clipping the norm of the latent vector) worked better, which makes sense as MoLeR was trained with a VAE prior. This is described in Appendix C.3 in our paper. However, results should already be non-trivial even without this modification. I hope this helps!

Hi Krzysztof, thanks for this; I missed that part of the appendix when reading the paper.

For clarity, could you share some code snippets showing how these two changes were implemented?

Thanks a lot, Min Htoo

kmaziarz commented 11 months ago
> "for all of our optimization experiments we made the MoLeR encoder deterministic by always returning the maximum likelihood latent code z": I encoded the same SMILES twice and compared the output latent vectors; they are identical, so it appears this change has already been made in the code.

Yes, by default we simply return the mean of the latent encoding (see the docstring of the `encode` method in `VaeWrapper`). If one wants to retrieve the full posterior, they can pass `include_log_variances=True`; without this, just the mean is returned.
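As a generic illustration of the difference (this is a NumPy sketch of a Gaussian posterior, not the actual MoLeR API):

```python
import numpy as np

# Suppose an encoder produces a mean and log-variance per latent dimension.
mean = np.array([0.5, -1.0, 2.0])
log_var = np.array([-2.0, -2.0, -2.0])

def encode_deterministic(mean):
    # Maximum-likelihood latent code under a Gaussian posterior: the mean itself.
    return mean

def encode_stochastic(mean, log_var, rng):
    # Reparameterized sample: mean + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(mean.shape)
    return mean + np.exp(0.5 * log_var) * eps

# Deterministic encoding gives identical latent codes across calls:
z1 = encode_deterministic(mean)
z2 = encode_deterministic(mean)
print(np.array_equal(z1, z2))  # True
```

This is why encoding the same SMILES twice yields identical vectors: the deterministic path returns the mean every time.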

> "for MoLeR we clip to a ball of fixed radius R = 10": is this just `self.x = self.x / (np.linalg.norm(self.x, axis=1, keepdims=True) / 10)`?

Almost - your code would project everything onto a sphere of radius 10, whereas we only do clipping, i.e. vectors with too large norm are projected while others are left unchanged. It would have to be something like

```python
norms = np.linalg.norm(self.x, axis=1, keepdims=True)
self.x *= np.minimum(norms, 10.0) / norms
```

That being said, it could just as well be the case that projection is as good as clipping - I can't recall if we tried projecting.
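The distinction can be sketched in a few lines of NumPy (function names are mine; the radius of 10 is the value from the thread):

```python
import numpy as np

def clip_to_ball(x, radius=10.0):
    # Clipping: only vectors with norm > radius are scaled down; others are unchanged.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x * (np.minimum(norms, radius) / norms)

def project_to_sphere(x, radius=10.0):
    # Projection: every vector is rescaled to have norm exactly `radius`.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x * (radius / norms)

x = np.array([[3.0, 4.0], [30.0, 40.0]])  # norms 5 and 50
print(np.linalg.norm(clip_to_ball(x), axis=1))       # norms: 5 and 10
print(np.linalg.norm(project_to_sphere(x), axis=1))  # norms: 10 and 10
```

Clipping preserves points already inside the ball, while projection pushes everything (including near-origin points) out to the boundary, which is why the two could behave differently during optimisation.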

linminhtoo commented 10 months ago

Hey @kmaziarz, I'm very sorry for my late reply; I was on an extended holiday.

Your recommendations worked perfectly, and it's working really well for us. Thanks a lot!

This was actually my last project at my previous startup, so I'm glad it all worked out. I look forward to your future work, as molecular generation is something I'm really passionate about, along with chemical retrosynthesis (having studied chemistry as an undergrad and then worked professionally as an AI engineer). I also just saw that you and your team recently released the Syntheseus repo.

Cheers.