albertma-evotec opened this issue 5 years ago
Hi, I have the same question, and I was wondering if you have solved the problem yet. Looking forward to hearing from you. Many thanks!
Hi, I have the same question too, because I have read parts of the code in gentrl.py.
I am looking forward to hearing from you. Many thanks!
According to one of the authors, no SOM code has been or will be provided.
I have a similar question too
You can try any SOM.
This is a good example in PyTorch: https://github.com/Dotori-HJ/SelfOrganizingMap-SOM
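For anyone who wants to avoid an extra dependency, a basic SOM is small enough to sketch directly in NumPy. This is only a minimal illustration of the algorithm (not the authors' SOM, and not the linked repo's implementation); the grid size, learning-rate decay, and Gaussian neighborhood schedule are arbitrary choices here:

```python
import numpy as np

def train_som(data, grid=(10, 10), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a minimal SOM; returns node weights of shape (gx, gy, dim)."""
    rng = np.random.default_rng(seed)
    gx, gy = grid
    dim = data.shape[1]
    w = rng.normal(size=(gx, gy, dim))
    # Grid coordinates, used to compute the neighborhood function on the map.
    coords = np.stack(
        np.meshgrid(np.arange(gx), np.arange(gy), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            t = step / n_steps
            lr = lr0 * (1 - t)                 # linearly decaying learning rate
            sigma = sigma0 * (1 - t) + 1e-3    # shrinking neighborhood radius
            # Best-matching unit: the node whose weight vector is closest to x.
            d = np.linalg.norm(w - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), (gx, gy))
            # Gaussian neighborhood around the BMU on the 2-D grid.
            g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1)
                       / (2 * sigma ** 2))
            w += lr * g[..., None] * (x - w)
            step += 1
    return w

def bmu_of(w, x):
    """Grid cell (i, j) that a vector x maps to on a trained SOM."""
    d = np.linalg.norm(w - x, axis=-1)
    return np.unravel_index(np.argmin(d), w.shape[:2])
```

After training, well-separated clusters in the input space should land on different grid cells, which is the property the reward discussion below relies on.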
Any progress on this issue? Looking forward to your updates.
Hi, I want to tell everyone in this thread that I have transformed this model into a PyTorch Lightning module for multi-GPU support. Please check it out here.
It will make it more efficient to run on multiple GPUs, and even on a single GPU.
Please check it, and if there are any bugs, raise an issue so that I can improve the code.
@Bibyutatsu Thank you for your implementation. I tried it, but I still need to install pytorch-lightning separately.
I think you are very familiar with this repo, so may I ask you some questions about sampling? In this repo, you generate new molecules from random latent points, but in the paper, the authors showed a parent molecule and then generated new molecules from it. I don't know how to do this: given a parent molecule, I want to generate similar molecules around it. I think you are also aware of chemvae (https://github.com/aspuru-guzik-group/chemical_vae/tree/master/chemvae). They gave an example (https://github.com/aspuru-guzik-group/chemical_vae/blob/master/examples/intro_to_chemvae.ipynb).
Your advice is highly appreciated.
@xuzhang5788 Yeah, generating new molecules from a parent molecule is currently not directly supported. But you can follow these steps:

1. Encode the parent molecule to get its `mean` and `log_stds`.
2. The parent's latent point is described by this `mean` and `log_stds`, so you need to make a custom function in the `LP` module which can take `means` and `log_stds` as input and sample latent points from there.
3. Call `decoder.sample` on these sampled points.

I will try to incorporate this in the code to show you; until then, I hope these pointers can help you.

Also, you can look at this repo. It does exactly what you were trying to do and also uses a VAE like chemvae.
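The steps above can be sketched with plain reparameterization-style sampling. Only the `sample_around` function below is concrete; the commented GENTRL calls are assumptions about the repo's API (`model.enc.encode`, `model.dec.sample`) and should be checked against the actual code:

```python
import numpy as np

def sample_around(means, log_stds, n_samples=10, scale=1.0, seed=0):
    """Sample latent points near a parent: z = mean + scale * std * eps.

    `scale` < 1 keeps samples closer to the parent's latent point, so the
    decoded molecules should stay more similar to the parent.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    stds = np.exp(np.asarray(log_stds, dtype=float))
    eps = rng.standard_normal((n_samples, means.shape[-1]))
    return means + scale * stds * eps

# Hypothetical GENTRL usage (method names are assumptions, check the repo):
#   means, log_stds = model.enc.encode(parent_smiles_batch)
#   z = sample_around(means.detach().numpy(), log_stds.detach().numpy(),
#                     scale=0.5)
#   smiles = model.dec.sample(torch.tensor(z, dtype=torch.float32))
```

With `scale=0.0` every sample collapses onto the parent's latent mean, which is a quick sanity check that the sampling is centered correctly.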
Hi @Bibyutatsu, the link you provided cannot be accessed.
And do you know how to use a SOM to calculate the reward? Perhaps if the generated structures are mapped to the same grid cells as the DDR1 inhibitor molecules, there will be a positive reward; otherwise, a negative reward?
Thank you!
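That matching-cell idea can be written down directly. The actual reward scheme from the paper is not public, so this is only a guess at the mechanism: map each vector to its best-matching unit on a trained SOM, and reward it if that cell is one that known reference molecules (e.g. DDR1 inhibitors) also occupy:

```python
import numpy as np

def som_cell(weights, x):
    """Best-matching unit (grid cell) of vector x on a trained SOM."""
    d = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(d), weights.shape[:2])

def make_som_reward(weights, reference_vectors, pos=1.0, neg=-1.0):
    """Build a reward: `pos` if a point lands in a cell occupied by the
    reference set (e.g. known DDR1 inhibitors), otherwise `neg`."""
    good_cells = {som_cell(weights, v) for v in reference_vectors}
    def reward(x):
        return pos if som_cell(weights, x) in good_cells else neg
    return reward
```

In an RL loop this would score descriptors (or latent vectors) of generated molecules before they are fed into the policy-gradient loss; whether GENTRL used cell membership or something softer (e.g. distance to the nearest "good" cell) is unknown.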
Hi, I am quite new to this area (both Python and AI), so sorry if my questions are too basic. I read that you used six data sets to build the model. According to your paper, the first is "a large set of molecules derived from a ZINC data set". Is it referring to the following in pretrain.ipynb?
But why does the number of compounds in that CSV differ from the figure reported in the supplementary table (Table 1)?
If I am working on another biological target, at what stage do I need to continue training with datasets for that target? Should I repeat the model.train_as_vaelp() method with another train_loader created from my own target-specific dataset?
In the next step, train_rl.ipynb, does it mean it is just an example of training the model to generate compounds with a high penalized logP? I assume it has nothing to do with the SOMs. In your paper, you mentioned that you use three SOMs as reward functions, so do I need to define my own scoring functions here? Is there a specific module I need to install if I want to write my own SOM reward functions?
I am looking forward to hearing from you. Many thanks
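On the last question: no extra module is strictly required for a custom reward; the RL step just needs a callable that scores generated molecules. A hedged sketch of such a wrapper is below (the exact signature expected by `model.train_as_rl` is an assumption here and should be checked against train_rl.ipynb, which passes a penalized-logP function):

```python
def make_reward(score_fn, penalty=-5.0):
    """Wrap a molecule-level scorer into a reward that tolerates
    invalid output (scored with a fixed penalty).

    `score_fn` would be your SOM-based or property-based scorer;
    any callable str -> float works.
    """
    def reward(smiles):
        if not smiles:            # the decoder can emit empty strings
            return penalty
        try:
            return float(score_fn(smiles))
        except Exception:         # e.g. an unparsable SMILES string
            return penalty
    return reward

# Hypothetical use with GENTRL (signature is an assumption):
#   model.train_as_rl(make_reward(my_som_scorer))
```

The try/except guard matters in practice: RL sampling routinely produces strings that a chemistry scorer (e.g. RDKit-based) will reject, and an unguarded exception would abort the whole training loop.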