HUBioDataLab / SELFormer

SELFormer: Molecular Representation Learning via SELFIES Language Models
next step, how to utilize for design #3

Closed: jessielyons closed this issue 1 year ago

jessielyons commented 1 year ago

Hello,

Very nice publication and work. Could you please advise how I would go about using your pre-trained and fine-tuned models to generate new molecules?

Would I just use the pre-trained model and then use another method for design? I have my own dataset of structures and their activities, and I want to generate new molecules. I am very new to the ML field. I really appreciate your guidance, JL

tuncadogan commented 1 year ago

Hi, thank you for your interest in SELFormer. There are several ways to generate new molecules using SELFormer (or another method's molecular embeddings). You could plug our pre-trained model into a transformer decoder and train it to generate molecules. Or you could train a VAE that takes our embeddings as input and generates molecules from the learned latent space. You could also train a GAN or a diffusion model in a similar way. We also have a molecule generation model; please take a look at it here (it does not use SELFormer embeddings, though; it works directly on molecular graphs): https://github.com/HUBioDataLab/DrugGEN
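
To illustrate the VAE option mentioned above, here is a minimal PyTorch sketch. It is not code from the SELFormer repository. It assumes you have already computed one fixed-size embedding per molecule with the pre-trained model, and that each molecule is also available as a sequence of SELFIES token ids. The class `EmbeddingVAE`, the function `vae_loss`, and the sizes (`EMB_DIM = 768`, `VOCAB_SIZE`, `MAX_LEN`) are hypothetical placeholders, and random tensors stand in for real embeddings and tokens.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM = 768     # assumed embedding size; check the checkpoint you actually use
LATENT_DIM = 64   # hypothetical latent size
VOCAB_SIZE = 100  # placeholder; use your SELFIES tokenizer's vocabulary size
MAX_LEN = 32      # placeholder maximum token sequence length


class EmbeddingVAE(nn.Module):
    """Encodes a fixed molecular embedding, decodes a token sequence with a GRU."""

    def __init__(self, emb_dim=EMB_DIM, latent_dim=LATENT_DIM,
                 vocab_size=VOCAB_SIZE, hidden=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.to_mu = nn.Linear(hidden, latent_dim)
        self.to_logvar = nn.Linear(hidden, latent_dim)
        self.z_to_h = nn.Linear(latent_dim, hidden)
        self.tok_emb = nn.Embedding(vocab_size, hidden)
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def encode(self, emb):
        h = self.enc(emb)
        return self.to_mu(h), self.to_logvar(h)

    def reparameterize(self, mu, logvar):
        # Sample z = mu + sigma * eps so gradients flow through mu and logvar.
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z, tokens_in):
        # Teacher forcing: the latent vector sets the GRU's initial hidden state.
        h0 = torch.tanh(self.z_to_h(z)).unsqueeze(0)
        out, _ = self.gru(self.tok_emb(tokens_in), h0)
        return self.out(out)  # shape: (batch, seq_len, vocab_size)

    def forward(self, emb, tokens_in):
        mu, logvar = self.encode(emb)
        z = self.reparameterize(mu, logvar)
        return self.decode(z, tokens_in), mu, logvar


def vae_loss(logits, targets, mu, logvar, beta=1.0):
    # Token-level reconstruction loss plus KL divergence to a standard normal.
    recon = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    kl = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
    return recon + beta * kl


# Random stand-ins for precomputed embeddings and SELFIES token ids.
torch.manual_seed(0)
emb = torch.randn(8, EMB_DIM)
tokens = torch.randint(0, VOCAB_SIZE, (8, MAX_LEN))

model = EmbeddingVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# One training step: predict token t+1 from tokens up to t.
logits, mu, logvar = model(emb, tokens[:, :-1])
loss = vae_loss(logits, tokens[:, 1:], mu, logvar)
opt.zero_grad()
loss.backward()
opt.step()
```

After training on real data, new molecules would be proposed by sampling a latent vector from a standard normal distribution, decoding tokens one step at a time, and converting the resulting SELFIES string back to a structure. With an activity dataset, one common extension (again an assumption, not something described in this thread) is to condition the decoder on the activity label or to search the latent space with a property predictor.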