CDDLeiden / PCMol

Multi-target de novo molecular generator conditioned on AlphaFold's latent protein embeddings.
MIT License

How to generate new molecules using a new protein (not included in the targets.txt file)? #4

Closed: MachineGUN001 closed this issue 1 week ago

MachineGUN001 commented 2 weeks ago

Hi,

Wonderful work!

According to the instructions in the README, can the target in "python pcmol/generate.py --target P29275 --device cpu" only be one of those listed in the data/targets.txt file? If I want to try a new protein that is not in that file, how should I proceed: do I need to train the model myself, or can I use a pre-trained model directly for molecule generation?

Many thanks,

Best,

MachineGUN001 commented 1 week ago

I've found that the embeddings for many targets, even those listed in the targets.txt file, fail to download. For example, for O14744 I get: "Downloading AlphaFold embeddings for O14744 to, Files for protein O14744 could not be downloaded. 404 Client Error: Not Found for url: https://surfdrive.surf.nl/files/index.php/s/Gqy5vPOYJUHVaU7/download?path=%2F&files=O14744.zip." Protein embeddings not found...
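For anyone wanting to check which embeddings are affected, a rough sketch follows. It rebuilds the download URL from the error message above and probes it with a HEAD request. The function names are hypothetical and not part of PCMol, and it is an assumption that the storage server answers HEAD requests the same way it answers the GET that the downloader issues.

```python
import urllib.request
from urllib.parse import urlencode

# Share URL copied from the 404 error message quoted above.
BASE_URL = "https://surfdrive.surf.nl/files/index.php/s/Gqy5vPOYJUHVaU7/download"


def embedding_url(accession: str) -> str:
    """Build the per-protein archive URL in the form seen in the error message."""
    query = urlencode({"path": "/", "files": f"{accession}.zip"})
    return f"{BASE_URL}?{query}"


def is_downloadable(accession: str, timeout: float = 10.0) -> bool:
    """Return True if the server reports HTTP 200 for the archive (hypothetical helper)."""
    request = urllib.request.Request(embedding_url(accession), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP errors such as 404, unreachable hosts, and timeouts.
        return False
```

Calling `is_downloadable("O14744")` for each accession in targets.txt would give a list of the proteins whose archives are currently missing.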

andriusbern commented 1 week ago

Hi,

1) For the time being, usage of the PCMol model is restricted to this particular set of proteins. We did not provide the code for generating embeddings for other targets. The main reason is that it involves modified AlphaFold v2.2.0 code, along with a long setup procedure that we have not tested on different systems.

2) Thanks for bringing this up. There seems to be an issue with the online embedding storage; we are working on fixing it.
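Since generation is limited to the listed proteins, it may help to check an accession against the target list before launching a run. The sketch below is not part of PCMol: the helper names are hypothetical, and it assumes data/targets.txt holds one UniProt accession per line, which should be verified against the actual file.

```python
from pathlib import Path


def load_targets(path: str) -> set[str]:
    """Read accessions from a targets file (assumed format: one accession per line)."""
    lines = Path(path).read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def generate_command(accession: str, targets: set[str], device: str = "cpu") -> list[str]:
    """Build the README's generate command, refusing accessions outside the target list."""
    if accession not in targets:
        raise ValueError(
            f"{accession} is not in the target list; no embeddings are provided for it"
        )
    return ["python", "pcmol/generate.py", "--target", accession, "--device", device]
```

For example, `generate_command("P29275", load_targets("data/targets.txt"))` returns an argument list that can be passed to `subprocess.run`, provided P29275 is present in the file.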