Open amkrishnan28 opened 1 month ago
pretrained_vae_model.pt should be model.pt file.
I get the same error replacing
pretrained_vae_model.pt should be model.pt file.
I believe something is wrong with the SMILES recognition module. In my case, I also got a lot of error messages about SMILES; it seems that it doesn't recognize some special atoms. Eventually I did get the output, but it is not exactly the same as Supplementary Fig 6.
How did you do this?
Hello,
This is the standard output sent to the terminal by the 'polygon generate' command with the "--debug" flag set. This is a very verbose output setting, detailing the process of generation.
The errors you are seeing are produced by the "rdkit.Chem.MolFromSmiles" function, which parses the SMILES strings generated by sampling the latent space of the chemical embedding. This latent space is not perfectly "continuous"; rather, there are some positions in the chemical embedding that cannot be translated into valid SMILES strings. RDKit is reporting every generated structure whose decoded SMILES string could not be parsed into a valid molecule.
However, I would not expect this to terminate the generation run. Did this command produce output files? Did you inspect them?
If you would rather not see these SMILES parsing errors, try changing "--debug" to "--verbose".
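To illustrate the point about unparseable strings, here is a minimal sketch of how decoded SMILES can be separated into parseable and unparseable ones. It is not POLYGON's own code: the helper name split_valid and its parse argument are made up for illustration. It relies only on the fact that "rdkit.Chem.MolFromSmiles" returns None when it cannot parse a string, and it uses RDKit's RDLogger.DisableLog to silence the parser messages.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def split_valid(
    smiles: Iterable[str],
    parse: Optional[Callable[[str], object]] = None,
) -> Tuple[List[str], List[str]]:
    """Split SMILES strings into (parseable, unparseable).

    `parse` is a hypothetical hook; by default RDKit's MolFromSmiles is used,
    which returns None for strings it cannot turn into a molecule.
    """
    if parse is None:
        # Imported lazily so the helper can be tried without RDKit installed.
        from rdkit import Chem, RDLogger

        # Silence RDKit's parser messages (note: this affects the whole process).
        RDLogger.DisableLog("rdApp.*")
        parse = Chem.MolFromSmiles

    valid, invalid = [], []
    for s in smiles:
        if parse(s) is not None:
            valid.append(s)
        else:
            invalid.append(s)
    return valid, invalid
```

With RDKit installed, calling split_valid(list_of_decoded_smiles) would return the two lists; a large second list is expected when sampling a latent space and does not by itself mean the run failed.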
Best, Brenton
Hi, I'm trying to run the command:
polygon generate \
    --model_path ../data/pretrained_vae_model.pt \
    --scoring_definition scoring_definition.csv \
    --max_len 100 \
    --n_epochs 200 \
    --mols_to_sample 8192 \
    --optimize_batch_size 512 \
    --optimize_n_epochs 2 \
    --keep_top 4096 \
    --opti gauss \
    --outF molecular_generation \
    --device cpu \
    --save_payloads \
    --n_jobs 4 \
    --debug
The first three commands work, but when I try to run this command, I get this error:
SMILES Parse Error: syntax error while parsing: O=C(c1cn(-c2ccccc2)nc1-c1cccc(F)c1)N1CCC(C2)c2cc(O)c(O)c(C)c2C1
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'O=C(c1cn(-c2ccccc2)nc1-c1cccc(F)c1)N1CCC(C2)c2cc(O)c(O)c(C)c2C1' for input: 'O=C(c1cn(-c2ccccc2)nc1-c1cccc(F)c1)N1CCC(C2)c2cc(O)c(O)c(C)c2C1'
[11:46:29] SMILES Parse Error: syntax error while parsing: Cc1ccc(CNC23CC4CC(CC(C4)C2)C3)cc1
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'Cc1ccc(CNC23CC4CC(CC(C4)C2)C3)cc1' for input: 'Cc1ccc(CNC23CC4CC(CC(C4)C2)C3)cc1'
[11:46:29] SMILES Parse Error: syntax error while parsing: Cc1ccc(CNCCc2ccccc2C)nc1
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'Cc1ccc(CNCCc2ccccc2C)nc1' for input: 'Cc1ccc(CNCCc2ccccc2C)nc1'
[11:46:29] Can't kekulize mol. Unkekulized atoms: 17 18 19 20 25
[11:46:29] SMILES Parse Error: syntax error while parsing: COc1ccc2cc(C(=O)N3CCC(C(O)=NCc4cccc(C)c4)CC3)c(O)nc2c1
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'COc1ccc2cc(C(=O)N3CCC(C(O)=NCc4cccc(C)c4)CC3)c(O)nc2c1' for input: 'COc1ccc2cc(C(=O)N3CCC(C(O)=NCc4cccc(C)c4)CC3)c(O)nc2c1'
[11:46:29] SMILES Parse Error: syntax error while parsing: CC(=O)Nc1nc(C2CC2)c(C)c(-c2cccnc2Oc2ccc(C)cc2C)[nH]1
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'CC(=O)Nc1nc(C2CC2)c(C)c(-c2cccnc2Oc2ccc(C)cc2C)[nH]1' for input: 'CC(=O)Nc1nc(C2CC2)c(C)c(-c2cccnc2Oc2ccc(C)cc2C)[nH]1'
[11:46:29] SMILES Parse Error: syntax error while parsing: Cc1ccc([S]CCN2CCCCC2)cc1S(=O)(=O)N1CCCCCC1
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'Cc1ccc([S]CCN2CCCCC2)cc1S(=O)(=O)N1CCCCCC1' for input: 'Cc1ccc([S]CCN2CCCCC2)cc1S(=O)(=O)N1CCCCCC1'
[11:46:29] SMILES Parse Error: syntax error while parsing: CC(=O)Nc1ccccc1N1CCN(c2nc(Nc3ccccc3C)nc(N3CCCCC3)n2)CC1
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'CC(=O)Nc1ccccc1N1CCN(c2nc(Nc3ccccc3C)nc(N3CCCCC3)n2)CC1' for input: 'CC(=O)Nc1ccccc1N1CCN(c2nc(Nc3ccccc3C)nc(N3CCCCC3)n2)CC1'
[11:46:29] SMILES Parse Error: unclosed ring for input: 'COc1cc2c(cc1OC)C1=CC(O)=C3C(=O)CCC21'
[11:46:29] SMILES Parse Error: unclosed ring for input: 'Cc1ccc2nc(C)c3c(c2c1)C(=CCSc1nnc(C)s1)CC(C)(C)NC(=N)C2'
[11:46:29] SMILES Parse Error: syntax error while parsing: O=C(NCCc1cccc(C)c1)Nc1ccc2nc(C(F)(F)F)no2c1
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'O=C(NCCc1cccc(C)c1)Nc1ccc2nc(C(F)(F)F)no2c1' for input: 'O=C(NCCc1cccc(C)c1)Nc1ccc2nc(C(F)(F)F)no2c1'
[11:46:29] SMILES Parse Error: unclosed ring for input: 'COc1ccc(C2=C(C#N)C(c3c(-c4ccccc5)c[nH]c4c3C(C)CN3C(=O)C3CCC4C3C)CC2(C)C)cc1OC'
[11:46:29] SMILES Parse Error: syntax error while parsing: O=C1c2cc3c(cc2C(=O)N1C(=O)c1cccc(C)c1C)NC1CC3N1
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'O=C1c2cc3c(cc2C(=O)N1C(=O)c1cccc(C)c1C)NC1CC3N1' for input: 'O=C1c2cc3c(cc2C(=O)N1C(=O)c1cccc(C)c1C)NC1CC3N1'
[11:46:29] SMILES Parse Error: syntax error while parsing: Bc1cnc2ccc(N3CCNCC3)nn12
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'Bc1cnc2ccc(N3CCNCC3)nn12' for input: 'Bc1cnc2ccc(N3CCNCC3)nn12'
[11:46:29] SMILES Parse Error: syntax error while parsing: O=C(Nc1ccc(C)cc1)OCC1CN(c2ccncc2O)CCO1
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'O=C(Nc1ccc(C)cc1)OCC1CN(c2ccncc2O)CCO1' for input: 'O=C(Nc1ccc(C)cc1)OCC1CN(c2ccncc2O)CCO1'
[11:46:29] SMILES Parse Error: syntax error while parsing: CCOc1ccc(CNC(=O)C2CC2)cc1NC(=O)CC(=O)Nc1cc(C)c(OCC)cc1F
[11:46:29] SMILES Parse Error: Failed parsing SMILES 'CCOc1ccc(CNC(=O)C2CC2)cc1NC(=O)CC(=O)Nc1cc(C)c(OCC)cc1F' for input: 'CCOc1ccc(CNC(=O)C2CC2)cc1NC(=O)CC(=O)Nc1cc(C)c(OCC)cc1F'
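To get a sense of how many decoded strings failed and for which reason, a log like the one above could be tallied with a short script. This is a minimal sketch using only the Python standard library; the function name tally_rdkit_errors and the category labels are made up for illustration, and the patterns cover only the three message types visible in this log.

```python
import re
from collections import Counter

# One pattern per primary failure message seen in the log above.
# The follow-up "Failed parsing SMILES ..." lines are deliberately not counted,
# since each one repeats the syntax error reported on the line before it.
PATTERNS = {
    "syntax error": re.compile(r"SMILES Parse Error: syntax error while parsing"),
    "unclosed ring": re.compile(r"SMILES Parse Error: unclosed ring"),
    "kekulization": re.compile(r"Can't kekulize mol"),
}


def tally_rdkit_errors(lines):
    """Count log lines per failure category (labels are illustrative)."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break  # a line belongs to at most one category
    return counts
```

Feeding it the lines of a saved log file, for example via open(path) as the iterable, gives a per-category count that can be compared against the number of molecules sampled.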
Here is my scoring_definition.csv:
category,name,minimize,mu,sigma,file,model,n_top,agg
qed,qed,False,0.67,0.1,,,,
sa,sa,True,3,0.5,,,,
latent_distance,MTOR_vae_dist,True,1.5,0.5,../data/P42345_ligand_smiles_filtered.txt,../data/pretrained_vae_model.pt,20.0,mean
latent_distance,MEK1_vae_dist,True,1.5,0.5,../data/Q02750_ligand_smiles_filtered.txt,../data/pretrained_vae_model.pt,20.0,mean
ligand_efficiency,MTOR_le,False,0.8,0.3,../data/P42345_ligand_binding.pkl,,,
ligand_efficiency,MTEK1_le,False,0.8,0.3,../data/Q02750_ligand_binding.pkl,,,
Could it be something to do with the ../data/pretrained_vae_model.pt file?
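One quick way to rule out a path problem is to check that every file referenced in scoring_definition.csv actually exists. The sketch below uses only the Python standard library; the helper name missing_paths is made up, and it assumes that relative paths in the "file" and "model" columns are resolved against the directory the command is run from, which I have not verified against POLYGON itself.

```python
import csv
import os


def missing_paths(csv_path, base_dir=".", columns=("file", "model")):
    """Return (name, column, path) for every referenced path that does not exist.

    Assumption: relative paths are interpreted relative to `base_dir`
    (by default the current working directory).
    """
    missing = []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            for column in columns:
                value = (row.get(column) or "").strip()
                if not value:
                    continue  # this scoring row does not reference a file here
                if not os.path.exists(os.path.join(base_dir, value)):
                    missing.append((row.get("name", ""), column, value))
    return missing
```

Running missing_paths("scoring_definition.csv") from the same directory as the polygon command and getting an empty list back would suggest the paths themselves are fine.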
SMILES Parse Error: syntax error while parsing: O=C(c1cn(-c2ccccc2)nc1-c1cccc(F)c1)N1CCC(C2)c2cc(O)c(O)c(C