Layne-Huang / PMDM


Bugs in sample_linker.py #11

Closed FeilongWuHaa closed 5 months ago

FeilongWuHaa commented 5 months ago

Hi author,

1. Wrong checkpoint name. I tried to generate a linker for a custom pocket and molecule with this command:

    python -u sample_linker.py \
        --ckpt 500.pt \
        --pdb_path 3WZE/pocket.pdb \
        --mol_file 3WZE/3wze-mol.sdf \
        --mask 0 1 2 3 4 5 6 7 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 \
        --num_atom 4 \
        --num_samples 100 \
        --sampling_type generalized \
        --batch_size 32

but I got an error:

    Traceback (most recent call last):
      File "/home/PMDM/sample_linker.py", line 293, in <module>
        datalist, = construct_dataset_pocket(num_samples*1,batch_size,dataset_info,num_points,None,start_linker,None,
      File "/home/PMDM/utils/sample.py", line 154, in construct_dataset_pocket
        atom_type_linker = torch.cat([start_linker['linker_atom_type'], atomtype])
    RuntimeError: Sizes of tensors must match except in dimension 0. Expected size 10 but got size 8 for tensor number 1 in the list.

If I rename the checkpoint to "500_pocket.pt", the error goes away.

2. QVinaDockingTask is missing in sample_linker.py. It is called at line 391 as QVinaDockingTask.from_generated_data(protein_filename,gmol,protein_root).

Layne-Huang commented 5 months ago

Thanks for your feedback.

  1. I have fixed the bug in sample_linker.py.
  2. QVinaDockingTask exists in evaluation/docking.py. I suggest commenting out this code, since calculating the Vina score is time-consuming. Instead, you could evaluate the generated molecules separately.
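
As an alternative to commenting the docking call out by hand, the step could be made optional. The following is only a minimal sketch: `score_with_optional_docking` and its `dock_fn` parameter are hypothetical names that do not appear in the PMDM code, and `dock_fn` stands in for whatever callable wraps the Vina run.

```python
def score_with_optional_docking(mol, dock_fn=None):
    """Return a docking score, or None when docking is disabled or fails.

    `dock_fn` is a hypothetical callable (mol -> score); pass None to skip
    the time-consuming docking step entirely.
    """
    if dock_fn is None:
        return None  # docking disabled: nothing to compute
    try:
        return dock_fn(mol)
    except Exception as exc:
        # A failed docking run should not abort the whole sampling loop.
        print(f"[Warning] docking skipped: {exc}")
        return None
```

With this shape, sampling can run first with `dock_fn=None`, and the saved molecules can be docked later in a separate pass.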
FeilongWuHaa commented 5 months ago

Thank you @Layne-Huang

FeilongWuHaa commented 5 months ago

Hi,

With your new version of sample_linker.py, I tried to generate a linker for a custom pocket and molecule with this command:

    python -u sample_linker.py --ckpt 500.pt --pdb_path 3WZE/pocket.pdb --mol_file 3WZE/3wze-mol.sdf --mask 0 1 2 3 4 5 6 7 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 --num_atom 4 --num_samples 100 --sampling_type generalized --batch_size 32

No molecules are generated, and I get an error from Open Babel and Vina:

    generated smile: [C][C][C][C][N]C([N])=O
    largest generated smile part: [C][C][C][C][N]C([N])=O
    Generate SA score: 0.39
    Generate QED score: 0.44795408144569576
    Generate logP: -0.4310400000000001
    Generate Lipinski: 5
    [b'1 molecule converted\n', b'/bin/bash: line 10: ./qvina2.1: No such file or directory\n', b'==============================\n', b'*** Open Babel Error in OpenAndSetFormat\n', b' Cannot open bzkdjhxnzncstkznbnpmukvmuncykr_ligand_out.pdbqt\n', b'0 molecules converted\n']
    [Error] Vina output error: /home/PMDM/tmp/bzkdjhxnzncstkznbnpmukvmuncykr_ligandout.sdf
    'NoneType' object is not subscriptable

For the Vina error “./qvina2.1: No such file or directory”, the qvina binary needs to be placed in the ./tmp folder. After that, all the code runs fine.
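
A quick pre-flight check can catch this before a long sampling run. The sketch below assumes, based only on the error message above, that the binary is looked up as `qvina2.1` inside the working `tmp` directory. The helper name `find_missing_executable` is hypothetical and is not part of PMDM.

```python
import os


def find_missing_executable(workdir, name="qvina2.1"):
    """Return None if workdir/name is an executable file, else a message.

    The default name and the idea of checking `workdir` are assumptions
    taken from the './qvina2.1: No such file or directory' error.
    """
    path = os.path.join(workdir, name)
    if not os.path.isfile(path):
        return f"{path} not found; copy the binary there first"
    if not os.access(path, os.X_OK):
        return f"{path} exists but is not executable; try chmod +x"
    return None


if __name__ == "__main__":
    problem = find_missing_executable("./tmp")
    print(problem or "qvina binary looks usable")
```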

However, the generated molecule is quite small, and it is only the linker part. How can I get the whole generated molecule, including the fragments?

The --mask argument should list the atoms that you want to mask. This is an example that I used, in which I masked the following atoms: [image]. The command is:

    python -u sample_linker.py --ckpt 500.pt --pdb_path 3wzecut20/3wzecut20_pocket.pdb --mol_file 3wzecut20/3wzecut20_ligand.sdf --mask 6 7 8 9 10 11 --num_atom 4 --num_samples 1 --sampling_type generalized --batch_size 1 -build_method reconstruct

The results are

    generated smile: CNC(=O)c1cc(CC#CN(C)C(=O)Nc2ccc(Cl)c(C(F)(F)F)c2)c(O)cn1
    largest generated smile part: CNC(=O)c1cc(CC#CN(C)C(=O)Nc2ccc(Cl)c(C(F)(F)F)c2)c(O)cn1
    Generate SA score: 0.76
    Generate QED score: 0.5027894126821064
    Generate logP: 3.4863000000000013
    Generate Lipinski: 5
    [b'1 molecule converted\n', b'9 molecules converted\n']
    Best affinity: -11.6
    Generate vina score: -11.6

image
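
To see which atoms a given --mask leaves in place, the indices can be split into a kept set and a masked set. This is only an illustrative sketch: `split_by_mask` is a hypothetical helper, not a PMDM function, and the atom count of 33 used below is an assumption chosen so that the largest index in the first command (32) is in range.

```python
def split_by_mask(num_atoms, mask):
    """Split atom indices 0..num_atoms-1 into (kept, masked) lists.

    Masked atoms are the ones to be replaced by the generated linker;
    kept atoms are the fragments that stay fixed.
    """
    masked = sorted(set(mask))
    out_of_range = [i for i in masked if not 0 <= i < num_atoms]
    if out_of_range:
        raise ValueError(f"mask indices out of range: {out_of_range}")
    masked_set = set(masked)
    kept = [i for i in range(num_atoms) if i not in masked_set]
    return kept, masked


# Mask from the first command in this thread (atom count 33 is assumed).
first_mask = list(range(0, 8)) + list(range(12, 33))
kept, masked = split_by_mask(33, first_mask)
print(kept)  # [8, 9, 10, 11]
```

Under that assumption, the first command keeps only four atoms and masks everything else, which may be why the output contained little more than the linker. Masking only the linker atoms, as in `--mask 6 7 8 9 10 11`, keeps the rest of the molecule as fragments.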

If you want to use Vina, please follow our tutorial to install the environment (please use the newest /evaluation/docking.py). Otherwise, you can comment out that code.

FeilongWuHaa commented 5 months ago

It works now, thanks.