ComputArtCMCG / PLANET


the protein pdb file need to be fixed #5

Open wangjx22 opened 8 months ago

wangjx22 commented 8 months ago

Dear authors, I get an error when running the command "python3.6 ../PLANET_run.py -p adrb2.pdb -l adrb2_ligand.sdf -m mols.sdf". The error is "the protein pdb file need to be fixed", raised at line 21 of PLANET_run.py. Could you help me solve this problem?

yours, Jinxian

aquilazhang commented 8 months ago

Hello, Jinxian: This issue happens when the protein structure (.pdb) is incomplete (missing alpha-carbon atoms). You need to prepare the protein structure with other third-party software.
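A quick way to see whether a structure has residues without an alpha carbon is to scan the ATOM records yourself. The following is only a minimal sketch, not PLANET's own check (which is not shown in this thread): the function name `residues_missing_ca` is hypothetical, and it assumes standard fixed-column PDB formatting.

```python
from collections import defaultdict


def residues_missing_ca(pdb_lines):
    """Return (chain, res_seq, i_code, res_name) keys of ATOM residues with no CA atom."""
    atoms_by_residue = defaultdict(set)
    for raw in pdb_lines:
        if not raw.startswith("ATOM"):
            continue  # ignore HETATM, TER, REMARK and other records
        line = raw.rstrip("\n").ljust(27)  # pad short lines so slicing is safe
        atom_name = line[12:16].strip()    # PDB columns 13-16
        res_name = line[17:20].strip()     # PDB columns 18-20
        chain_id = line[21]                # PDB column 22
        res_seq = line[22:26].strip()      # PDB columns 23-26
        i_code = line[26].strip()          # PDB column 27 (insertion code)
        atoms_by_residue[(chain_id, res_seq, i_code, res_name)].add(atom_name)
    return [key for key, names in atoms_by_residue.items() if "CA" not in names]


# Tiny in-memory demo: residue 2 (GLY) has no CA atom.
demo = [
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C",
    "ATOM      3  N   GLY A   2      12.500   7.200  -4.900  1.00  0.00           N",
]
print(residues_missing_ca(demo))  # [('A', '2', '', 'GLY')]
```

For a real file you would pass the open file object (or its lines) instead of `demo`. An empty result only means every ATOM residue has a CA atom; it does not guarantee the structure is otherwise complete.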

wangjx22 commented 8 months ago

Thanks for your reply. Could you send me a complete example, such as a protein and a molecule, that runs the code successfully? My email is wangjx22@m.fudan.edu.cn.

yours, Jinxian

aquilazhang commented 8 months ago

The files in the 'demo' folder run successfully. Please provide the full traceback so I can see what happens.

wangjx22 commented 8 months ago

I ran the demo with the command "python3.6 ../PLANET_run.py -p adrb2.pdb -l adrb2_ligand.sdf -m mols.sdf", but got an error:

Traceback (most recent call last):
  File "/remote-home/jinxianwang/code/PLANET-main/PLANET_run.py", line 19, in set_pocket_from_ligand
    self.pocket = ProteinPocket(protein_pdb=protein_pdb,ligand_sdf=ligand_sdf)
  File "/remote-home/jinxianwang/code/PLANET-main/chemutils.py", line 95, in __init__
    ligand = Mol(Chem.SDMolSupplier(ligand_sdf,sanitize=False)[0])
OSError: File error: Bad input file adrb2_ligand.sdf

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/remote-home/jinxianwang/code/PLANET-main/PLANET_run.py", line 174, in <module>
    predicted_affinities,mol_names,smis = workflow(protein_pdb,mol_file,ligand_sdf,centeriod_x,centeriod_y,centeriod_z)
  File "/remote-home/jinxianwang/code/PLANET-main/PLANET_run.py", line 107, in workflow
    estimator.set_pocket_from_ligand(protein_pdb,ligand_sdf)
  File "/remote-home/jinxianwang/code/PLANET-main/PLANET_run.py", line 21, in set_pocket_from_ligand
    raise RuntimeError('the protein pdb file need to be fixed')
RuntimeError: the protein pdb file need to be fixed

aquilazhang commented 8 months ago

Did you enter the 'demo' directory before running the above command? Alternatively, you can run the script with absolute paths to the required files.
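The "Bad input file adrb2_ligand.sdf" error above is what a relative path looks like when it is resolved against the wrong working directory. As a small sketch (the helper name `resolve_inputs` is hypothetical and not part of PLANET), you can turn the inputs into absolute paths and fail early with a clearer message:

```python
from pathlib import Path


def resolve_inputs(*paths):
    """Return absolute path strings; fail early if any file is missing."""
    resolved = []
    for p in paths:
        path = Path(p).expanduser().resolve()
        if not path.is_file():
            raise FileNotFoundError(
                f"input file not found: {path} (current directory: {Path.cwd()})"
            )
        resolved.append(str(path))
    return resolved
```

Called from inside the 'demo' directory as `resolve_inputs("adrb2.pdb", "adrb2_ligand.sdf", "mols.sdf")`, it returns three absolute paths that can be passed to "-p", "-l" and "-m". Called from anywhere else, it reports which file could not be found and from which directory.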

wangjx22 commented 8 months ago

Thanks for your help. I can run the code now. However, I want to test new molecules. Which .py file can convert molecules into an SDF file?

Yours, Jinxian

aquilazhang commented 8 months ago

There is no script for converting molecules into an SDF file in this repository. The conversion is usually performed with OpenBabel or other molecular modeling software.
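For example, if Open Babel's `obabel` command-line tool is installed, a SMILES file can be converted from Python roughly as follows. This is a sketch under assumptions: the function names and the file names `mols.smi` / `mols.sdf` are hypothetical, and whether PLANET needs 3D coordinates in the SDF (the `--gen3d` option) is not stated in this thread.

```python
import shutil
import subprocess


def obabel_smiles_to_sdf_command(smi_path, sdf_path, gen3d=True):
    """Build the obabel argument list for a SMILES -> SDF conversion."""
    cmd = ["obabel", "-ismi", smi_path, "-osdf", "-O", sdf_path]
    if gen3d:
        cmd.append("--gen3d")  # ask Open Babel to generate 3D coordinates
    return cmd


def convert_smiles_to_sdf(smi_path, sdf_path, gen3d=True):
    """Run the conversion; requires the obabel executable on PATH."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel executable not found on PATH")
    subprocess.run(obabel_smiles_to_sdf_command(smi_path, sdf_path, gen3d), check=True)


print(" ".join(obabel_smiles_to_sdf_command("mols.smi", "mols.sdf")))
```

Printing the command first makes it easy to run it by hand in a shell and inspect any Open Babel warnings before feeding the result to "-m".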

wangjx22 commented 8 months ago

I noticed the class VS_SMI_Dataset in PLANET_run.py. So can we also input molecules as SMILES? Is it possible to provide the input file in SMILES format?

aquilazhang commented 8 months ago

Yes, that's right~ Molecules in SMILES format also work.

wangjx22 commented 8 months ago

I have tested my own protein, but got 2 errors. The first is:

Traceback (most recent call last):
  File "/remote-home/jinxianwang/code/PLANET-main/demo/../PLANET_run.py", line 19, in set_pocket_from_ligand
    self.pocket = ProteinPocket(protein_pdb=protein_pdb,ligand_sdf=ligand_sdf)
  File "/remote-home/jinxianwang/code/PLANET-main/chemutils.py", line 100, in __init__
    self.res_features = torch.from_numpy(np.concatenate([residue.get_feature() for residue in self.pocket_residues],axis=0))
  File "<__array_function__ internals>", line 5, in concatenate
ValueError: need at least one array to concatenate

The second is:

Traceback (most recent call last):
  File "/remote-home/jinxianwang/code/PLANET-main/demo/../PLANET_run.py", line 182, in <module>
    predicted_affinities,mol_names,smis = workflow(protein_pdb,mol_file,ligand_sdf,centeriod_x,centeriod_y,centeriod_z)
  File "/remote-home/jinxianwang/code/PLANET-main/demo/../PLANET_run.py", line 107, in workflow
    estimator.set_pocket_from_ligand(protein_pdb,ligand_sdf)
  File "/remote-home/jinxianwang/code/PLANET-main/demo/../PLANET_run.py", line 21, in set_pocket_from_ligand
    raise RuntimeError('the protein pdb file need to be fixed')
RuntimeError: the protein pdb file need to be fixed

I have attached the protein file 4kcd.pdb.gz. Could you help me check what is wrong with my file?

aquilazhang commented 8 months ago

I have prepared the protein structure; you can try it out (unzip it first): 4kcd_prepared.zip

wangjx22 commented 8 months ago

Thanks for your help. But I get the same errors with the prepared protein.

aquilazhang commented 8 months ago

Did you unzip the file? Also, since you may have had trouble converting a molecule file into SDF, how did you set the "-l" flag? Alternatively, you can use "-x" "-y" "-z" instead of "-l". But I am not sure whether you have defined the pocket center correctly.

wangjx22 commented 8 months ago

I am sure I unzipped the file. I am using the 4kcd_prepared.pdb you gave me and the mols.sdf from your demo as input to the model. I did not set the "-l" flag. I ran the command: python ../PLANET_run.py -p 4kcd_prepared.pdb -m mols.sdf

aquilazhang commented 8 months ago

PLANET requires the pocket center in order to extract pocket residues from the given PDB structure. The center is defined as the mass center of the molecule passed with the "-l" flag. Alternatively, specify the coordinates of the pocket center with "-x" "-y" "-z".
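To get candidate values for "-x" "-y" "-z" from a known ligand, the mass center can be computed by hand. The sketch below is an illustration only: `ligand_mass_center` and the small `ATOMIC_MASS` table are hypothetical helpers, it reads only the first V2000 record of an SDF string, and how PLANET itself computes the center (mass-weighted or plain geometric centroid) is not shown in this thread.

```python
# Approximate standard atomic masses for elements common in drug-like ligands.
ATOMIC_MASS = {
    "H": 1.008, "B": 10.81, "C": 12.011, "N": 14.007, "O": 15.999,
    "F": 18.998, "P": 30.974, "S": 32.06, "Cl": 35.45, "Br": 79.904,
    "I": 126.904,
}


def ligand_mass_center(sdf_text):
    """Mass-weighted center (x, y, z) of the first V2000 record in an SDF string."""
    lines = sdf_text.splitlines()
    n_atoms = int(lines[3][0:3])  # counts line: first 3 columns hold the atom count
    total = sx = sy = sz = 0.0
    for line in lines[4:4 + n_atoms]:
        # V2000 atom block: x, y, z in three 10-character columns, then the symbol.
        x, y, z = float(line[0:10]), float(line[10:20]), float(line[20:30])
        mass = ATOMIC_MASS[line[31:34].strip()]  # KeyError for unlisted elements
        total += mass
        sx += mass * x
        sy += mass * y
        sz += mass * z
    return sx / total, sy / total, sz / total
```

With a ligand file read via `open(...).read()`, the three returned numbers are what you would pass as "-x", "-y" and "-z" in place of "-l". The coordinates must be in the same frame as the protein PDB file, otherwise no pocket residues will be found near the center.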

wangjx22 commented 8 months ago

Is adrb2_ligand.sdf necessary? I found that adrb2_ligand.sdf is a crystal ligand file used to determine the binding pocket. If a protein has multiple binding pockets, should adrb2_ligand.sdf include multiple molecules?

aquilazhang commented 8 months ago

It is necessary. The SDF file that defines the pocket should contain only ONE molecule. For multiple pockets, you need to specify one pocket per PLANET run and repeat the run once for each pocket.
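If the crystal ligands for several pockets are stored in one SDF file, they can be split into single-molecule records before running PLANET once per record. A minimal sketch follows, assuming records are terminated by the usual "$$$$" line; the name `split_sdf` is hypothetical and not part of PLANET.

```python
def split_sdf(sdf_text):
    """Split SDF text into single-molecule records, each ending with '$$$$'."""
    records, current = [], []
    for line in sdf_text.splitlines():
        current.append(line)
        if line.strip() == "$$$$":  # record terminator
            records.append("\n".join(current) + "\n")
            current = []
    if any(l.strip() for l in current):  # trailing record without a terminator
        records.append("\n".join(current) + "\n$$$$\n")
    return records
```

Each returned string can be written to its own file and passed to "-l" in a separate run. `len(split_sdf(text)) == 1` is also a quick check that a pocket-defining file really contains only one molecule.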