gcorso / DiffDock

Implementation of DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking
https://arxiv.org/abs/2210.01776
MIT License

PDBConstructionWarning and error code: 2 #195

Open maciejwisniewski-drugdiscovery opened 4 months ago

maciejwisniewski-drugdiscovery commented 4 months ago

Hello, as part of my project I want to redock several hundred complex structures, but I am running into errors. My command:

python -m inference --config default_inference_args.yaml --protein_path /mnt/evafs/groups/sfglab/mwisniewski/PhD/data/test/4kwg/4kwg_protein.pdb --ligand_description /mnt/evafs/groups/sfglab/mwisniewski/PhD/data/test/4kwg/4kwg_ligand.sdf --out_dir /mnt/evafs/groups/sfglab/mwisniewski/PhD/data/test/4kwg --complex_name diffdock

The following warning is then repeated several dozen times, once for each subsequent hydrogen atom:

warnings.warn(
/mnt/evafs/groups/sfglab/mwisniewski/anaconda3/envs/diffdock/lib/python3.9/site-packages/Bio/PDB/PDBParser.py:340: PDBConstructionWarning: PDBConstructionException: Atom H defined twice in residue <Residue SER het= resseq=500 icode= > at line 30410. Exception ignored.
Some atoms or residues may be missing in the data structure.

And after that:

warnings.warn(
Processing 1 of 1 batches (4 sequences)
HAPPENING | confidence model uses different type of graphs than the score model. Loading (or creating if not existing) the data for the confidence model now.
/mnt/evafs/groups/sfglab/mwisniewski/anaconda3/envs/diffdock/lib/python3.9/site-packages/torch/jit/_check.py:181: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in `__init__`. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in `torch.jit.Attribute`.
warnings.warn("The TorchScript type system doesn't support "
Size of test dataset: 1
0it [00:00, ?it/s]@> 30408 atoms and 1 coordinate set(s) were parsed in 0.23s.
/mnt/evafs/groups/sfglab/mwisniewski/software/docking_software/DiffDock-1.1/datasets/parse_chi.py:91: RuntimeWarning: invalid value encountered in cast
Y = indices.astype(int)
@> 30408 atoms and 1 coordinate set(s) were parsed in 0.30s.
Failed on ['diffdock'] linalg.svd: (Batch element 9): The algorithm failed to converge because the input matrix is ill-conditioned or has too many repeated singular values (error code: 2).
1it [08:00, 480.27s/it]
Failed for 1 complexes
Skipped 0 complexes

My protein PDB file was preprocessed with Open Babel (solvent removed, hydrogens added, etc.). The file is attached: 4kwg_protein.zip

Maciek
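For reference, the "Atom H defined twice" warning above can be reproduced outside DiffDock by scanning the PDB file for atom names that occur more than once within a single residue. The following is only a minimal sketch using the Python standard library. It assumes the file follows the fixed-column PDB layout, and the function name `find_duplicate_atoms` is made up for illustration; it is not part of DiffDock or Biopython.

```python
from collections import Counter


def find_duplicate_atoms(pdb_lines):
    """Return atoms that appear more than once within the same residue.

    Keys are (chain, resseq, icode, resname, atom_name, altloc) tuples,
    values are the number of occurrences (always > 1).
    Assumes fixed-column PDB records; only the first model is inspected.
    """
    counts = Counter()
    for raw in pdb_lines:
        if raw.startswith("ENDMDL"):
            break  # stop after the first model
        if not raw.startswith(("ATOM", "HETATM")):
            continue
        line = raw.rstrip("\n").ljust(80)  # pad short lines so slicing is safe
        key = (
            line[21].strip(),     # chain ID (column 22)
            line[22:26].strip(),  # residue sequence number (columns 23-26)
            line[26].strip(),     # insertion code (column 27)
            line[17:20].strip(),  # residue name (columns 18-20)
            line[12:16].strip(),  # atom name (columns 13-16)
            line[16].strip(),     # alternate location indicator (column 17)
        )
        counts[key] += 1
    return {key: n for key, n in counts.items() if n > 1}


# Example usage (the path is a placeholder):
# with open("4kwg_protein.pdb") as handle:
#     for key, n in find_duplicate_atoms(handle).items():
#         print(key, "appears", n, "times")
```

If this reports many hydrogens that are all named `H` within one residue, the hydrogen-adding step probably did not assign unique atom names. That would match the warning, although it does not by itself explain the later `linalg.svd` failure.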