forlilab / Meeko

Interfacing RDKit and AutoDock
GNU Lesser General Public License v2.1
192 stars 48 forks source link

Raise RuntimeError when MolFromPDBBlock failed to return a mol #178

Closed rwxayheee closed 1 week ago

rwxayheee commented 1 week ago

In mk_prepare_receptor, RDKit's Chem.MolFromPDBBlock is used to build input_mols per residue, when --pdb is used to parse receptor PDB file. Sometimes, MolFromPDBBlock failed to return a Mol if residue's structure is corrupt.

Currently, problems like:

[23:04:08] Explicit valence for atom # 1 O, 3, is greater than permitted

would not terminate the process. The incomplete input_mols are carried on for chorizo building.

The change raises RuntimeError when MolFromPDBBlock failed to return a mol, and provide hopefully helpful information to locate the residue that didn't go through MolFromPDBBlock.

Before the change, the error message is displayed:

Traceback (most recent call last):
  File "/Users/amyhe/micromamba/envs/mk_dev/bin/mk_prepare_receptor.py", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/Users/amyhe/Desktop/0_forks/Meeko/scripts/mk_prepare_receptor.py", line 437, in <module>
    chorizo = LinkedRDKitChorizo.from_pdb_string(
  File "/Users/amyhe/Desktop/0_forks/Meeko/meeko/linked_rdkit_chorizo.py", line 741, in from_pdb_string
    bonds = find_inter_mols_bonds(raw_input_mols)
  File "/Users/amyhe/Desktop/0_forks/Meeko/meeko/linked_rdkit_chorizo.py", line 149, in find_inter_mols_bonds
    positions = mol.GetConformer().GetPositions()
AttributeError: 'NoneType' object has no attribute 'GetConformer'

After, the process is terminated immediately when Chem.MolFromPDBBlock failed on a residue:

Traceback (most recent call last):
  File "/Users/amyhe/micromamba/envs/mk_dev/bin/mk_prepare_receptor.py", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/Users/amyhe/Desktop/0_forks/Meeko/scripts/mk_prepare_receptor.py", line 437, in <module>
    chorizo = LinkedRDKitChorizo.from_pdb_string(
  File "/Users/amyhe/Desktop/0_forks/Meeko/meeko/linked_rdkit_chorizo.py", line 740, in from_pdb_string
    raw_input_mols = cls._pdb_to_residue_mols(pdb_string)
  File "/Users/amyhe/Desktop/0_forks/Meeko/meeko/linked_rdkit_chorizo.py", line 1468, in _pdb_to_residue_mols
    raise RuntimeError(f"An error occurred while trying to build pdb mol for {reskey} {resname}")
RuntimeError: An error occurred while trying to build pdb mol for A:30 DT
diogomart commented 1 week ago

Awesome thanks!

Minor request: change if pdbmol: to if pdbmol is not None: to follow Google's style guide https://google.github.io/styleguide/pyguide.html#2144-decision

rwxayheee commented 1 week ago

Minor request: change if pdbmol: to if pdbmol is not None: to follow Google's style guide

Understood! use if pdbmol is None to check for failure in 4658a56

rwxayheee commented 1 week ago

Hi @diogomart Thanks again for pointing to the styleguide. Sorry I wasn't home so I didn't taike a very careful look.. I just realized that the original PR (before https://github.com/forlilab/Meeko/pull/178/commits/4658a56aec224d646f1b950206525e5396bb4208) uses implicit False, which is compliant with the styleguide:

2.14.4 Decision Use the “implicit” false if possible, e.g., if foo: rather than if foo != []:. There are a few caveats that you should keep in mind though: ... Yes: if not users: print('no users')

Therefore, I don't think https://github.com/forlilab/Meeko/pull/178/commits/4658a56aec224d646f1b950206525e5396bb4208 is compliant. But I also don't have a strong preference, so please don't feel pressured to revert. I did the original because lazy

rwxayheee commented 1 week ago

Ah nevermind I saw this

Always use if foo is None: (or is not None) to check for a None value. E.g., when testing whether a variable or argument that defaults to None was set to some other value. The other value might be a value that’s false in a boolean context!

Sorry for the confusion, I will merge https://github.com/forlilab/Meeko/commit/4658a56aec224d646f1b950206525e5396bb4208