chemosim-lab / ProLIF

Interaction Fingerprints for protein-ligand complexes and more
https://prolif.readthedocs.io
Apache License 2.0
337 stars 66 forks source link

Hydrogen atom recognition problem #203

Closed williambrown3 closed 1 month ago

williambrown3 commented 1 month ago

When attempting to extract VdWContacts between protein alpha carbons and nucleic acid phospates, the following error causes the calculation to fail.

multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/wbrown2/miniconda3/envs/prolif/lib/python3.12/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/home/wbrown2/miniconda3/envs/prolif/lib/python3.12/site-packages/multiprocess/pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "/home/wbrown2/miniconda3/envs/prolif/lib/python3.12/site-packages/prolif/parallel.py", line 142, in executor
    lig_mol = Molecule.from_mda(lig, **cls.converter_kwargs[0])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wbrown2/miniconda3/envs/prolif/lib/python3.12/site-packages/prolif/molecule.py", line 121, in from_mda
    mol = ag.convert_to.rdkit(**kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wbrown2/miniconda3/envs/prolif/lib/python3.12/site-packages/MDAnalysis/converters/RDKit.py", line 361, in convert
    mol = atomgroup_to_mol(ag, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wbrown2/miniconda3/envs/prolif/lib/python3.12/site-packages/MDAnalysis/converters/RDKit.py", line 420, in atomgroup_to_mol
    raise AttributeError(
AttributeError: No hydrogen atom could be found in the topology, but the converter requires all hydrogens to be explicit. You can use the parameter ``NoImplicit=False`` when using the converter to allow implicit hydrogens and disable inferring bond orders and charges. You can also use ``force=True`` to ignore this error.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/work/LAS/potoyan-lab/chris/Var-MS-a/new_plf_pn.py", line 43, in <module>
    fp.run(univ.trajectory[10000:10001:1], na_group, aa_group, n_jobs=36)
  File "/home/wbrown2/miniconda3/envs/prolif/lib/python3.12/site-packages/prolif/fingerprint.py", line 472, in run
    return self._run_parallel(
           ^^^^^^^^^^^^^^^^^^^
  File "/home/wbrown2/miniconda3/envs/prolif/lib/python3.12/site-packages/prolif/fingerprint.py", line 524, in _run_parallel
    for ifp_data_chunk in pool.process(args_iterable):
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wbrown2/miniconda3/envs/prolif/lib/python3.12/site-packages/prolif/parallel.py", line 167, in process
    return self.pool.map(self.executor, args_iterable, chunksize=1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wbrown2/miniconda3/envs/prolif/lib/python3.12/site-packages/multiprocess/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wbrown2/miniconda3/envs/prolif/lib/python3.12/site-packages/multiprocess/pool.py", line 774, in get
    raise self._value
AttributeError: No hydrogen atom could be found in the topology, but the converter requires all hydrogens to be explicit. You can use the parameter ``NoImplicit=False`` when using the converter to allow implicit hydrogens and disable inferring bond orders and charges. You can also use ``force=True`` to ignore this error.

The code that I am running is shown below.

top = load_file('topology.top', xyz='gromacs.gro')
traj = 'trajectory.dcd'

u = mda.Universe(top, traj)

tm3 = u.select_atoms('protein and name CA')
prot = u.select_atoms('nucleic and name P')

fp = plf.Fingerprint(['VdWContact'])

fp.run(u.trajectory[4000:4001:1], tm3, prot)

I followed the installation according to the Prolif documentation and I cannot figure out what is the cause of this error. I have also successfully run prolif contact calculations on this same data in the past with an almost identical script.

cbouy commented 1 month ago

Hi,

This should work:

# disable charge and bond order inferring (not required for VdWContact)
converter_params = {"NoImplicit": False}
fp.run(u.trajectory[4000:4001:1], tm3, prot, converter_kwargs=(converter_params, converter_params))

The error message above was due to your atom selections not including any hydrogens, which prevents the creation of a "complete" molecule (i.e. with bond orders and formal charges instead of just the topology) when converting the MDAnalysis selection to an RDKit molecule. But because you're only calculating the VdWContact interaction it is safe to skip this step by passing NoImplicit=False as stated in the error message. (and you have to pass it twice because once for each component in your complex).

Hope this helps!

Feel free to reopen the issue if you have other related questions

williambrown3 commented 1 month ago

Thanks so much for the quick help. That worked great.

Get Outlook for iOShttps://aka.ms/o0ukef


From: Cédric Bouysset @.> Sent: Thursday, May 9, 2024 4:01:49 PM To: chemosim-lab/ProLIF @.> Cc: Brown, William H [CHEM] @.>; Author @.> Subject: Re: [chemosim-lab/ProLIF] Hydrogen atom recognition problem (Issue #203)

Hi,

This should work:

disable charge and bond order inferring (not required for VdWContact)

converter_params = {"NoImplicit": False} fp.run(u.trajectory[4000:4001:1], tm3, prot, converter_kwargs=(converter_params, converter_params))

The error message above was due to your atom selections not including any hydrogens, which prevents the creation of a "complete" molecule (i.e. with bond orders and formal charges instead of just the topology) when converting the MDAnalysis selection to an RDKit molecule. But because you're only calculating the VdWContact interaction it is safe to skip this step by passing NoImplicit=False as stated in the error message. (and you have to pass it twice because once for each component in your complex).

Hope this helps!

Feel free to reopen the issue if you have other related questions

— Reply to this email directly, view it on GitHubhttps://github.com/chemosim-lab/ProLIF/issues/203#issuecomment-2103413966, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AR6HSJKJVWSGNAFL2BEL5JLZBPP33AVCNFSM6AAAAABHPJBJLOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMBTGQYTGOJWGY. You are receiving this because you authored the thread.Message ID: @.***>