prody / ProDy

A Python Package for Protein Dynamics Analysis
http://prody.csb.pitt.edu
Other
421 stars 154 forks source link

ExtendModel Nodes and Atoms Incompatible #1962

Open rks5854 opened 5 days ago

rks5854 commented 5 days ago

Hi,

Thank you for all the previous help.

I am having issues with the ExtendModel function. It seems like my selections for the backbone and calphas are correct, but I am continuously getting the errors below:

image

This protein has some Zn atoms and some non-standard residues (coded as CY1, they are cysteines which bind the Zn), but I have accounted for that and when looking at the atom counts, backbone = calpha*4 which is correct.

What issue with the model are these errors pointing to, and how can I work to fix it? I have already done some extensive troubleshooting with College of Science IT services and fixed the issue with the non-standard residues, but this is still occurring.

I also exported my backbone as a PDB, and it matched identically with my original PDB, excluding Zn and ligands, so there are no missing residues or other problems with the protein model I can find.

jamesmkrieger commented 5 days ago

It does seem like it should work:

In [13]: ag = parsePDB('prody/tests/datafiles/4akeA_alg_fixed.pdb')
@> 3341 atoms and 1 coordinate set(s) were parsed in 0.03s.

In [14]: anm, ca = calcANM(ag)
@> Hessian was built in 0.05s.
@> 20 modes were calculated in 0.02s.

In [15]: x = extendModel(anm, ca, ag.bb)

In [16]: x
Out[16]: 
(<NMA: Extended ANM 4akeA_alg_fixed (20 modes; 856 atoms)>,
 <AtomMap: 4akeA_alg_fixed from 4akeA_alg_fixed (856 atoms)>)

In [17]: x = extendModel(anm, ca, ag.select('backbone'))

In [18]: x
Out[18]: 
(<NMA: Extended ANM 4akeA_alg_fixed (20 modes; 856 atoms)>,
 <AtomMap: 4akeA_alg_fixed from 4akeA_alg_fixed (856 atoms)>)

Can you email me your pdb file to try at jamesmkrieger@gmail.com?

jamesmkrieger commented 5 days ago

maybe the CY1 is causing the problem

jamesmkrieger commented 5 days ago

I just tried with 1d5m in the pdb that has CY1 and it isn't recognising it in ca or backbone, so I don't think that's the problem.

I've also just tried your step of selecting protein and then selecting bb from the selection and that doesn't seem to be the problem either

Perhaps the problem is that you overrode fex with the selection, but I don't think that's it either. Still it's bad practice

jamesmkrieger commented 5 days ago

ok, replacing the variable isn't it either as this works fine:

In [1]: from prody import *

In [2]: fex = parsePDB('prody/tests/datafiles/4akeA_alg_fixed.pdb')
@> 3341 atoms and 1 coordinate set(s) were parsed in 0.03s.

In [3]: anm_fex, fex_ca = calcANM(fex)
@> Hessian was built in 0.05s.
@> 20 modes were calculated in 0.03s.

In [4]: fex = fex.select('protein')

In [5]: x = extendModel(anm_fex, fex_ca, fex.select('backbone'))

In [6]: x
Out[6]: 
(<NMA: Extended ANM 4akeA_alg_fixed (20 modes; 856 atoms)>,
 <AtomMap: 4akeA_alg_fixed from 4akeA_alg_fixed (856 atoms)>)
jamesmkrieger commented 5 days ago

Can you give me your previous commands of how you generated fex_ca and anm_fex in case there's something else strange there? I shouldn't think there would be though

rks5854 commented 5 days ago

In [4]: fex = parsePDB('comp_min.pdb') @> 136109 atoms and 1 coordinate set(s) were parsed in 1.49s.

In [5]: fex = fex.select('protein or resname CY1')

In [6]: fex_ca = fex.select('protein name CA or resname CY1 name CA')

In [7]: fex Out[7]: <Selection: 'protein or resname CY1' from comp_min (10945 atoms)>

In [8]: fex_ca Out[8]: <Selection: '(protein name C...or resname CY1)' from comp_min (681 atoms)>

In [11]: fex_bb = fex.select('protein name CA C N O or resname CY1 CA C N O')

In [12]: fex_bb Out[12]: <Selection: '(protein name C...or resname CY1)' from comp_min (2820 atoms)>

In [13]: fex_ca Out[13]: <Selection: '(protein name C...or resname CY1)' from comp_min (681 atoms)>

In [14]: fex_anm = ANM('fex_ca')

In [15]: fex_anm.buildHessian(fex_ca) @> Hessian was built in 0.15s.

In [16]: fex_anm.calcModes() @> 20 modes were calculated in 0.41s.

In [17]: bb_fex_anm, bb_fex_atoms = extendModel(fex_anm, fex_ca, fex_bb)

ValueError Traceback (most recent call last) Cell In[17], line 1 ----> 1 bb_fex_anm, bb_fex_atoms = extendModel(fex_anm, fex_ca, fex_bb)

File ~/miniconda3/envs/py39_prody/lib/python3.9/site-packages/prody/dynamics/editing.py:44, in extendModel(model, nodes, atoms, norm) 41 if model.numAtoms() != nodes.numAtoms(): 42 raise ValueError('atom numbers must be the same') ---> 44 indices, atommap = extendAtoms(nodes, atoms, model.is3d()) 46 evecs = evecs[indices, :] 47 if norm:

File ~/miniconda3/envs/py39_prody/lib/python3.9/site-packages/prody/atomic/functions.py:349, in extendAtoms(nodes, atoms, is3d) 347 raise ValueError('atoms must contain a residue for all atoms') 348 if isinstance(res, list): --> 349 raise ValueError('not enough data to get a single residue for all atoms') 351 res_atom_indices = res._getIndices() 352 if not fastin(res, residues):

ValueError: not enough data to get a single residue for all atoms

jamesmkrieger commented 5 days ago

Ok, the problem is with cases where it gets confused and maps other atoms as parts of the residue initially and then removes them. It creates a list of residues and so it thinks that it can't find enough nodes to give them data, but in this case it's a list of 1 residue so we just need to extract it.

The problem residue for some reason was GLN 357 but I've no idea why

jamesmkrieger commented 5 days ago

After installing the development version of prody from github, you can now checkout my branch expand_extract to try the fixed version

jamesmkrieger commented 18 hours ago

the pull request is now merged so you should be fine with the main prody on github

rks5854 commented 14 hours ago

thank you!! is there anything i need to install locally to get it working on my end?

jamesmkrieger commented 14 hours ago

what do you have installed currently?

rks5854 commented 14 hours ago

I genuinely have no idea how to find out, but I have not changed anything since using the command line install from the site

jamesmkrieger commented 12 hours ago

Ok, then yes, you need to install again for sure

what you should do is clone prody from GitHub and install it as an update/upgrade (with U) and in development mode (e for editable) and build the c extensions too:

git clone https://github.com/prody/ProDy.git pip install -Ue . python setup.py build_ext --inplace --force

This should install from the main branch by default