markovmodel / PyEMMA

🚂 Python API for Emma's Markov Model Algorithms 🚂
http://pyemma.org
GNU Lesser General Public License v3.0

Bug in sidechain torsion naming in featurizer #1451

Closed mandar5335 closed 4 years ago

mandar5335 commented 4 years ago

Hi, I am using the chi1 and chi2 sidechain torsions of a single protein residue as features, added with the add_sidechain_torsions method. There seems to be a bug in the featurizer: describe() labels both the chi1 and the chi2 torsions as CHI1. The problem is only in the naming; the feature values themselves correspond to the correct torsions.

I have attached the corresponding PDB file. The sample code is described below:

import pyemma

featurizer = pyemma.coordinates.featurizer('prot_maeconv.pdb')
sidechain = featurizer.add_sidechain_torsions(selstr='resid 34', cossin=True, which=['chi1', 'chi2'], periodic=True)
backbone = featurizer.add_backbone_torsions(selstr='resid 34', cossin=True, periodic=True)
# List the label of every feature added above
featurizer.describe()

output:

['COS(CHI1 0 TYR 35)',
 'SIN(CHI1 0 TYR 35)',
 'COS(CHI1 0 TYR 35)',
 'SIN(CHI1 0 TYR 35)',
 'COS(PHI 0 TYR 35)',
 'SIN(PHI 0 TYR 35)',
 'COS(PSI 0 TYR 35)',
 'SIN(PSI 0 TYR 35)']

I am using PyEMMA 2.5.7 on Ubuntu 18.04.

prot_maeconv.pdb.txt

Thanks, Mandar Kulkarni

thempel commented 4 years ago

Hi Mandar, thanks for reporting this bug and for providing a working minimal example code. I've repurposed it into a unit test with our internal test system. The bug was basically an indexing problem in the description function. Once the PR is merged, I'd be happy if you could test it.
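
For anyone who wants to check their own installation in the meantime, here is a minimal sketch of the kind of check such a test could make. It is not the actual test from the PR, which is not shown in this thread. The helper name duplicate_labels is hypothetical, and the "fixed" label list is only an assumption about what the corrected output might look like (CHI2 in place of the repeated CHI1). The sketch works on plain lists of strings, so it runs without PyEMMA or the PDB file.

```python
from collections import Counter


def duplicate_labels(labels):
    """Return every label that occurs more than once, in first-seen order.

    Hypothetical helper for illustration; not part of PyEMMA.
    """
    counts = Counter(labels)
    return [label for label in counts if counts[label] > 1]


# Labels exactly as reported in this issue (PyEMMA 2.5.7).
reported = [
    'COS(CHI1 0 TYR 35)',
    'SIN(CHI1 0 TYR 35)',
    'COS(CHI1 0 TYR 35)',
    'SIN(CHI1 0 TYR 35)',
    'COS(PHI 0 TYR 35)',
    'SIN(PHI 0 TYR 35)',
    'COS(PSI 0 TYR 35)',
    'SIN(PSI 0 TYR 35)',
]

# ASSUMPTION: what the labels might look like once the naming is fixed.
# The exact format after the fix is not confirmed anywhere in this thread.
assumed_fixed = [
    'COS(CHI1 0 TYR 35)',
    'SIN(CHI1 0 TYR 35)',
    'COS(CHI2 0 TYR 35)',
    'SIN(CHI2 0 TYR 35)',
    'COS(PHI 0 TYR 35)',
    'SIN(PHI 0 TYR 35)',
    'COS(PSI 0 TYR 35)',
    'SIN(PSI 0 TYR 35)',
]

print(duplicate_labels(reported))
print(duplicate_labels(assumed_fixed))
```

With a real featurizer, the same helper can be called on the list returned by featurizer.describe(); an empty result means every feature has a distinct label.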