haddocking / pdb-tools

A dependency-free cross-platform swiss army knife for PDB files.
https://haddocking.github.io/pdb-tools/
Apache License 2.0

[FEATURE] improved/corrected `pdb_selaltloc` tool #156

Open joaomcteixeira opened 1 year ago

joaomcteixeira commented 1 year ago

Improves pdb_selaltloc tool:

I think we should bump the minor version because of the importance of this update. Thanks to everyone who participated in the discussions :clap:

I cannot add you as a reviewer, @mgiulini; consider yourself a reviewer :wink:

joaomcteixeira commented 1 year ago

Actions will work after #157

mgiulini commented 1 year ago

Hey there, I don't know whether this PR is closed, but I found a bug in the new implementation. Take PDB file 2qxf: if you run the new `pdb_selaltloc`, you get duplicate rows for every residue that has three alternate locations, two of which share the same occupancy (probability weight). An example is MET A 235:

ATOM   1924  N  AMET A 235      30.258  16.409   7.504  0.35 13.81           N
ATOM   1925  N  BMET A 235      30.226  16.401   7.511  0.35 13.80           N
ATOM   1926  N  CMET A 235      30.284  16.403   7.495  0.30 13.83           N
ATOM   1927  CA AMET A 235      31.514  15.799   7.009  0.35 13.83           C
ATOM   1928  CA BMET A 235      31.457  15.715   7.065  0.35 13.83           C
ATOM   1929  CA CMET A 235      31.587  15.874   6.971  0.30 13.85           C

becomes

ATOM   1924  N   MET A 235      30.258  16.409   7.504  0.35 13.81           N
ATOM   1925  N   MET A 235      30.226  16.401   7.511  0.35 13.80           N
ATOM   1927  CA  MET A 235      31.514  15.799   7.009  0.35 13.83           C
ATOM   1928  CA  MET A 235      31.457  15.715   7.065  0.35 13.83           C

I think the problem lies here: https://github.com/joaomcteixeira/pdb-tools/blob/0981c5a174e79490068bec327dc337cc3cb492d6/pdbtools/pdb_selaltloc.py#L287-L292
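For illustration, here is a minimal sketch of the behaviour one would expect in this case. It is not the code from this PR: `select_highest_occupancy` is a hypothetical helper, and resolving ties in favour of the first altloc encountered is an assumption. It keeps a single record per atom, so MET A 235 above would come out as rows 1924 and 1927 only.

```python
def select_highest_occupancy(lines):
    """Keep one record per atom: highest occupancy wins, ties keep the first seen.

    Hypothetical helper for illustration only. Non-coordinate records are
    skipped to keep the sketch short.
    """
    best = {}  # atom key -> (occupancy, line); dicts preserve insertion order
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Fixed PDB columns: atom name 13-16, chain 22, resSeq + iCode 23-27.
        key = (line[21], line[22:27], line[12:16])
        occupancy = float(line[54:60])  # columns 55-60
        # Strict ">" so an equal occupancy never replaces (or duplicates)
        # the record that was seen first.
        if key not in best or occupancy > best[key][0]:
            best[key] = (occupancy, line)
    # Blank the altloc identifier (column 17) on the records that are kept.
    return [line[:16] + " " + line[17:] for _, line in best.values()]
```

With a strict comparison, the A and B conformers (both 0.35) can no longer both survive: B never beats A, and C (0.30) loses to either.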

Hope it helps!