OpenBioSim / sire

Sire Molecular Simulations Framework
https://sire.openbiosim.org
GNU General Public License v3.0

[BUG] mol.move().align() fails to align two proteins #192

Closed akalpokas closed 4 months ago

akalpokas commented 6 months ago

Describe the bug This might be more of a feature request than a bug report per se, since I don't think mol.move().align() was originally designed to work with proteins.

I am trying to align two proteins that have distinct conformations. Sequence-wise, the two proteins are nearly identical, differing only at residue ID 9. I have computed the mapping between them (mutant to wild-type) and want to align the wild-type protein to the mutant one (using the inverse mapping), so that I can make a perturbable protein with the mutant conformation. However, when I use the BioSimSpace.Align.rmsdAlign() function (which wraps sire.mol.move().align()), or sire.mol.move().align() directly, the structure I get back is just the wild-type protein, with no alignment performed. The function also executes without any errors.

If I use the BioSimSpace.Align.flexAlign() function (which uses fkcombu) with the same mapping, the two proteins are aligned properly, but this takes a really long time to compute.

I suspect that even when the alignment is between two very similar conformations, the residues of the two proteins will not be aligned properly. I believe this could be fixed by looping over each residue in the target protein and aligning it with the reference structure. I could also work around the issue by extracting the residues of interest, aligning them individually, and updating the coordinates of the target (wild-type) residue, so that the coordinates of the hybrid residues won't be an issue during the merge.

To Reproduce Steps to reproduce the behavior:

  1. Extract the provided inputs.tar.gz file
  2. Run the script align.py via python

Expected behavior Alignment between two proteins in such a way that the saved output files (aligned_wt_rmsd_align.pdb/mol1_aligned.pdb) have the conformation of the mutant protein (frame_0.gro).

Input files inputs.tar.gz

Environment information

lohedges commented 6 months ago

The sire RMSD alignment only performs rigid-body translations and rotations, so I assume that this is failing when the mapping is too large. It seems to work for me if I align based on just the sub-mapping for the region of interest, not the full one. (This is similar to what you suggest, i.e. just aligning the two residues and then shifting everything else based on the translation and rotation vectors, but is probably easier.)
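To illustrate the sub-mapping idea outside of sire, here is a minimal NumPy sketch of a rigid-body (Kabsch) fit. It is not sire's implementation and does not use the sire or BioSimSpace API. The function names `kabsch_fit` and `align_on_mapping` are hypothetical, coordinates are assumed to be plain (N, 3) arrays, and the mapping is assumed to be a dict of mobile atom index to reference atom index. The rotation and translation are fitted only on the mapped atoms and then applied to every atom of the mobile molecule.

```python
import numpy as np


def kabsch_fit(mobile, reference):
    """Return (R, t) so that mobile @ R.T + t best overlays reference (least-squares)."""
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mob_centroid = mobile.mean(axis=0)
    ref_centroid = reference.mean(axis=0)
    # Covariance matrix of the centred point sets.
    H = (mobile - mob_centroid).T @ (reference - ref_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0.0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = ref_centroid - R @ mob_centroid
    return R, t


def align_on_mapping(mobile, reference, mapping):
    """Fit on the mapped atoms only, then move the whole mobile molecule rigidly.

    mapping: dict {mobile_atom_index: reference_atom_index} (assumed format).
    """
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mob_idx = list(mapping.keys())
    ref_idx = [mapping[i] for i in mob_idx]
    R, t = kabsch_fit(mobile[mob_idx], reference[ref_idx])
    return mobile @ R.T + t
```

Passing only the atoms around the region of interest as `mapping` makes that region overlap as closely as a rigid move allows, at the cost of a poorer fit elsewhere in the protein. At least three non-collinear mapped atoms are needed for the rotation to be well defined.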

akalpokas commented 6 months ago

I have been able to work around the issue temporarily by extracting all of the residues from both proteins, aligning them individually, and then using the aligned coordinates to update the coordinates of the input protein. The alignment isn't ideal, but it does the trick for now.
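The per-residue workaround described above can be sketched roughly as follows. This is an illustration in plain NumPy under stated assumptions, not the script used here and not the code later added to BioSimSpace. The names `rigid_fit` and `align_per_residue` are hypothetical, and each residue is assumed to be given as a pair of index lists (mobile atoms, reference atoms) matched atom by atom.

```python
import numpy as np


def rigid_fit(mobile, reference):
    """Kabsch fit: return (R, t) so that mobile @ R.T + t best overlays reference."""
    mob_centroid = mobile.mean(axis=0)
    ref_centroid = reference.mean(axis=0)
    H = (mobile - mob_centroid).T @ (reference - ref_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Keep R a proper rotation (determinant +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0.0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, ref_centroid - R @ mob_centroid


def align_per_residue(mobile, reference, residues):
    """Align each residue independently and return updated mobile coordinates.

    residues: iterable of (mobile_indices, reference_indices) pairs (assumed format).
    """
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)
    updated = mobile.copy()
    for mob_idx, ref_idx in residues:
        mob_idx, ref_idx = list(mob_idx), list(ref_idx)
        if len(mob_idx) < 3:
            # Too few atoms to define a rotation: match the centroids only.
            shift = reference[ref_idx].mean(axis=0) - mobile[mob_idx].mean(axis=0)
            updated[mob_idx] = mobile[mob_idx] + shift
            continue
        R, t = rigid_fit(mobile[mob_idx], reference[ref_idx])
        updated[mob_idx] = mobile[mob_idx] @ R.T + t
    return updated
```

Because every residue is moved on its own, the bonds linking neighbouring residues are not preserved by construction, so the resulting structure may need checking or minimisation before use.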

akalpokas commented 4 months ago

Closing, as per-residue alignment code has been added to BioSimSpace.