UCLOrengoGroup / cath-tools

Protein structure comparison tools such as SSAP and SNAP
http://cath-tools.readthedocs.io
GNU General Public License v3.0

cath-refine-align doesn't respect --align-regions when writing alignments #47

Closed by tonyelewis 6 years ago

tonyelewis commented 6 years ago

Performing the following:

cath-ssap --pdb-path $PDBDIR 1cuk 1bvs --align-regions 'D[1cukA03]156-203:A' --align-regions 'D[1bvsA03]149-199:A'
cath-refine-align --ssap-aln-infile 1cukA031bvsA03.list --pdb-infile $PDBDIR/1cuk --pdb-infile $PDBDIR/1bvs --align-regions 'D[1cukA03]156-203:A' --align-regions 'D[1bvsA03]149-199:A' --aln-to-ssap-file 1cukA031bvsA03.cath-refine-align.list

...generates a second alignment file in which the numbering starts at 1, not at the start of the specified domain.

tonyelewis commented 6 years ago

I've investigated why this is a problem for cath-refine-align but not for cath-superpose. The alignment_context that cath_superposer::superpose() passes to use_all_alignment_outputters() is restricted to the relevant regions, whereas the one that cath_align_refiner::refine() passes to the same function isn't.

Recent commits (e.g. 8078efce5f8a29af24c423c15d7b5a68f2658f67) make this clearer and make it easy to switch cath-refine-align over to passing a restricted alignment_context.

But this raises the issue of why the relevant alignment outputter isn't just getting this right using the alignment_context it gets given...
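
To illustrate the kind of mismatch described above, here is a toy Python sketch. It is not cath-tools code. The functions `restrict()` and `write_alignment()`, the residue numbering (1..203 and 1..199) and the three-position alignment are all hypothetical, chosen only to show one plausible way an unrestricted context could produce numbering that starts at 1 when the alignment indices are relative to the restricted regions.

```python
from typing import List, Optional, Tuple

# Hypothetical toy model: a "structure" is just a list of residue numbers.
def restrict(residues: List[int], region: Optional[Tuple[int, int]]) -> List[int]:
    """Return the residues inside the inclusive region, or all of them if region is None."""
    if region is None:
        return list(residues)
    start, stop = region
    return [r for r in residues if start <= r <= stop]

def write_alignment(aln: List[Tuple[int, int]],
                    residues_a: List[int],
                    residues_b: List[int]) -> List[Tuple[int, int]]:
    """Map alignment positions (indices into each residue list) to residue numbers."""
    return [(residues_a[i], residues_b[j]) for i, j in aln]

# Assumed numbering for illustration only (not read from the real PDB files).
full_1cuk = list(range(1, 204))
full_1bvs = list(range(1, 200))
region_a = (156, 203)  # from --align-regions 'D[1cukA03]156-203:A'
region_b = (149, 199)  # from --align-regions 'D[1bvsA03]149-199:A'

# A made-up alignment whose indices are relative to the restricted regions.
aln = [(0, 0), (1, 1), (2, 2)]

# Context restricted to the regions: numbering starts at the domain start.
restricted = write_alignment(aln, restrict(full_1cuk, region_a), restrict(full_1bvs, region_b))

# Context left unrestricted: the same indices now land on the first residues.
unrestricted = write_alignment(aln, full_1cuk, full_1bvs)

print(restricted)    # [(156, 149), (157, 150), (158, 151)]
print(unrestricted)  # [(1, 1), (2, 2), (3, 3)]
```

In this toy model the outputter itself is identical in both cases; only the residue lists it is handed differ, which is the situation the question above is asking about.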

tonyelewis commented 6 years ago

Resolved by 3c2c54954f0ee99313fab00dcd42735e3ec06081 and 363bdb0b00ee6e46548fcae09bf23070f6d96382.

But see #48 for other issues arising from this.