mengyao / Complete-Striped-Smith-Waterman-Library


Why is local alignment like Smith-Waterman not always symmetric? #85

Closed rocke2020 closed 1 year ago

rocke2020 commented 1 year ago

Dear Zhao mengyao,

This is not an issue but a question: is a local alignment such as Smith-Waterman typically (or always) considered symmetric? That is, is the score of alignment(s,t) the same as the score of alignment(t,s)? Thanks a lot in advance! In my tests, the scores are asymmetric for a long sequence pair but symmetric (the same score) for a short sequence pair.

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD
python pyssw.py -p protein.fasta protein.fasta > protein_output.txt

A protein.fasta that produces asymmetric scores:

0 IIGGEFTTIENQPWFAAIYRRHRGGSVTYVCGGSLISPCWVISATHCFIDYPKKEDYIVYLGRSRLNSNTQGEMKFEVENLILHKDYSADTLAYHNDIALLKIRSKEGRCAQPSRTIQTIALPSMYNDPQFGTSCEITGFGKEQSTDYLYPEQLKMTVVKLISHRECQQPHYYGSEVTTKMLCAADPQWKTDSCQGDSGGPLVCSLQGRMTLTGIVSWGRGCALKDKPGVYTRVSHFLPWIRSHTKE 86 LKWSKMNLTYRIVNYTPDMTHSEVEKAFKKAFKVWSDVTPLNFTRLHDGIADIMISFGIKEHGDFYPFDGPSGLLAHAFPPGPNYGGDAHFDDDETWTSSSKGYNLFLVAAHEFGHSLGLDHSKDPGALMFPIYTYTHFMLPDDDVQGIQSLYGPXXXXXX

ez2rok commented 1 year ago

The Smith-Waterman score should always be symmetric, i.e. we should have alignment(s,t) = alignment(t,s). This is because we are looking for the substring of s and the substring of t that align with the highest score, and that should not change if we switch s and t (assuming the substitution matrix is symmetric and gaps are penalized the same way in both sequences).
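To make the symmetry argument concrete, here is a minimal sketch of a textbook Smith-Waterman scorer. It is not the SSW library's implementation: the function name `sw_score`, the match/mismatch values, and the linear gap penalty are assumptions chosen for illustration (the library uses substitution matrices and affine gaps).

```python
def sw_score(s, t, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (textbook Smith-Waterman, linear gap penalty).

    All scoring parameters are illustrative assumptions, not SSW defaults.
    """
    prev = [0] * (len(t) + 1)  # DP row for the previous character of s
    best = 0
    for a in s:
        curr = [0]  # column 0: an empty prefix of t scores 0
        for j, b in enumerate(t, start=1):
            diag = prev[j - 1] + (match if a == b else mismatch)  # align a with b
            up = prev[j] + gap        # a aligned against a gap
            left = curr[j - 1] + gap  # b aligned against a gap
            cell = max(0, diag, up, left)  # 0 restarts the local alignment
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    s, t = "TGTTACGG", "GGTTGACTA"
    # With symmetric scoring, swapping the arguments should not change the score.
    print(sw_score(s, t), sw_score(t, s))
```

Swapping `s` and `t` only transposes the DP matrix here, so the maximum cell is unchanged. An optimized implementation may behave differently if it applies heuristics or treats the two sequences differently.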

However, in semi-global alignment this does not hold, i.e. sg_alignment(s, t) != sg_alignment(t, s). In semi-global alignment the first string is locally aligned and the second string is globally aligned. Thus switching the order of s and t makes a difference.
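The asymmetry of semi-global alignment can be sketched the same way. The function below is a hypothetical illustration (the name `sg_score` and the scoring values are assumptions): any prefix and suffix of the first string may be skipped for free, while every character of the second string must be aligned or paid for with a gap.

```python
def sg_score(s, t, match=2, mismatch=-1, gap=-2):
    """Semi-global score: s is aligned locally, t is aligned globally.

    Scoring parameters are illustrative assumptions.
    """
    n_t = len(t)
    # Row 0: no character of s used yet, so each prefix of t is paired with gaps.
    prev = [j * gap for j in range(n_t + 1)]
    best = prev[n_t]
    for a in s:
        curr = [0]  # skipping a prefix of s is free
        for j, b in enumerate(t, start=1):
            diag = prev[j - 1] + (match if a == b else mismatch)  # align a with b
            up = prev[j] + gap        # a aligned against a gap
            left = curr[j - 1] + gap  # b aligned against a gap
            curr.append(max(diag, up, left))  # no reset to 0: t is global
        prev = curr
        best = max(best, curr[n_t])  # skipping a suffix of s is free
    return best


if __name__ == "__main__":
    long_seq, short_seq = "AAACGTAAA", "CGT"
    print(sg_score(long_seq, short_seq))  # short_seq fits inside long_seq
    print(sg_score(short_seq, long_seq))  # long_seq must be consumed entirely
```

With these toy parameters the first call scores 6 (three matches), while the second scores -6, because six characters of the longer string have to be aligned against gaps.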

Hope this helps!

mengyao commented 1 year ago

I think local alignment can be asymmetric, though this happens infrequently, especially for DNA alignment.