zhaoyanswill / RAPSearch2

Reduced Alphabet based Protein similarity Search

Midline too short in some cases #30

Open bachev opened 7 years ago

bachev commented 7 years ago

In some alignments the midline has a different length from the Query and Sbjct lines, which leads to errors later on in any parser that reads the output. I have attached an example which demonstrates this.

When run, one will get an alignment like this:

gi|255767013:c2429585-2428389 vs RIBBA_PORGI ...
Query: 733 EEPVLVRVHSECLTGDVFGSHRCDCG...
........... EP+LVR+HS C TGD+FGS RCDCG...
Sbjct: 250 NEPILVRMHSSCATGDIFGSMRCDCG...

Notice how the midline has a blank as "first" alignment character. In XML output, the blank is even completely swallowed:

<Hsp_qseq>EEPVLVRVHSECLT...
<Hsp_hseq>NEPILVRMHSSCAT...
<Hsp_midline>EP+LVR+HS C...
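For anyone writing a parser against this output, a defensive check along the following lines may help until the bug is fixed. It is only a sketch and not part of RAPSearch2: the function names `midline_is_consistent` and `repair_midline` are hypothetical, and it assumes the only damage is blanks dropped from the start or end of the midline. It uses the visible prefix of the sequences quoted above (the full sequences are truncated in this report).

```python
def midline_is_consistent(qseq, hseq, midline):
    """Return True if the midline has the same length as both sequences
    and every letter in it equals the query and subject residue in that
    column. '+' and blank columns are not checked, since that would
    need the scoring matrix."""
    if not (len(qseq) == len(hseq) == len(midline)):
        return False
    for q, h, m in zip(qseq, hseq, midline):
        if m.isalpha() and not (m == q == h):
            return False
    return True


def repair_midline(qseq, hseq, midline):
    """Try to restore blanks dropped from the start or end of a short
    midline (an assumption about the failure mode). Returns the padded
    midline, or None if no padding, or more than one, is consistent."""
    missing = len(qseq) - len(midline)
    if len(qseq) != len(hseq) or missing < 0:
        return None
    candidates = []
    for left in range(missing + 1):
        padded = " " * left + midline + " " * (missing - left)
        if midline_is_consistent(qseq, hseq, padded):
            candidates.append(padded)
    return candidates[0] if len(candidates) == 1 else None


# Visible prefix of the HSP from the XML output quoted in this issue.
qseq = "EEPVLVRVHSECLTGDVFGSHRCDCG"
hseq = "NEPILVRMHSSCATGDIFGSMRCDCG"
midline = "EP+LVR+HS C TGD+FGS RCDCG"  # one character short

fixed = repair_midline(qseq, hseq, midline)
print(repr(fixed))
```

Returning None when the padding is ambiguous is deliberate: a parser can then reject the record instead of silently guessing a wrong column offset.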

I think the error is triggered by the fast mode.

Bastien

bachev commented 7 years ago

And of course I forgot to attach the demo files. Here they are. rs_midline_error.tar.gz

zhaoyanswill commented 7 years ago

Thank you for using RAPSearch2! I’ll look into it!

On Sep 27, 2016, at 11:51 AM, Bastien Chevreux notifications@github.com wrote:

And of course I forgot to attach the demo files. Here they are. rs_midline_error.tar.gz https://github.com/zhaoyanswill/RAPSearch2/files/496156/rs_midline_error.tar.gz