mbreese / swalign

Smith-Waterman local aligner

q_pos and r_pos indexing #11

Open paulaberry opened 3 years ago

paulaberry commented 3 years ago

The output of alignment.dump() reports 1-indexed start and end positions for the matching region of the query and reference sequences. However, alignment.r_pos and alignment.q_pos are 0-indexed. The values of alignment.r_end and alignment.q_end match the output of alignment.dump() and so appear to be 1-indexed. Is this the expected behavior?

vpradeep07 commented 8 months ago

Yes, I noticed this as well, and I agree that the 1-indexed printout of the query position is confusing, especially since the internal state variable q_pos is 0-indexed. While I'm not the author (I'm a new user of swalign), I believe this 1-indexing is expected behavior. See code snippet below:

https://github.com/mbreese/swalign/blob/df1f7f4c7114fc3001c2791984ff8fc5ea3e8830/swalign/__init__.py#L446-L458

As shown, the author explicitly adds 1 to the query position so that it prints as 1-indexed.

mbreese commented 8 months ago

Yes, you're correct. This is the expected behavior. Human-readable coordinates are typically reported as 1-based values.
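
For readers landing on this thread, here is a minimal sketch of how the two conventions relate. It does not call swalign; the `r_pos` and `r_end` values are made up, and `to_display_coords` is a hypothetical helper. It assumes that the end attribute is a 0-based exclusive end, which is consistent with the observations above but is not stated by the author. Under that assumption, the start shifts by one for display while the end number stays the same, which would explain why `r_end` and `q_end` look 1-indexed.

```python
def to_display_coords(start0, end0):
    """Convert a 0-based, end-exclusive span to a 1-based, end-inclusive span.

    Hypothetical helper for illustration; not part of swalign.
    """
    # Only the start changes: a 0-based exclusive end and a
    # 1-based inclusive end are the same number.
    return start0 + 1, end0


ref = "ACGTACGTTT"

# Made-up values standing in for alignment.r_pos and alignment.r_end.
r_pos, r_end = 2, 6

# The 0-based values can be used directly as a Python slice.
matched = ref[r_pos:r_end]

# The human-readable form, in the style of the dump() output.
display_start, display_end = to_display_coords(r_pos, r_end)
print(f"Ref: {display_start} {matched} {display_end}")
```

To get back from displayed coordinates to a slice, subtract 1 from the start and leave the end unchanged: `ref[display_start - 1:display_end]`.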