Open FlorianBrnrd opened 5 years ago
The end positions of the alignment for both query and reference is readily available using the Result
properties end_query
and end_ref
because those are calculated during the alignment step. You would need to use the Cigar
object to get the beginning positions beg_query
and beg_ref
because they are calculated during the traceback.
Thank you for the very quick answer ! I managed to get what I wanted thanks to your help, however I seem to have a encountered a strange behavior while looking at the start positions of both the query and reference.
I used this bit of code:
aln = pair_align(SEQ, ref)
cigar = aln.cigar
print(aln.traceback.ref)
print(aln.traceback.comp)
print(aln.traceback.query)
print('\n')
print('query end = {}'.format(aln.end_query))
print('ref end = {}'.format(aln.end_ref))
print('query start = {}'.format(cigar.beg_query))
print('ref start = {}'.format(cigar.beg_ref))
And here's what it returns:
I wonder why it it returns 0 for the beginning of the reference alignment when I can clearly see the alignment starts around 10 ? Or did I misunderstood something ?
Thank you.
I have this same question — cigar.beg_query
and cigar.beg_ref
are 0.
Still when using cigar.beg_ref
is 0. Here is a partial solution:
import re
import parasail
# Perform the sequence Align
result = parasail.sw_trace_striped_64(read, ref, 3, 2, parasail.dnafull)
# Decode CIGAR using UTF-8
new_cigar = result.cigar.decode.decode('utf-8')
# 'D' since is a deletion when compared to reference
no_align_pattern = r'(\d+)D'
# Search pattern if exists
match = re.search(no_align_pattern, new_cigar)
# If a match is found, extract the number of 'deleted' positions and convert to integer
if match:
no_align_positions = match.group(1)
new_start = int(no_align_positions)
else:
new_start = 0
I would like to have some updates in that regard. Many thanks.
Hi, I'm performing semi-global alignments and I was wondering if there's any way to directly get the position of the beginning and end of the alignment on the reference ? Or do I have to use the cigar string to get the information ? Thanks a lot for your help. Florian.