I have some reads with UMIs (Unique Molecular Identifiers). Reads that contain the same UMI come from the same template molecule. They are merged into a consensus read, in which some bases are ambiguous. I mapped the consensus reads to the reference genome, and here are the mapping results:
In the sequence I highlighted with the red rectangle, there is an ambiguous base 'N' in the read. I want to know why the 'MD' tag (0T85A5) shows 'A' (the base from the reference genome) rather than 'N' (the base from the read).
Is there any official documentation on how the aligner handles ambiguous bases?
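For context, here is a minimal sketch of how I am reading the MD string. It assumes the convention in the SAM optional-fields specification: numbers count matching positions, a bare letter is the reference base at a mismatch, and `^` followed by letters is a stretch of reference bases deleted from the read. The helper name `md_mismatches` is my own and not part of any library.

```python
import re

def md_mismatches(md):
    """Return (0-based offset along the aligned reference, reference base)
    for every mismatch encoded in an MD string."""
    mismatches = []
    pos = 0
    # Tokens: a run of digits, a deletion (^ plus bases), or a single mismatch base.
    for num, deletion, base in re.findall(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])", md):
        if num:
            pos += int(num)       # that many matching positions
        elif deletion:
            pos += len(deletion)  # reference bases absent from the read
        else:
            mismatches.append((pos, base))  # letter is the reference base here
            pos += 1
    return mismatches

print(md_mismatches("0T85A5"))  # [(0, 'T'), (86, 'A')]
```

Read this way, the 'A' in 0T85A5 is whatever the reference has at offset 86 of the alignment, so the 'N' from the read would never appear in the tag itself. I would still like confirmation that this is the intended behaviour.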
Thanks!