BioJulia / BioAlignments.jl

Sequence alignment tools
MIT License

[Feature]: Add method to remove sequence match info from cigar #90

Open MillironX opened 1 year ago

MillironX commented 1 year ago

Expected Behavior

I would like a way to tell whether two alignments are the same regardless of sequence match information, so that a comparison such as the following could be treated as equal:

Alignment("1=1X") == Alignment("2M")

Current Behavior

Alignment("1=1X") == Alignment("2M")

# returns false

:point_up: This behavior is correct!

Possible Solution / Implementation

The current behavior is entirely correct, but it doesn't let me compare alignments that record sequence match information (= and X) against those that don't (M only). I propose a new function

remove_match_ops(::T) where {T<:Union{String,Alignment,AlignedSequence,PairwiseAlignment,PairwiseAlignmentResult}}

that would replace the = (sequence match) and X (sequence mismatch) operations in the alignment's CIGAR with M (alignment match) operations, merge adjacent M operations, and return the result as a T.
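To make the proposal concrete, here is a minimal sketch of what the String method might look like. It works on the CIGAR text alone using Base Julia, so it is not part of BioAlignments.jl and does not touch the package's internal types. The function body is an assumption for illustration, and the sketch does no validation of malformed CIGAR strings.

```julia
# Hypothetical sketch of the proposed String method only.
# Uses Base Julia; nothing here is an existing BioAlignments.jl API.
function remove_match_ops(cigar::AbstractString)
    ops = Tuple{Int,Char}[]  # run-length encoded (length, operation) pairs
    for m in eachmatch(r"(\d+)([MIDNSHP=X])", cigar)
        len = parse(Int, m.captures[1])
        op = m.captures[2][1]
        # Collapse sequence match (=) and mismatch (X) into alignment match (M).
        if op == '=' || op == 'X'
            op = 'M'
        end
        # Merge adjacent M runs; other operations are left exactly as written.
        if op == 'M' && !isempty(ops) && ops[end][2] == 'M'
            ops[end] = (ops[end][1] + len, 'M')
        else
            push!(ops, (len, op))
        end
    end
    return join(string(len, op) for (len, op) in ops)
end
```

With a method like this, remove_match_ops("1=1X") and remove_match_ops("2M") would both yield "2M", so the two CIGARs compare equal once normalized. The methods for Alignment, AlignedSequence, and the pairwise types would need to apply the same rewrite to their stored operations.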

kescobo commented 1 year ago

I don't work much with CIGAR, but I can't think of an objection