Adds a function that removes operons that are roughly identical within the bounds of their identified features. This can produce false positives, that is, operons that are actually unique but get flagged as redundant. However, the method is quite fast, and such false positives probably differ by only a few nucleotides.
The integration test CSV contains three nucleotide sequences: the unaltered sequence, its reverse complement, and the forward sequence with a single nucleotide deleted.
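For reference, the three test sequences relate to each other roughly as in the sketch below. The helper names `reverse_complement` and `delete_nucleotide` are hypothetical and only illustrate how such fixtures could be derived; they are not the code used to build the CSV.

```python
# Hypothetical helpers showing how the three fixture sequences relate.
# Assumes plain A/C/G/T sequences (no ambiguity codes).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def delete_nucleotide(seq: str, index: int) -> str:
    """Return seq with the single base at `index` removed."""
    return seq[:index] + seq[index + 1:]


forward = "ATGCCGTA"                      # stand-in for the unaltered sequence
revcomp = reverse_complement(forward)     # second fixture
deleted = delete_nucleotide(forward, 3)   # third fixture, one base shorter
```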
Note that this mostly ignores CRISPR arrays. It checks that they appear in the same order in the overall motif, but doesn't verify their exact positions or sequences. This is because pilercr is so context-sensitive that it gives different results even on an exact reverse complement.
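A minimal sketch of the idea, not the actual implementation in this PR: each operon is reduced to a signature of its features measured relative to the first feature, the signature is made orientation-independent by also computing it for the mirrored operon, and CRISPR arrays contribute only their place in the ordering. The `Feature` type, the `"CRISPR array"` label, and the function names are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Feature(NamedTuple):  # hypothetical stand-in for the real feature type
    name: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: int  # +1 or -1


ARRAY_NAME = "CRISPR array"  # assumed label for array features
Signature = Tuple[Tuple[str, int, int, int], ...]


def _signature(features: Sequence[Feature]) -> Signature:
    """Describe features relative to the first one; arrays keep only their order."""
    ordered = sorted(features, key=lambda f: f.start)
    origin = ordered[0].start  # assumes at least one feature per operon
    sig = []
    for f in ordered:
        if f.name == ARRAY_NAME:
            # Position, length, and strand are deliberately dropped for arrays.
            sig.append((f.name, -1, -1, 0))
        else:
            sig.append((f.name, f.start - origin, f.end - f.start, f.strand))
    return tuple(sig)


def _mirror(features: Sequence[Feature]) -> List[Feature]:
    """Coordinates of the same features as seen on the reverse complement."""
    span_end = max(f.end for f in features)
    return [Feature(f.name, span_end - f.end, span_end - f.start, -f.strand)
            for f in features]


def canonical_signature(features: Sequence[Feature]) -> Signature:
    """Pick one of the two orientations deterministically."""
    return min(_signature(features), _signature(_mirror(features)))


def deduplicate(operons: Sequence[Sequence[Feature]]) -> List[Sequence[Feature]]:
    """Keep the first operon seen for each canonical signature."""
    seen = set()
    unique = []
    for operon in operons:
        key = canonical_signature(operon)
        if key not in seen:
            seen.add(key)
            unique.append(operon)
    return unique
```

Because only feature coordinates are compared, each operon costs one sort and one set lookup, which is where the speed comes from. It is also why two operons whose sequences differ slightly can still collapse into one.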