Closed zhoudreames closed 2 years ago
Hi, can you attach the two sequences here?
Please also mention the command-line parameters I should use to reproduce this.
fa.zip
mashmap -r 2.fa -q 1.fa
Thanks for sharing the sequences. It looks like the longer sequence (-q 1.fa, the query) is a tandem repeat. Mashmap splits the longer sequence into non-overlapping segments, and each of them aligns to the reference (the short sequence, 2.fa). At this point it should ideally have reported the individual alignments, but it incorrectly merges them all because they fall within a threshold range on the reference sequence. The merging algorithm in Mashmap can be improved; I think a co-linear chaining algorithm would be a better way to merge.
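To make the difference concrete, here is a toy sketch in Python. It is not Mashmap's actual code: the function names, the tuple layout, and the gap threshold are all assumptions made for illustration, and the chaining shown is a simple greedy variant rather than a scored dynamic-programming chain. It uses four made-up copies of a 2645 bp unit, each mapping to the same reference interval.

```python
from typing import List, Tuple

# Hypothetical mapping record: (q_start, q_end, r_start, r_end)
Mapping = Tuple[int, int, int, int]


def merge_by_reference_proximity(mappings: List[Mapping], max_ref_gap: int) -> List[Mapping]:
    """Merge mappings that lie close together on the reference, ignoring query order."""
    if not mappings:
        return []
    ordered = sorted(mappings, key=lambda m: m[2])
    merged = [list(ordered[0])]
    for q_s, q_e, r_s, r_e in ordered[1:]:
        last = merged[-1]
        if r_s - last[3] <= max_ref_gap:
            # Close enough on the reference: grow the merged query and reference spans.
            last[0] = min(last[0], q_s)
            last[1] = max(last[1], q_e)
            last[3] = max(last[3], r_e)
        else:
            merged.append([q_s, q_e, r_s, r_e])
    return [tuple(m) for m in merged]


def chain_colinear(mappings: List[Mapping], max_gap: int) -> List[List[Mapping]]:
    """Greedy chaining: extend a chain only if query AND reference both move forward."""
    chains: List[List[Mapping]] = []
    for m in sorted(mappings):
        for chain in chains:
            last = chain[-1]
            q_gap = m[0] - last[1]
            r_gap = m[2] - last[3]
            if 0 <= q_gap <= max_gap and 0 <= r_gap <= max_gap:
                chain.append(m)
                break
        else:
            chains.append([m])
    return chains


unit = 2645  # reference length from this thread; the number of copies below is made up
tandem = [(i * unit, (i + 1) * unit, 0, unit) for i in range(4)]

merged = merge_by_reference_proximity(tandem, max_ref_gap=1000)
chains = chain_colinear(tandem, max_gap=1000)
print(merged)       # one mapping whose query span covers all four copies
print(len(chains))  # each copy stays in its own chain
```

Under these assumptions, proximity-only merging produces a single mapping whose query span is far longer than the reference span, which matches the symptom reported here. The co-linear check keeps the copies apart because the reference coordinate never advances from one copy to the next.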
Anyway, I wouldn't recommend using Mashmap for tandem repeats like these; k-mer Jaccard similarity is not reliable here.
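A small Python sketch of why this happens, using exact Jaccard similarity on full k-mer sets. Mashmap itself estimates similarity from sketches, so this is only an analogy, and the unit length, copy count, and k below are arbitrary choices for illustration. Repeating a unit adds almost no new k-mers, so the set similarity between one unit and many copies stays high however long the repeat grows.

```python
import random


def kmers(seq: str, k: int) -> set:
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard similarity of two sets."""
    return len(a & b) / len(a | b)


random.seed(0)
k = 16
unit = "".join(random.choice("ACGT") for _ in range(200))  # made-up repeat unit
tandem = unit * 50                                         # 50 tandem copies

unit_kmers = kmers(unit, k)
tandem_kmers = kmers(tandem, k)

# Only the k-mers spanning a copy boundary are new in the tandem sequence.
similarity = jaccard(unit_kmers, tandem_kmers)
print(len(unit), len(tandem), round(similarity, 3))
```

The tandem sequence is 50 times longer than the unit, yet the two k-mer sets are nearly identical, so a similarity score based on shared k-mers cannot tell which copy a segment came from.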
thanks for your reply~
The results look strange. Why can a sequence of length 1914830 be fully aligned to a sequence of length 2645?