(FWIW, I'm the author of lastz)
I'm new to mummer (running v4.0.0rc1) and tried it out on a small, randomly generated single-chromosome genome. I was expecting either one long alignment or a few shorter non-overlapping alignments; instead I got two alignments that have a long path in common. Is this intended?
The two sequences can be found at https://docs.google.com/document/d/1HUW0ocUHytplvUMb9BabDHFr4AJoELvQbT1bpB1FRKc. One is a random sequence; the other was created by simulating substitutions and indels (no rearrangements or duplications). Identity between these sequences should be ≈ 95%.
I ran
My expectation was that either the entirety of the two sequences would be reported as a single alignment or, failing that, the result would be split into a few smaller non-overlapping alignments.
Instead, what I see is two alignments. The first covers the entirety of the two sequences, as I expected. The second begins around 2.7K (following a couple of nearby deletions in ORANGE1) and traverses what appears to be the same alignment path as the first. So the two alignments have about 95 Kbp in common.
That seems like strange behavior for an aligner. Did I do something wrong, or am I misinterpreting the output? Is 5% divergence too high?
My initial test actually used a genome with 5 chromosome pairs, and I saw a similar problem on 3 of the chromosomes. On one chromosome the result was split into two non-overlapping alignments (probably reasonable). On the remaining chromosome there were two alignments that appear to overlap by about 40% (though I have not checked whether that is true base-by-base).
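For a rough check of how much two reported alignments overlap, here is a minimal sketch. It assumes each alignment has been reduced to its inclusive start and end on one sequence (for example, the [S1] and [E1] columns of `show-coords` output). The function name and the coordinates are hypothetical, made up for illustration, and are not taken from the run above.

```python
def overlap_fraction(a, b):
    """Return the shared span of intervals a and b as a fraction of the shorter one.

    Each interval is (start, end), 1-based and inclusive, with start <= end.
    """
    a_start, a_end = a
    b_start, b_end = b
    # Length of the intersection; clamp to zero when the intervals are disjoint.
    shared = max(min(a_end, b_end) - max(a_start, b_start) + 1, 0)
    shorter = min(a_end - a_start + 1, b_end - b_start + 1)
    return shared / shorter


# Hypothetical coordinates: one alignment spanning the whole sequence,
# and a second one that starts near 2.7K and runs to the same end.
first = (1, 100000)
second = (2700, 100000)
print(f"overlap: {overlap_fraction(first, second):.2f}")
```

This only compares intervals on one sequence. Confirming that two alignments follow the same path base-by-base would also need the coordinates on the other sequence and the positions of the gaps.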