mummer4 / mummer

Mummer alignment tool
Artistic License 2.0

dnadiff output questions #177

Open pc395 opened 2 years ago

pc395 commented 2 years ago

Hi, I recently ran dnadiff on two highly similar Mycobacterium tuberculosis sequences, each a single contig. Looking through the out.rdiff and out.qdiff files for the specific differences between the sequences, I noticed that most entries list their coordinates as higher position, then lower position, but a few list lower position, then higher position (see below). Could someone explain what is going on and what the order signifies? Also, what do the other numbers next to each difference mean? For example, why does one difference span ~20,000 bases while its last column just says -14? The actual sequence sizes differ by only about 6 kb.

NC_000962.3 JMP 1633539 1633530 -8
NC_000962.3 JMP 1987086 1987080 -5
NC_000962.3 GAP 2634134 2632914 -1219 845 -2064
NC_000962.3 GAP 3336803 3336503 -299 -123 -176
NC_000962.3 GAP 3425197 3405195 -20001 -19987 -14
NC_000962.3 JMP 3691071 3690950 -120
NC_000962.3 GAP 3732793 3730350 -2442 -714 -1728
NC_000962.3 GAP 3789690 3769681 -20008 -20017 9
NC_000962.3 GAP 3936416 3934870 -1545 -918 -627
NC_000962.3 GAP 3948004 3947880 -123 -15 -108
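
To make the question about the extra columns concrete, here is a small Python sketch that checks the arithmetic between the columns in the ten rows above. The field names (`start`, `end`, `length`, `other_length`, `diff`) are my own labels, not anything taken from the dnadiff documentation, and the two relations it tests are only what these rows appear to satisfy: column 5 equals end - start + 1, and for GAP rows column 7 equals column 5 minus column 6. Reading column 6 as the matching gap length in the other sequence is an assumption.

```python
# Rows copied from out.rdiff as shown above.
ROWS = """\
NC_000962.3 JMP 1633539 1633530 -8
NC_000962.3 JMP 1987086 1987080 -5
NC_000962.3 GAP 2634134 2632914 -1219 845 -2064
NC_000962.3 GAP 3336803 3336503 -299 -123 -176
NC_000962.3 GAP 3425197 3405195 -20001 -19987 -14
NC_000962.3 JMP 3691071 3690950 -120
NC_000962.3 GAP 3732793 3730350 -2442 -714 -1728
NC_000962.3 GAP 3789690 3769681 -20008 -20017 9
NC_000962.3 GAP 3936416 3934870 -1545 -918 -627
NC_000962.3 GAP 3948004 3947880 -123 -15 -108
"""


def parse(line):
    """Split one whitespace-separated row into labelled fields.

    The labels are hypothetical; they are not taken from the tool's docs.
    """
    fields = line.split()
    return {
        "seq": fields[0],
        "kind": fields[1],
        "start": int(fields[2]),
        "end": int(fields[3]),
        "length": int(fields[4]),
        # GAP rows carry two extra integers; JMP rows carry none.
        "extra": [int(x) for x in fields[5:]],
    }


def is_consistent(rec):
    """Check the two relations these rows appear to follow."""
    ok = rec["length"] == rec["end"] - rec["start"] + 1
    if rec["kind"] == "GAP":
        other_length, diff = rec["extra"]  # assumed meaning of columns 6 and 7
        ok = ok and diff == rec["length"] - other_length
    return ok


records = [parse(line) for line in ROWS.splitlines() if line.strip()]

for rec in records:
    print(rec["kind"], rec["start"], rec["end"], rec["length"],
          rec["extra"], "consistent" if is_consistent(rec) else "MISMATCH")
```

Under that reading, the row with -20001 -19987 -14 would mean the two lengths differ by 14 bases, which is why the last column is small even though the coordinates are about 20 kb apart. What a negative length signifies is still the open question.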

Any help would be appreciated! Thanks