guanchangge / mosaik-aligner

Automatically exported from code.google.com/p/mosaik-aligner

Mosaik failed to align reads to a reference genome correctly, but shuffling the chromosomes in the reference genome solved the problem #105

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?

I have two sets of reads, generated by two different technologies, from the genome of a particular species. I also have the genome of a similar species, which I use as the reference genome.
1. I used MosaikBuild to convert the reads and the reference genome to Mosaik's format.
2. I used MosaikAligner to align each set of reads to the reference genome 
separately.
3. I used MosaikCoverage to see the coverage profiles.

What is the expected output? What do you see instead?

I expected roughly uniform mean coverage across all chromosomes. Instead, there
is a sharp drop in mean coverage between the first 8 chromosomes and the
remaining chromosomes: the mean coverage is higher than it should be for the
first group and far lower than the expected value for the second group.
When I changed the order of the chromosomes in the reference genome, the
problem was resolved.
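
A rough way to quantify "higher than it should be" and "far lower than expected" is to compare each chromosome's mean coverage with the depth expected if the reads were spread uniformly over the genome (number of reads x read length / genome length). The sketch below is only illustrative: the function names, the 50% tolerance, and every number in the usage example (including the genome length) are assumptions, not values from this report or from MosaikCoverage output.

```python
def expected_coverage(num_reads, read_length, genome_length):
    """Expected mean depth if reads were spread uniformly: N * L / G."""
    return num_reads * read_length / genome_length


def flag_outliers(mean_cov_by_chrom, expected, tolerance=0.5):
    """Return chromosomes whose mean coverage differs from `expected`
    by more than `tolerance`, given as a fraction of `expected`."""
    return [
        name
        for name, cov in mean_cov_by_chrom.items()
        if abs(cov - expected) > tolerance * expected
    ]


if __name__ == "__main__":
    # Hypothetical genome length and per-chromosome means, for illustration only.
    expected = expected_coverage(30_000_000, 76, 228_000_000)
    observed = {"chr1": 18.0, "chr5": 10.5, "chr9": 2.0}
    print(expected, flag_outliers(observed, expected))
```

Chromosomes flagged on both sides of the expected value, with the split falling at a fixed position in the reference order, would match the pattern described above.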

What version of the product are you using? On what operating system?

Mosaik-1.0.1388 / GNU/Linux Debian

Please provide any additional information below.

I repeated the process from the beginning with different input parameters, but
I got the same result. Then I aligned the second set of reads to the same
reference genome and again saw an unexplained drop in the mean coverage after
chromosome 8.
To work around the problem, I changed the order of the chromosomes in the
reference genome and repeated the process from the beginning, using the
previous set of input parameters. Surprisingly, this time Mosaik aligned the
reads more uniformly, and the result was similar to what I expected for both
sets of reads.
One of the input sets contains almost 30 million single-end 76 bp reads.
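
For anyone trying to reproduce the workaround, the chromosome reordering can be sketched as below. This is a minimal illustration that assumes the reference is a plain multi-FASTA file with one record per chromosome. The helper names and the file names in the usage comment are hypothetical, and this is not necessarily how the reference was reordered for this report.

```python
import random


def read_fasta(lines):
    """Parse FASTA lines into a list of (header, [sequence lines]) records."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            records.append((line, []))
        elif line and records:
            records[-1][1].append(line)
    return records


def shuffle_records(records, seed=None):
    """Return a shuffled copy of the records; the input list is left unchanged."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def write_fasta(records):
    """Serialize records back to FASTA text, keeping the original line wrapping."""
    out = []
    for header, seq_lines in records:
        out.append(header)
        out.extend(seq_lines)
    return "\n".join(out) + "\n"


# Usage with hypothetical file names:
# with open("reference.fa") as src:
#     records = read_fasta(src)
# with open("reference.shuffled.fa", "w") as dst:
#     dst.write(write_fasta(shuffle_records(records, seed=42)))
```

The reordered file then has to go through MosaikBuild again before aligning, since the whole process was repeated from the beginning.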

Original issue reported on code.google.com by ham...@gmail.com on 30 Jul 2011 at 1:45