What steps will reproduce the problem?
1. Take a 454PairAlign.txt file that contains reads that are mapped to multiple
locations, and use that to create a SAM file with Newbler2SAM (1.0b2-dev):
glu seq.Newbler2SAM --unaligned=drop -o chr3.sam 454PairAlign.txt ../sff/LOTOFREADS.sff
2. If a read is mapped to multiple locations, with no other read in between
in the 454PairAlign.txt file, Newbler2SAM sorts those alignments from longest
to shortest
3. Some programs, like the IGV viewer, cannot load the resulting SAM file
because they need the reads sorted by start position
What is the expected output? What do you see instead?
I expect data sorted by start position, and I see data that is swapped at some
locations.
What version of the product are you using? On what operating system?
glu-genetics 1.0b2-dev on ubuntu 10.04
Please provide any additional information below.
To fix this, I built in an additional sorting step to get the data back in
its original order (see patch; I also added some additional output to report
activity when handling large files). However, I don't know whether people prefer
to have the longest read of a set of subsequent multiple-mapped reads in front.
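
For anyone who needs a workaround without the patch, here is a minimal sketch of sorting an existing SAM file by start position. This is not the code from the attached patch; the function name sort_sam_lines is hypothetical, and it assumes tab-separated SAM lines with the reference name in column 3 and the 1-based start position in column 4. Reference names are compared as plain strings, which may not match the order a particular viewer expects across chromosomes.

```python
def sort_sam_lines(lines):
    """Return SAM lines with headers first and alignments sorted by (RNAME, POS).

    Hypothetical helper for illustration; not part of glu.
    """
    # Header lines start with '@' and must stay at the top of the file.
    headers = [ln for ln in lines if ln.startswith("@")]
    # Everything else that is not blank is treated as an alignment record.
    records = [ln for ln in lines if ln.strip() and not ln.startswith("@")]

    def start_key(ln):
        fields = ln.split("\t")
        # Column 3 is the reference name, column 4 the 1-based start position.
        return (fields[2], int(fields[3]))

    # sorted() is stable, so records with the same start keep their input order.
    return headers + sorted(records, key=start_key)


if __name__ == "__main__":
    import sys

    # Usage: python sort_sam.py < chr3.sam > chr3.sorted.sam
    sys.stdout.writelines(sort_sam_lines(sys.stdin.readlines()))
```

Because the sort is stable, reads that share a start position stay in the order in which they were written, so this only moves the records that are actually out of position.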
Original issue reported on code.google.com by bratdak...@gmail.com on 11 Oct 2010 at 3:10