duncanca / mosaik-aligner

Automatically exported from code.google.com/p/mosaik-aligner

MosaikSort error - when determining whether to apply mate-pair or paired-end constraints, an irregularity in the alignment model counts was discovered. #95

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. MosaikBuild
2. MosaikAligner
3. MosaikSort <- crashed

What is the expected output? a sorted file
MosaikSort  1.1.0021  2010-11-10
Michael Stromberg & Wan-Ping Lee  Marth Lab, Boston College Biology Department
---------------------------------------------------
- resolving the following types of read pairs: [unique vs unique] [unique vs multiple]

- phase 1 of 3: building fragment length distribution:
samples: 1,000,000 (39,120.6 samples/s)

- resolving paired-end alignments

- phase 2 of 3: resolve read pairs:
100%[=============]  26,830.9 reads/s       in 21:20  

- phase 3 of 3: sort resolved read pairs:
100%[==============] 253,691.1 alignments/s       in 01:47  

Paired-end read statistics:
================================================================
                          original             resolved
----------------------------------------------------------------
# orphaned:                4661983 (13.6 %)           0 ( 0.0 %)
# both mates unique:      18809755 (54.8 %)    12010020 (35.0 %)
# one mate non-unique:     6531289 (19.0 %)     1592264 ( 4.6 %)
# both mates non-unique:   4348737 (12.7 %)           0 ( 0.0 %)
----------------------------------------------------------------
total:                    34351764             13602284 (39.6 %)

Fragment statistics:
================================
min target frag len:          57
median target frag len:      194
max target frag len:         472

MosaikSort CPU time: 1426.290 s, wall time: 1455.384 s

What do you see instead?
-----------------------------------------
MosaikSort 1.1.0021  2010-11-10
Michael Stromberg & Wan-Ping Lee  Marth Lab, Boston College Biology Department
----------------------------------------------------------

- resolving the following types of read pairs: [unique vs unique] [unique vs multiple]

- phase 1 of 3: building fragment length distribution:
samples: 1,000,000 (35,014.0 samples/s)

ERROR: When determining whether to apply mate-pair or paired-end constraints, an irregularity in the alignment model counts was discovered.

Normal mate-pair data sets have the highest counts for alignment models: 4 & 5.
Normal paired-end data sets have the highest counts for alignment models: 2 & 6.
Normal solid-end data sets have the highest counts for alignment models: 1 & 8.

We expect that the ratio of the 6 lowest counts to the 2 highest counts to be no larger than 0.10, but in this data set the ratio was 0.12

- alignment model 6:     81411 hits
- alignment model 2:     74197 hits
- alignment model 5:      5101 hits
- alignment model 8:      4738 hits
- alignment model 4:      2622 hits
- alignment model 3:      2394 hits
- alignment model 1:      2336 hits
- alignment model 7:      1895 hits
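
The 0.12 figure can be reproduced from the hit counts above. The following is a minimal sketch, not MosaikSort's actual code: it assumes the ratio is the sum of the six lowest counts divided by the sum of the two highest, and the function names (`model_ratio`, `guess_library_type`) are hypothetical. The model-to-library mapping and the 0.10 limit are taken from the error message.

```python
# Hit counts per alignment model, copied from the error output above.
hits = {6: 81411, 2: 74197, 5: 5101, 8: 4738, 4: 2622, 3: 2394, 1: 2336, 7: 1895}

# Expected top-two models per library type, as stated in the error message.
EXPECTED_TOP_MODELS = {
    "mate-pair": {4, 5},
    "paired-end": {2, 6},
    "solid-end": {1, 8},
}

MAX_RATIO = 0.10  # threshold quoted in the error message


def model_ratio(hits):
    """Return (ratio, top_models).

    Assumption: ratio = sum of the six lowest counts / sum of the two highest.
    """
    ranked = sorted(hits.items(), key=lambda kv: kv[1], reverse=True)
    top, rest = ranked[:2], ranked[2:]
    ratio = sum(count for _, count in rest) / sum(count for _, count in top)
    return ratio, {model for model, _ in top}


def guess_library_type(top_models):
    """Map the two most frequent models to a library type, or None if no match."""
    for name, models in EXPECTED_TOP_MODELS.items():
        if models == top_models:
            return name
    return None


ratio, top_models = model_ratio(hits)
print(f"top models: {sorted(top_models)} -> {guess_library_type(top_models)}")
print(f"ratio: {ratio:.2f} (limit {MAX_RATIO:.2f})")
```

With these counts the two highest models are 2 and 6 (the paired-end pattern), and the ratio is 19086 / 155608, about 0.12, which is just over the 0.10 limit.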

What version of the product are you using? 
MosaikSort 1.1.0021     2010-11-10                   

On what operating system?
Linux - Red Hat

Please provide any additional information below.
Below are the commands we ran for all 25 of our samples. Only M4 crashed,
and it did so while running MosaikSort (see error message above).

MosaikBuild -q M4_1.fq -q2 M4_2.fq -out M4_hg18.bin -cn hudson -st sanger -sam M4
MosaikAligner -in M4_hg18.bin -out M4_hg18.aligned.bin -ia hg18.fa.bin -j hg18_hs15.MosaikJumpDb -hs 15 -mm 4 -act 20 -mhp 100 -m all -bw 13 -p 20
MosaikSort -in M4_hg18.aligned.bin -out M4_hg18.aligned.bin.srt -iuo

Original issue reported on code.google.com by britt...@gmail.com on 11 Mar 2011 at 6:14

GoogleCodeExporter commented 8 years ago
Same here, mapping Illumina data to yeast with the same version.
As a workaround I increased the mapping stringency, which resulted in fewer mapped reads
and thus somehow shifted the "model ratio". Not really happy with this.

System: home-made Linux, 64-bit

I do not remember the command line(s), as this happened a few weeks ago. I just was 
not sure whether this is an error or a feature ;-)

Original comment by sir.sven...@gmail.com on 29 Mar 2011 at 12:54