wanpinglee / MOSAIK

reference-guided aligner for next-generation sequencing technologies
http://code.google.com/p/mosaik-aligner/

MOSAIK segmentation fault #14

Open kspham opened 9 years ago

kspham commented 9 years ago

Thanks for developing MOSAIK. I tried it on a human dataset and got a segmentation fault. Any hints? Thank you!

/home/snow/bin/MOSAIK/bin/MosaikAligner -in LID115260_MERGE1.mkb -out ../bamfiles/LID115260_MERGE1.mka -ia /home/snow/bin/gkno_launcher/resources/homo_sapiens/current//human_reference_v37.dat -j /home/snow/bin/gkno_launcher/resources/homo_sapiens/current//human_reference_v37_15 -annpe /home/snow/bin/gkno_launcher/resources/homo_sapiens/current//pe.100.01.ann -annse /home/snow/bin/gkno_launcher/resources/homo_sapiens/current//se.100.005.ann -p 32

MosaikAligner 2.2.30 2014-06-27

Wan-Ping Lee & Michael Stromberg Marth Lab, Boston College Biology Department

Aligning read library (572809234): 0% [ ] |Segmentation fault (core dumped)
snow@ubuntu:/raid/sonpham/schipairs/originalbam/tangram/mosaikfastq$

markziemann commented 9 years ago

I also get a segfault. I'm running Ubuntu 14.04.1 LTS (64-bit).

MosaikBuild -fr Oryza_sativa_mirbase21_LEN21_SETP1.fasta -assignQual 30 -st illumina -out Oryza_sativa_mirbase21_LEN21_SETP1.fasta.mos

MosaikJump -ia /data/projects/mziemann/microRNA_aligners/refgenomes/Oryza_sativa_mosaik/Oryza_sativa.IRGSP-1.0.26.dna_sm.toplevel.fa.ref -out /data/projects/mziemann/microRNA_aligners/refgenomes/Oryza_sativa_mosaik/Oryza_sativa.IRGSP-1.0.26.dna_sm.toplevel.fa.ref.jump -hs 15

MosaikAligner -in Oryza_sativa_mirbase21_LEN21_SETP1.fasta.mos -out Oryza_sativa_mirbase21_LEN21_SETP1.fasta.mosaik -ia /data/projects/mziemann/microRNA_aligners/refgenomes/Oryza_sativa_mosaik/Oryza_sativa.IRGSP-1.0.26.dna_sm.toplevel.fa.ref -j /data/projects/mziemann/microRNA_aligners/refgenomes/Oryza_sativa_mosaik/Oryza_sativa.IRGSP-1.0.26.dna_sm.toplevel.fa.ref.jump -annse /data/projects/mziemann/app/MOSAIK/src/networkFile/2.1.78.se.ann -annpe /data/projects/mziemann/app/MOSAIK/src/networkFile/2.1.78.pe.ann


MosaikAligner 2.2.0 2015-03-30

Wan-Ping Lee & Michael Stromberg Marth Lab, Boston College Biology Department

Aligning read library (65993): 0% [ ] |Segmentation fault (core dumped)

kulvait commented 9 years ago

On my computer, the segmentation fault was caused by the call queryString.append( al.Query.CData() ); Valgrind reports an invalid read at address 0x0:

==17765== Invalid read of size 16
==17765==    at 0x512771: strlen (in /x/utils/MOSAIK/bin/MosaikAligner)
==17765==    by 0x4C5C8B: std::string::append(char const) (in /x/utils/MOSAIK/bin/MosaikAligner)
==17765==    by 0x40AA85: CAlignmentThread::SetRequiredInfo(Alignment&, CAlignmentThread::AlignmentStatusType const&, Alignment&, Mosaik::Mate const&, Mosaik::Read const&, bool const&, bool const&, bool const&, bool const&, bool const&, bool const&) (AlignmentThread.cpp:1342)
==17765==    by 0x40EA02: CAlignmentThread::AlignReadArchive(MosaikReadFormat::CReadReader, MosaikReadFormat::CAlignmentWriter, unsigned long, bool, CStatisticsMaps, CAlignmentThread::BamWriters, unsigned char) (AlignmentThread.cpp:1048)
==17765==    by 0x4100A2: CAlignmentThread::StartThread(void*) (AlignmentThread.cpp:132)
==17765==    by 0x47CCCF: start_thread (pthread_create.c:304)
==17765==    by 0x53E898: clone (in /x/utils/MOSAIK/bin/MosaikAligner)
==17765==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
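
The trace above shows strlen reading from address 0x0 underneath std::string::append. That is what one would expect if al.Query.CData() returned a null pointer, since appending a null C string to a std::string is undefined behavior. The following is only a minimal sketch of a null guard, not MOSAIK's actual code and not a confirmed fix: the helper name AppendIfNotNull is hypothetical, and whether silently skipping the append is the right behavior for the aligner is an assumption that would need checking against the surrounding code.

```cpp
#include <string>

// Hypothetical helper (not part of MOSAIK): append a C string to dest only
// when the pointer is non-null. Returns false, leaving dest unchanged, if
// src is null.
bool AppendIfNotNull(std::string& dest, const char* src) {
    if (src == nullptr) {
        // std::string::append(const char*) needs a valid C string; passing
        // a null pointer is undefined behavior and typically crashes when
        // the library measures the string's length.
        return false;
    }
    dest.append(src);
    return true;
}
```

Under these assumptions, the call at AlignmentThread.cpp:1342 could be written as AppendIfNotNull(queryString, al.Query.CData()), and a false return would be worth logging, because it would mean the alignment reached that point without a query sequence.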

holtgrewe commented 9 years ago

I'm getting a crash using MosaikBuild.

alexstaj commented 8 years ago

So how did you fix this?

bede commented 8 years ago

I'm now getting a segfault when running MosaikAligner on a small number of simulated reads, under both Mac OS X and Ubuntu amd64:

$ ./MosaikBuild -fr ../reads/bac/mycoplasma_genitali.fasta -oa mycoplasma_genitali.mskref
------------------------------------------------------------------------------
MosaikBuild 2.2.30                                                  2014-06-27
Wan-Ping Lee & Michael Stromberg  Marth Lab, Boston College Biology Department
------------------------------------------------------------------------------

- converting ../reads/bac/mycoplasma_genitali.fasta to a reference sequence archive.

- parsing reference sequences:
ref seqs: 1 (2.00 ref seqs/s)

- writing reference sequences:
100%[===================================================================================]      1.00 ref seqs/s        in  1 s  

- calculating MD5 checksums:
100%[===================================================================================]      1.00 ref seqs/s        in  1 s  

- writing reference sequence index:
100%[===================================================================================]      1.00 ref seqs/s        in  1 s  

- creating concatenated reference sequence:
100%[===================================================================================]      1.00 ref seqs/s        in  1 s  

- writing concatenated reference sequence...        finished.
- creating concatenated 2-bit reference sequence... finished.
- writing concatenated 2-bit reference sequence...  finished.

MosaikBuild CPU time: 0.021 s, wall time: 2.012 s

$ ./MosaikBuild -st 454 -q ../curesim/sim_mycoplasma_genitali_ind0_sub0.fastq -out sim_mycoplasma_genitali_ind0_sub0.mskreads
------------------------------------------------------------------------------
MosaikBuild 2.2.30                                                  2014-06-27
Wan-Ping Lee & Michael Stromberg  Marth Lab, Boston College Biology Department
------------------------------------------------------------------------------

- setting read group ID to: ZTK0O1H98OS
- setting sample name to: unknown
- setting sequencing technology to: 454

- parsing FASTQ file:
reads: 50,000 (49,950.0 reads/s)

Filtering statistics:
============================================
# reads written:             50000
# bases written:          19966537

MosaikBuild CPU time: 1.224 s, wall time: 1.309 s

$ ./MosaikAligner -ia mycoplasma_genitali.mskref -in sim_mycoplasma_genitali_ind0_sub0.mskreads -out mycoplasma_genitali_ind0_sub0.mosaik -annse 2.1.78.se.ann -annpe 2.1.78.pe.ann
------------------------------------------------------------------------------
MosaikAligner 2.2.30                                                2014-06-27
Wan-Ping Lee & Michael Stromberg  Marth Lab, Boston College Biology Department
------------------------------------------------------------------------------

- Using the following alignment algorithm: all positions
- Using the following alignment mode: aligning reads to all possible locations
- Using a maximum mismatch percent threshold of 0.15
- Using a Smith-Waterman bandwidth of 151
- Using an alignment candidate threshold of 55bp.
- Setting hash position threshold to 200
- Using a homo-polymer gap open penalty of 4
- loading reference sequence... finished.

Hashing reference sequence:
100%[==================================================================================] 386,193.1 ref bases/s        in  1 s  

Aligning read library (50000):
 0% [                                                                                      ]                                  \Segmentation fault (core dumped)

Any ideas? We'll have to exclude MOSAIK from our study unless we can run it.