hsinnan75 / MapCaller

MapCaller – An efficient and versatile approach for short-read alignment and variant detection in high-throughput sequenced genomes
MIT License

Identify all variants... Segmentation fault (core dumped) #3

Closed by tseemann 4 years ago

tseemann commented 4 years ago

Can you add a -debug option to help find the problem?

Load the genome index files...
Load the reference sequence (1 chromosome, total size = 5851561 bp)...
Initialize the alignment profile...
All the 1230194 paired-end reads have been processed in 6 seconds.
      995425 ( 80.91%) reads are mapped to the reference genome.
        Est. AvgCoverage = 35
      386496 ( 31.41%) reads are mapped as paires.
        Est. fragment size = 464, insert size = -110
Identify all variants...
Segmentation fault (core dumped)

tseemann commented 4 years ago

You can reproduce this error using the following public data:

MapCaller -t 36 -f SRR4324922_1.fastq.gz -f2 SRR4324922_2.fastq.gz -i CP012885.fna

hsinnan75 commented 4 years ago

Thank you! I've tried the files, but I did not get the error.

Load the genome index files...
Load the reference sequence (1 chromosome, total size = 5852822 bp)...
Initialize the alignment profile...
All the 1230194 paired-end reads have been processed in 9 seconds.
      993561 ( 80.76%) reads are mapped to the reference genome.
        Est. AvgCoverage = 35
      387674 ( 31.51%) reads are mapped in pairs.
        Est. fragment size = 464, insert size = -110
Identify all variants...
        Write all the predicted sample variations to file [output.vcf]...
        4288(snp); 59(ins); 56(del); 0(trans); 0(inversion)
variant calling has been done in 0 seconds.

Could you please send me your fasta? Thank you!

tseemann commented 4 years ago

Yes, you seem to have a different-sized reference: I was using CP012885.1 and you used CP012885.2, but I don't think that's the problem.

Mine had this header:

>gi|1071930522|gb|CP012885.1| Mycobacterium chimaera strain AH16, complete genome

I will try one with this header:

>CP012885.2 Mycobacterium chimaera strain AH16 chromosome, complete genome

Let's run:

bwt_index CP012885.2.fa ref
[bwt_index] Pack FASTA... 0.04 sec
[bwt_index] Construct BWT for the packed sequence...
[BWTIncCreate] textLength=11705644, availableWord=12823584
[bwt_gen] Finished constructing BWT in 6 iterations.
[bwt_index] 1.96 seconds elapse.
[bwt_index] Update BWT... 0.03 sec
[bwt_index] Pack forward-only FASTA... 0.02 sec
[bwt_index] Construct SA from BWT and Occ... 0.61 sec

MapCaller -t 72 -f SRR4324922_1.fastq.gz -f2 SRR4324922_2.fastq.gz -i ref
Load the genome index files...
Load the reference sequence (1 chromosome, total size = 5852822 bp)...
Initialize the alignment profile...
All the 1230194 paired-end reads have been processed in 6 seconds.
      993566 ( 80.76%) reads are mapped to the reference genome.
        Est. AvgCoverage = 35
      387682 ( 31.51%) reads are mapped as paires.
        Est. fragment size = 464, insert size = -110
Identify all variants...
Segmentation fault (core dumped)

I compiled with gcc 5.5.0; what are you using?

tseemann commented 4 years ago

OK, I compiled with gcc 9.2 and it works:

        Write all the predicted sample variations to file [output.vcf]...
        4288(snp); 59(ins); 56(del); 0(trans); 0(inversion)
variant calling has been done in 0 seconds.
All done! It took 6 seconds to complete the data analysis.

It also works with gcc 8 and gcc 7.
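
For anyone hitting the same crash, here is a minimal sketch of a pre-build check based only on the versions mentioned in this thread: the segfault appeared with a gcc 5.5.0 build, while gcc 7, 8 and 9.2 builds ran to completion. The helper names gcc_major and gcc_known_good are hypothetical, the threshold of 7 is an assumption (gcc 6 was not tested here), and the compiler version may turn out not to be the root cause.

```shell
#!/bin/sh
# Sketch only: the threshold of 7 is an assumption taken from the gcc
# versions reported to work in this thread (7, 8, 9.2). gcc 5.5.0 gave
# the segfault; gcc 6 was not tested.
# gcc_major and gcc_known_good are hypothetical helper names.

# Print the major component of a version string such as "5.5.0".
gcc_major() {
    echo "${1%%.*}"
}

# Succeed if the major version is at least 7.
gcc_known_good() {
    [ "$(gcc_major "$1")" -ge 7 ]
}

# Check the compiler on PATH, if there is one.
# "gcc -dumpversion" prints a version such as "9.2.0" or just "9".
if command -v gcc >/dev/null 2>&1; then
    ver=$(gcc -dumpversion)
    if gcc_known_good "$ver"; then
        echo "gcc $ver: at or above the versions reported to work here"
    else
        echo "gcc $ver: older than 7; the segfault was reported with 5.5.0"
    fi
fi
```

Running the script before building prints a one-line note about the compiler on PATH; it does not inspect or modify MapCaller itself.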

tseemann commented 4 years ago

I think the real problem is #6