Closed: madeluis closed this issue 4 years ago
Hi, I am running into the same problem. Have you resolved it?
Hello,
I was able to reproduce your problem
$ bsmapz -a SRR6328781_1.fastq -b SRR6328781_2.fastq -d WCG_genome_v2.fa -o SRR6328781.bam -p 64 -A AGATCGGAAGAGC -w 100 -r 0 -q 10
[bsmapz] @Wed Aug 5 15:11:59 2020 loading reference file: WCG_genome_v2.fa (format: FASTA)
Segmentation fault
and solved it by splitting the reference sequence into lines of 70 characters with fasta_formatter (from the FASTX-Toolkit).
$ fasta_formatter -i WCG_genome_v2.fa -o WCG_genome_v2_70w.fa -w 70
$ ./bsmapz -a SRR6328781_1.fastq -b SRR6328781_2.fastq -d WCG_genome_v2_70w.fa -o SRR6328781.bam -p 64 -A AGATCGGAAGAGC -w 100 -r 0 -q 10
[bsmapz] @Wed Aug 5 15:14:40 2020 loading reference file: WCG_genome_v2_70w.fa (format: FASTA)
[bsmapz] @Wed Aug 5 15:14:57 2020 12 reference seqs loaded, total size 404611775 bp. 17 secs passed
[bsmapz] @Wed Aug 5 15:15:11 2020 create seed table. 31 secs passed
[bsmapz] @Wed Aug 5 15:15:11 2020 Pair-end alignment(64 threads),
Input read file #1: SRR6328781_1.fastq (format: FASTQ)
Input read file #2: SRR6328781_2.fastq (format: FASTQ)
Output file: SRR6328781.bam (format: SAM, automatically convert to BAM)
You can use a width other than 70; BSMAPz just can't handle a whole chromosome on a single line.
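If fasta_formatter is not available, the same rewrapping can be sketched in a few lines of Python. This is only an illustration, not part of BSMAPz or the FASTX-Toolkit: the function name `rewrap_fasta` and its `width` parameter are made up for this example, and it assumes a plain-text (uncompressed) FASTA file.

```python
import io


def rewrap_fasta(src, dst, width=70):
    """Copy FASTA records from src to dst, wrapping sequence lines to `width`.

    src and dst are text file objects. Header lines (starting with '>')
    are copied unchanged. Hypothetical helper, for illustration only.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    carry = ""  # residues of the current record not yet written out
    for line in src:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # Flush the tail of the previous record before the new header.
            if carry:
                dst.write(carry + "\n")
                carry = ""
            dst.write(line + "\n")
        else:
            # Note: a chromosome stored on one line is read into memory whole.
            buf = carry + line.strip()
            full = len(buf) - len(buf) % width
            for i in range(0, full, width):
                dst.write(buf[i:i + width] + "\n")
            carry = buf[full:]
    if carry:
        dst.write(carry + "\n")


# Small in-memory demonstration with a width of 4.
demo_in = io.StringIO(">chr1 test\nACGTACGTAC\nGT\n>chr2\nAAAA\n")
demo_out = io.StringIO()
rewrap_fasta(demo_in, demo_out, width=4)
```

For a real file you would pass opened file objects instead, for example `open("WCG_genome_v2.fa")` as the source and `open("WCG_genome_v2_70w.fa", "w")` as the destination.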
Good, I've got it. Thanks a lot!
Hello,
I am trying to run BSMAPz with some watermelon data. The reference genome that I am using can be found here:
ftp://cucurbitgenomics.org/pub/cucurbit/genome/watermelon/WCG/v2/
My strong preference is to use the chromosome FASTA file as my reference (WCG_genome_v2.fa, from the link above). However, using this file leads to a segmentation fault. If I use the scaffold reference (WCG_scaffold_v2.fa) instead, my command runs just fine. My first guess was that this was a memory issue (i.e., in the first case the program tries to load an entire chromosome at once and cannot allocate that much memory, which is not a problem with the scaffolds because they are smaller). However, I find this hard to believe, since I am using an instance with 768 GB of memory and the problem persists even if I run it with one core.
Any insights into this would be greatly appreciated (command and error below).
Thank you very much! Angels
Command: bsmapz -a ./data/SRR6328781_1.fastq -b ./data/SRR6328781_2.fastq -d ./data/WCG_genome_v2.fa -o SRR6328781.bam -p 8 -A AGATCGGAAGAGC -w 100 -r 0 -q 10
Error: loading reference file: ./data/WCG_genome_v2.fa (format: FASTA) Segmentation fault (core dumped)
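Since the reply in this thread attributes the crash to line length rather than memory, one quick check is to measure the longest sequence line in each reference file. The snippet below is a rough sketch for that purpose only; `longest_sequence_line` is a hypothetical helper, and it assumes an uncompressed, plain-text FASTA file.

```python
import io


def longest_sequence_line(src):
    """Return the length of the longest non-header line in a FASTA text stream."""
    longest = 0
    for line in src:
        if not line.startswith(">"):
            # Ignore the trailing newline when measuring the line.
            longest = max(longest, len(line.rstrip("\r\n")))
    return longest


# In-memory demonstration: the second record is stored on one 7-character line.
demo = io.StringIO(">a\nACGT\nAC\n>b\nAAAAAAA\n")
demo_longest = longest_sequence_line(demo)
```

Running it on both WCG_genome_v2.fa and WCG_scaffold_v2.fa (by passing an opened file object) would show whether the chromosome file stores each sequence on a single very long line.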