ExaScience / elprep

elPrep: a high-performance tool for analyzing sequence alignment/map files in sequencing pipelines.

panic: runtime error: index out of range #19

Closed pwwang closed 5 years ago

pwwang commented 5 years ago

I was trying to run elprep with most of the filters on:

2019/01/31 09:59:09 Executing command:
 elprep filter /data/input/DX123_HFJ5KDSXX_L2.sam /data/output/DX123_HFJ5KDSXX_L2.bam --mark-duplicates --mark-optical-duplicates /data/output/DX123_HFJ5KDSXX_L2.opticaldups.txt --optical-duplicates-pixel-distance 100 --remove-duplicates --bqsr /data/output/DX123_HFJ5KDSXX_L2.bqsr.txt --bqsr-reference /data/reference/hg19/ucsc_hg19.fa.elprep --quantize-levels 0 --sorting-order coordinate --nr-of-threads 30 --log-path /data/output
panic: runtime error: index out of range
goroutine 166040 [running]:
runtime/debug.Stack(0xdcc8256380, 0x0, 0xc00011e700)
    /opt/local/lib/go/src/runtime/debug/stack.go:24 +0xa7
github.com/exascience/pargo/internal.WrapPanic(0x5aa000, 0x741450, 0x741450, 0xe1cf34ab)
    /Users/caherzee/go/pkg/mod/github.com/exascience/pargo@v1.0.0/internal/internal.go:41 +0x45
github.com/exascience/pargo/parallel.RangeReduce.func1.1.1(0xd559ba65a0, 0xd55ce65a50)
    /Users/caherzee/go/pkg/mod/github.com/exascience/pargo@v1.0.0/parallel/parallel.go:803 +0x43
panic(0x5aa000, 0x741450)
    /opt/local/lib/go/src/runtime/panic.go:513 +0x1b9
github.com/exascience/elprep/v4/filters.computeSnpEvents(0xd0c194a3c0, 0x0, 0x0, 0x0, 0xd0c1933800, 0x1e, 0x100, 0x4, 0x1e, 0x100)
    /Users/caherzee/Documents/Work/Code/elprep/filters/bqsr.go:326 +0x3b3
github.com/exascience/elprep/v4/filters.(*BaseRecalibrator).Recalibrate.func2(0x457055b, 0x469da0b, 0xc00004a220, 0xc00004aca0)
    /Users/caherzee/Documents/Work/Code/elprep/filters/bqsr.go:952 +0x65a
github.com/exascience/pargo/parallel.RangeReduce.func1(0x457055b, 0x469da0b, 0x1, 0xd55ce65a50, 0x96)
    /Users/caherzee/go/pkg/mod/github.com/exascience/pargo@v1.0.0/parallel/parallel.go:789 +0x2ba
github.com/exascience/pargo/parallel.RangeReduce.func1.1(0xd559ba65a0, 0xd55ce65a50, 0xd0c21c34e8, 0x457055b, 0x469da0b, 0x2, 0x1, 0xd559ba6590)
    /Users/caherzee/go/pkg/mod/github.com/exascience/pargo@v1.0.0/parallel/parallel.go:806 +0x83
created by github.com/exascience/pargo/parallel.RangeReduce.func1
    /Users/caherzee/go/pkg/mod/github.com/exascience/pargo@v1.0.0/parallel/parallel.go:801 +0x1d2

The SAM file was created by mapping raw reads to the reference using bwa. The reference was generated by elprep fasta-to-elfasta.

>uname -a
Linux hostname 2.6.32-696.28.1.el6.x86_64 #1 SMP Wed May 9 23:09:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Any ideas?

UPDATE: It seemed that something was wrong with the SNP event computation, so I tried passing --known-sites an elsites file compiled from dbSNP with elprep vcf-to-elsites, which didn't help.

caherzee commented 5 years ago

Hi,

Could you retry with the option --known-sites added to your call? This option supplies a list of known polymorphic sites for base quality score recalibration (--bqsr). Thanks!

pwwang commented 5 years ago

@caherzee Thanks for the quick reply. Yes, see the UPDATE at the end of my original post: I did add the --known-sites option, but with no luck. It ended with the same error.

caherzee commented 5 years ago

Hi,

Thanks!

pwwang commented 5 years ago

> elprep --version

 elprep version 4.1.1 compiled with go1.11.5 - see http://github.com/exascience/elprep for more information.

> bwa 

Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.7.15-r1140
Contact: Heng Li <lh3@sanger.ac.uk>

I think I found the problem.
I was dealing with sequencing data from a PDX model with mixed human tumor tissues and mouse tissues. So I aligned the reads against a combined reference genome, and removed the reads aligned to the mouse genome. However, when I tried to run elprep filter, I think I forgot to switch the reference back to a pure human reference genome.
Now everything works fine with the right reference genome.
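
For anyone hitting the same panic, a quick sanity check before running elprep filter is to compare the contigs declared in the SAM header with those in the reference. The following is only a minimal sketch, not part of elprep: the function names are hypothetical, and it assumes the plain FASTA file the .elprep reference was built from (it does not read the .elprep file itself) and a SAM file with @SQ header lines.

```python
from typing import Dict, Iterable, List


def sam_contigs(sam_lines: Iterable[str]) -> Dict[str, int]:
    """Map contig name -> declared length (LN) from the @SQ header lines."""
    contigs: Dict[str, int] = {}
    for line in sam_lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment record
        if not line.startswith("@SQ\t"):
            continue
        fields = line.rstrip("\r\n").split("\t")[1:]
        tags = dict(f.split(":", 1) for f in fields if ":" in f)
        if "SN" in tags and "LN" in tags:
            contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def fasta_contigs(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Map contig name -> sequence length, computed from a plain FASTA file."""
    contigs: Dict[str, int] = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # The contig name is the first word after '>'
            fields = line[1:].split()
            name = fields[0] if fields else None
            if name is not None:
                contigs[name] = 0
        elif name is not None:
            contigs[name] += len(line)
    return contigs


def header_mismatches(sam_lines: Iterable[str],
                      fasta_lines: Iterable[str]) -> List[str]:
    """List SAM contigs that are absent from the reference or differ in length."""
    reference = fasta_contigs(fasta_lines)
    problems = []
    for name, length in sam_contigs(sam_lines).items():
        if name not in reference:
            problems.append(f"{name}: in SAM header but not in reference")
        elif reference[name] != length:
            problems.append(
                f"{name}: SAM LN={length}, reference length={reference[name]}"
            )
    return problems
```

With files on disk, both arguments can be open file handles, since the functions only iterate over lines. An empty result does not guarantee that the run will succeed, but a non-empty one suggests the SAM file and the reference do not belong together.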

Thank you so much for your help! This is a life-saving tool for preparing call-ready BAM files, with multithreading available, compared to GATK!

caherzee commented 5 years ago

That is good to know :) Thank you for the kind words!