GELOG / docker-ubuntu-avocado

(Genomics) Dockerfile for running Avocado - http://bdgenomics.org
Apache License 2.0

OutOfMemoryError #2

Open flangelier opened 9 years ago

flangelier commented 9 years ago

Error:

2015-03-22 17:29:25 ERROR Executor:96 - Exception in task 0.0 in stage 1.0 (TID 16) java.lang.OutOfMemoryError: GC overhead limit exceeded

Steps to reproduce:

docker run -ti --rm --name client-genomics -v /data:/data gelog/avocado /bin/bash
avocado-submit /data/SRR062634.adam /data/chr1.fa /data/SRR062634.avr /usr/local/avocado/avocado-sample-configs/basic.properties

You can get the files by following the steps from:

sebastienbonami commented 9 years ago

Have you tried reducing the executor and driver memory in Spark? See: https://github.com/GELOG/adamcloud/issues/11#issuecomment-85769275
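For anyone following along, here is a minimal sketch of one way to set those two values: generating a `spark-defaults.conf` fragment with Spark's `spark.driver.memory` and `spark.executor.memory` properties. The `spark_memory_conf` helper and the `2g` values are illustrative assumptions, and whether the gelog/avocado image picks up a custom `SPARK_CONF_DIR` has not been checked in this thread.

```shell
#!/bin/sh
# Print a spark-defaults.conf fragment that pins driver and executor memory.
# spark_memory_conf is a hypothetical helper, not part of Spark or Avocado.
#   $1 = driver memory (e.g. 2g), $2 = executor memory (e.g. 2g)
spark_memory_conf() {
    printf 'spark.driver.memory %s\n' "$1"
    printf 'spark.executor.memory %s\n' "$2"
}

# Assumed values: tune them to the memory actually available to the container.
spark_memory_conf 2g 2g
```

Redirecting the output to `"$SPARK_CONF_DIR/spark-defaults.conf"` is the usual way to have `spark-submit` read it, assuming the wrapper script does not override those settings.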

davidonlaptop commented 9 years ago

How did you generate SRR062634.adam?

flangelier commented 9 years ago

In order, you have to:

Getting the index

mkdir -p /data/
wget -O /data/chr1.fa.gz http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chr1.fa.gz
gzip -d /data/chr1.fa.gz

Indexing with snap

docker run --rm=true -ti -v /data:/data gelog/snap index /data/chr1.fa /data/snap-index.chr1

Getting the chromosome

wget -O /data/SRR062634.filt.fastq.gz ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG00096/sequence_read/SRR062634.filt.fastq.gz
gzip -d /data/SRR062634.filt.fastq.gz

Aligning the chromosome

docker run --rm=true -ti -v /data:/data gelog/snap single /data/snap-index.chr1/ /data/SRR062634.filt.fastq -o /data/SRR062634.sam

Running adam

docker run --rm=true -ti -v /data/:/data gelog/adam adam-submit transform /data/SRR062634.sam /data/SRR062634.adam

You should now have all the needed files
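The steps above can be collected into one script. This is only a sketch: the commands are copied from this comment, while `DATA_DIR`, `DRY_RUN`, and the `run` and `prepare_inputs` helpers are illustrative additions. It defaults to a dry run that only prints each command, so nothing is downloaded or started until `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of the preparation steps as one script (helpers are hypothetical).
set -e

DATA_DIR="${DATA_DIR:-/data}"   # host directory mounted as /data in the containers
DRY_RUN="${DRY_RUN:-1}"         # 1 = only print the commands, 0 = execute them

# Print the command, and execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

prepare_inputs() {
    # Getting the index
    run mkdir -p "$DATA_DIR"
    run wget -O "$DATA_DIR/chr1.fa.gz" http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chr1.fa.gz
    run gzip -d "$DATA_DIR/chr1.fa.gz"
    # Indexing with snap
    run docker run --rm=true -ti -v "$DATA_DIR:/data" gelog/snap index /data/chr1.fa /data/snap-index.chr1
    # Getting the reads
    run wget -O "$DATA_DIR/SRR062634.filt.fastq.gz" ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG00096/sequence_read/SRR062634.filt.fastq.gz
    run gzip -d "$DATA_DIR/SRR062634.filt.fastq.gz"
    # Aligning with snap
    run docker run --rm=true -ti -v "$DATA_DIR:/data" gelog/snap single /data/snap-index.chr1/ /data/SRR062634.filt.fastq -o /data/SRR062634.sam
    # Running adam
    run docker run --rm=true -ti -v "$DATA_DIR:/data" gelog/adam adam-submit transform /data/SRR062634.sam /data/SRR062634.adam
}

prepare_inputs
```

With `set -e`, a real run (`DRY_RUN=0`) stops at the first failing step instead of feeding a missing file to the next one. The `-ti` flags are kept from the original commands, so a real run needs a terminal attached.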

davidonlaptop commented 9 years ago

The SRR062634.filt.fastq.gz file contains the bogus reads that were filtered out after quality control.

Try to feed this file to ADAM: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG00096/alignment/HG00096.chrom20.ILLUMINA.bwa.GBR.low_coverage.20120522.bam
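A rough sketch of what that could look like, reusing the `adam-submit transform` invocation from earlier in this thread. The output name (the BAM name with an `.adam` extension) is an assumption, and the snippet only prints the two commands rather than running them.

```shell
#!/bin/sh
# URL taken from the comment above.
BAM_URL="ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG00096/alignment/HG00096.chrom20.ILLUMINA.bwa.GBR.low_coverage.20120522.bam"

# Derive local file names from the URL.
BAM_FILE="${BAM_URL##*/}"          # strip everything up to the last slash
ADAM_OUT="${BAM_FILE%.bam}.adam"   # hypothetical output name

# Print the commands instead of running them.
echo "wget -O /data/$BAM_FILE $BAM_URL"
echo "docker run --rm=true -ti -v /data/:/data gelog/adam adam-submit transform /data/$BAM_FILE /data/$ADAM_OUT"
```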