mozack / abra

Assembly Based ReAligner
MIT License

Requested array size exceeds VM limit #11

Closed joonlee3 closed 8 years ago

joonlee3 commented 9 years ago

Hi,

While running ABRA, I got the following error message. If I specify a larger amount of memory (e.g. 32GB), will this problem be resolved?

[main] CMD: bwa samse -n 1000 /scratch/TCGA-DD-A1EA-01A-11D-A12Z-10_Illumina_1410534136/clean_contigs.fasta /scratch/TCGA-DD-A1EA-01A-11D-A12Z-10_Illumina_1410534136/temp1/align_to_contig.sam.sai /scratch/TCGA-DD-A1EA-01A-11D-A12Z-10_Illumina_1410534136/temp1/original_reads.fastq.gz
[main] Real time: 3956.359 sec; CPU: 3197.467 sec
Stream thread done.
Stream thread done.
BWA time: 3956 seconds.
Clock time in Align to contigs: 19797
Sat Sep 13 00:24:30 EDT 2014 : Adjust reads
Sat Sep 13 00:24:30 EDT 2014 : Adjusting reads.
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
    at java.util.Arrays.copyOf(Arrays.java:2367)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
    at java.lang.StringBuilder.append(StringBuilder.java:132)
    at net.sf.samtools.SAMTextHeaderCodec.advanceLine(SAMTextHeaderCodec.java:128)
    at net.sf.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:83)
    at net.sf.samtools.SAMTextReader.readHeader(SAMTextReader.java:185)
    at net.sf.samtools.SAMTextReader.<init>(SAMTextReader.java:62)
    at net.sf.samtools.SAMTextReader.<init>(SAMTextReader.java:71)
    at net.sf.samtools.SAMFileReader.init(SAMFileReader.java:556)
    at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:167)
    at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:122)
    at abra.ReadAdjuster.adjustReads(ReadAdjuster.java:55)
    at abra.AdjustReadsRunnable.go(AdjustReadsRunnable.java:37)
    at abra.AbraRunnable.run(AbraRunnable.java:19)
    at java.lang.Thread.run(Thread.java:745)

mozack commented 9 years ago

Could you please post the output of:

wc -l /scratch/TCGA-DD-A1EA-01A-11D-A12Z-10_Illumina_1410534136/clean_contigs.fasta
ls -lh /scratch/TCGA-DD-A1EA-01A-11D-A12Z-10_Illumina_1410534136/

joonlee3 commented 9 years ago

I ran your commands on another failed job. Below is the output. If you have any insights or suggestions, please let me know.

Thanks, Joon

wc -l /scratch/TCGA-BC-A216-01A-11D-A152-10_Illumina_TCGA-BC-A216-11A-11D-A152-10_Illumina_1410987440/clean_contigs.fasta

24965046 /scratch/TCGA-BC-A216-01A-11D-A152-10_Illumina_TCGA-BC-A216-11A-11D-A152-10_Illumina_1410987440/clean_contigs.fasta

ls -lh /scratch/TCGA-BC-A216-01A-11D-A152-10_Illumina_TCGA-BC-A216-11A-11D-A152-10_Illumina_1410987440/

drwxrwxrwx 3 ................ 4.0K Sep 17 23:38 sorttmp/
drwxr-xr-x 2 ................ 4.0K Sep 18 06:14 temp1/
drwxr-xr-x 2 ................ 4.0K Sep 18 06:14 temp2/
drwxr-xr-x 2 ................ 4.0K Sep 17 16:57 unaligned/
-rw-r--r-- 1 ................ 22 Sep 18 03:21 clean_contigs.fasta.amb
-rw-r--r-- 1 ................ 2.0G Sep 18 03:21 clean_contigs.fasta.ann
-rw-r--r-- 1 ................ 2.2M Sep 17 23:38 all_contigs_chimsorted.bai
-rw-r--r-- 1 ................ 232M Sep 17 23:32 all_contigs_chim.bam
-rw-r--r-- 1 ................ 310M Sep 18 00:07 all_contigs_chimchopped.bam
-rw-r--r-- 1 ................ 234M Sep 17 23:38 all_contigs_chimsorted.bam
-rw-r--r-- 1 ................ 4.6G Sep 18 03:18 clean_contigs.fasta.bwt
-rw-r--r-- 1 ................ 5.2G Sep 17 18:55 all_contigs.fasta
-rw-r--r-- 1 ................ 6.3G Sep 18 00:11 clean_contigs.fasta
-rw-r--r-- 1 ................ 0 Sep 17 17:05 svcontigs.fasta
-rw-r--r-- 1 ................ 1.2G Sep 18 03:21 clean_contigs.fasta.pac
-rw-r--r-- 1 ................ 2.3G Sep 18 03:34 clean_contigs.fasta.sa
-rw-r--r-- 1 ................ 6.3G Sep 17 19:36 all_contigs_inregion.sam
-rw-r--r-- 1 ................ 6.3G Sep 17 19:32 all_contigs.sam
-rw-r--r-- 1 ................ 900K Sep 17 16:57 libAbra.so



mozack commented 9 years ago

I suspect this particular sample may not have failed in the same way.

The original problem appears to be caused by too many assembled contigs. You may need to experiment with more aggressive graph pruning settings (or try to identify and exclude subregions that are particularly noisy). See the --mbq and --mnf options.

For the second case, could I trouble you to compress the log file and send it to lmose at unc dot edu?

mozack commented 9 years ago

Just to clarify, more RAM won't correct that first problem.
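
To illustrate why extra heap does not help with the first failure: the stack trace shows the SAM header text being appended to a single StringBuilder, and the array backing a StringBuilder cannot hold more than Integer.MAX_VALUE elements regardless of the -Xmx setting. The Java sketch below is not part of ABRA or samtools. The class and method names are hypothetical, and it assumes the header holds one "@SQ<TAB>SN:<name><TAB>LN:<length>" line per contig, with the contig name cut at the first whitespace. It only estimates how much header text a contig FASTA would produce and compares that to the array limit.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
final class SamHeaderSizeEstimate {

    // Characters in one assumed "@SQ\tSN:<name>\tLN:<length>\n" header line.
    static long sqLineChars(String name, long length) {
        return "@SQ\tSN:".length() + name.length()
                + "\tLN:".length() + Long.toString(length).length()
                + 1; // trailing newline
    }

    // Sums the estimated @SQ line sizes over every record in a FASTA stream.
    static long estimateHeaderChars(Reader fasta) {
        try {
            BufferedReader in = new BufferedReader(fasta);
            long total = 0;
            String name = null;   // name of the record being read, if any
            long length = 0;      // bases seen so far for that record
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith(">")) {
                    if (name != null) {
                        total += sqLineChars(name, length);
                    }
                    // Assumption: the name ends at the first whitespace.
                    name = line.substring(1).split("\\s+", 2)[0];
                    length = 0;
                } else {
                    length += line.trim().length();
                }
            }
            if (name != null) {
                total += sqLineChars(name, length);
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A Java array (and so a StringBuilder) holds at most Integer.MAX_VALUE elements.
    static boolean fitsInOneArray(long chars) {
        return chars <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SamHeaderSizeEstimate <contigs.fasta>");
            return;
        }
        try (Reader r = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.ISO_8859_1)) {
            long chars = estimateHeaderChars(r);
            System.out.printf("estimated header chars: %d (array limit %d), fits: %b%n",
                    chars, Integer.MAX_VALUE, fitsInOneArray(chars));
        }
    }
}
```

Saved as SamHeaderSizeEstimate.java, this can be compiled with javac and run against a contig FASTA such as clean_contigs.fasta. Because a StringBuilder grows by roughly doubling its buffer, the error can appear before the estimate reaches the hard limit, so a result anywhere near that limit already suggests pruning contigs rather than adding memory.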