humanlongevity / HLA

xHLA: Fast and accurate HLA typing from short read sequence data

Massive memory issue #13

Open bcal94 opened 7 years ago

bcal94 commented 7 years ago

Occasionally, even after running BAMs through get-reads-alt-unmap.sh, xHLA still uses upwards of 500 GB of RAM. Are there any further recommendations for preprocessing?

tanghaibao commented 7 years ago

@bcal94 Did the memory issue occur during get-reads-alt-unmap.sh or during the HLA step?

bcal94 commented 7 years ago

during the HLA step

tanghaibao commented 7 years ago

I have not yet seen this before - if you can pin down at which sub-step during HLA this occurs, it might help me diagnose. And please feel free to send me the screen output.
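One way to narrow this down is to run each sub-step under a small wrapper that records its peak memory. The following is only a sketch: `run_and_report` is a hypothetical helper, not part of xHLA, and the exact sub-step commands inside typer.sh are not listed in this thread, so you would need to substitute them yourself. It assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd to completion and return (exit_code, peak_rss_kb).

    Hypothetical helper for illustration; not part of xHLA.
    """
    completed = subprocess.run(cmd)
    # For RUSAGE_CHILDREN, ru_maxrss is the resident set size of the largest
    # waited-for descendant so far (kilobytes on Linux), not the sum over
    # the whole process tree.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


# Example with a trivial command; replace it with one sub-step at a time.
code, peak_kb = run_and_report([sys.executable, "-c", "pass"])
print("exit=%d peak_rss=%.1f MiB" % (code, peak_kb / 1024.0))
```

Because the value is a high-water mark across all children waited for so far, it is most informative when each sub-step is measured in a fresh Python process.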

bcal94 commented 6 years ago

[27/Dec/2017 19:34:46] INFO - Xie Chao's HLA typing algorithm
[27/Dec/2017 19:34:46] INFO - Sample_id: sliced_TCGA-BH-A0HL-01A-11W-A050-09_IlluminaGA-DNASeq_exome_gdc_realn.bam.filtered.bam Input file: sliced_TCGA-BH-A0HL-01A-11W-A050-09_IlluminaGA-DNASeq_exome_gdc_realn.bam.filtered.bam
typer.sh parameters: DELETE=false FULL=false
Extracting reads from S3
Aligning reads to IMGT database
processing FASTQ file
found 7560 reads
processing MSA file
found 26381 HLA exons
processing FASTQ file
matched to 21689 HLA exons
356 reads matched to HLA types not in the MSA file
translating matches to MSA
Typing
[1] 169323
[1] 169323
[1] 1949
[1] 249
[1] 132078
[1] 98658
[1] 73502
[1] 25156
dcast done dcast weight done weight set to 1 done set to 1 dcast2 done dcast2 lpsolve done lpsolve
[1] "A*02:01" "A*30:01" "B*15:50" "B*38:01" "C*07:01"
[6] "DPB1*02:01" "DQB1*06:02" "DQB1*06:03" "DRB1*13:01" "DRB1*14:01"
pulling non-core exons in DRB1*14:01 "DRB1*14:54"
B*18:01 B*18:01 B*18:01 B*18:01 B*18:01 B*18:01 B*18:01 B*18:01 "B*57:01" "B*57:02" "B*57:03" "B*57:05" "B*57:11" "B*13:04" "B*57:29" "B*57:04" B*18:01 B*18:01 "B*57:48" "B*57:71"
B*18:01 B*18:01 B*18:01 B*18:01 B*18:01 B*18:01 B*18:01 B*18:01 "B*57:01" "B*57:02" "B*57:03" "B*57:05" "B*57:11" "B*13:04" "B*57:29" "B*57:04" B*18:01 B*18:01 "B*57:48" "B*57:71"
Error in rbindlist(l, use.names, fill, idcol) :
  Item 4 of list input is not a data.frame, data.table or list
Calls: do.call -> <Anonymous> -> rbind -> <Anonymous> -> rbindlist
In addition: Warning message:
In mclapply(solution, function(s) { :
  scheduled cores 4 encountered errors in user code, all values of the jobs will be affected
Execution halted
Warning message:
system call failed: Cannot allocate memory
Traceback (most recent call last):
  File "/apps/xhla/10.04.2017/bin/run.py", line 64, in <module>
    check_call(bin_args)
  File "/apps/python/2.7.11/lib/python2.7/subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/apps/xhla/10.04.2017/bin/typer.sh', 'sliced_TCGA-BH-A0HL-01A-11W-A050-09_IlluminaGA-DNASeq_exome_gdc_realn.bam.filtered.bam', 'sliced_TCGA-BH-A0HL-01A-11W-A050-09_IlluminaGA-DNASeq_exome_gdc_realn.bam.filtered.bam']' returned non-zero exit status 1
77.49user 7.26system 2:09.88elapsed 65%CPU (0avgtext+0avgdata 311016maxresident)k
10544inputs+0outputs (64major+559183minor)pagefaults 0swaps

I'm running with 500 GB of RAM on a partial BAM that contains only chromosome 6 and has already been filtered with your provided script.
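For reference, the resource summary at the end of the log above looks like GNU time's default output, where the `maxresident` figure is in kilobytes. The sketch below just converts that number; `max_resident_kb` is a hypothetical helper name. Note that this figure reflects the largest single process that was waited for, not the total across all forked workers, so it is only a partial picture of what the run requested.

```python
import re

# Resource summary line copied from the log above.
TIME_LINE = ("77.49user 7.26system 2:09.88elapsed 65%CPU "
             "(0avgtext+0avgdata 311016maxresident)k")


def max_resident_kb(line):
    """Extract the maxresident value (kilobytes) from a GNU time summary line."""
    match = re.search(r"(\d+)maxresident\)k", line)
    if match is None:
        raise ValueError("no maxresident field found")
    return int(match.group(1))


peak_kb = max_resident_kb(TIME_LINE)
peak_gib = peak_kb / (1024.0 ** 2)  # KB -> GiB
print("%d KB = %.2f GiB" % (peak_kb, peak_gib))
```

Read this way, the recorded peak is roughly 0.3 GiB, so it may be worth checking whether the "Cannot allocate memory" message comes from a failed allocation or fork request rather than from resident memory actually reaching 500 GB. That is an assumption based only on the numbers posted here.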

tanghaibao commented 6 years ago

@bcal94 Thanks for sending this along - I'll take a look when I get back to work this week.

Haibao