a2iEditing / RNAEditingIndexer

A tool for the calculation of RNA-editing index for RNA seq data

docker run #5

Open cottrellka opened 4 years ago

cottrellka commented 4 years ago

I'm getting several warnings and issues when running via Docker.

I used the following command to start the run:

$ docker run -v ~/Desktop/MB361:/data/bams:ro -v ~/Desktop/MB361:/data/output docker_rep/a2i_editing_index:ccle RNAEditingIndex -d /data/bams -l ~/Desktop/MB361/logs_dir -o ~/Desktop/MB361/cmpileups -os ~/Desktop/MB361/summary_dir -f bam --genome hg19

*I should add here that I think it would be helpful to add "-f bam" to the Docker.README
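One detail in the command above may be worth checking: the shell expands `~` on the host, so the `-l`, `-o` and `-os` arguments reach the container as `/Users/...` paths (the log further down shows exactly such a path), and those locations are not among the two mounted volumes. The sketch below builds the same command with container-side paths instead. It is only an illustration: the helper name `build_docker_command` is hypothetical, the flags are copied from the command above, and whether the tool expects container-side paths for these options is an assumption.

```python
import os
import shlex


def build_docker_command(host_dir, image, genome="hg19"):
    """Assemble a docker run argv whose tool arguments all use container paths.

    Hypothetical helper; flags mirror the command quoted in this issue.
    """
    host_dir = os.path.expanduser(host_dir)
    bams = "/data/bams"    # container-side mount for the input BAMs (read-only)
    out = "/data/output"   # container-side mount for logs and results
    return [
        "docker", "run",
        "-v", f"{host_dir}:{bams}:ro",
        "-v", f"{host_dir}:{out}",
        image,
        "RNAEditingIndex",
        "-d", bams,
        "-l", f"{out}/logs_dir",       # assumption: written under the output mount
        "-o", f"{out}/cmpileups",
        "-os", f"{out}/summary_dir",
        "-f", "bam",
        "--genome", genome,
    ]


command = build_docker_command("~/Desktop/MB361", "docker_rep/a2i_editing_index:ccle")
print(shlex.join(command))
```

With this layout, anything the tool writes to `/data/output/...` would appear under `~/Desktop/MB361` on the host.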

I am using a .bam file I downloaded from the GDC legacy portal: https://portal.gdc.cancer.gov/legacy-archive/files/9b88b1c7-7862-48dd-b914-922f1dc3e13d

I get the following upon starting the run:

***** WARNING: File /data/bams/MDAMB361.bam has inconsistent naming convention for record: 1 12107 12208 C1FGVACXX130122:7:2108:3543:55312/2 3 +

***** WARNING: File /data/bams/MDAMB361.bam has inconsistent naming convention for record: 1 12107 12208 C1FGVACXX130122:7:2108:3543:55312/2 3 +

Arguments in effect:
Input file : /Users/weberlab/Desktop/MB361/cmpileups/MDAMB361./MDAMB361._region_ucscHg19Alu.bed.gz_alignments.bam
Output file : /Users/weberlab/Desktop/MB361/cmpileups/MDAMB361./MDAMB361._region_ucscHg19Alu.bed.gz_alignments.bam.trimmed_5.bam
Bases to trim from each side : 5

Number of records read = 0
Number of records written = 0
[mpileup] 1 samples in 1 input files
GenerateIndex - Starting!
Running: bedtools getfasta -name -bedOut -fi '/bin/AEI/RNAEditingIndexer/Resources/Genomes/HomoSapiens/ucscHg19Genome.fa' -bed '/bin/AEI/RNAEditingIndexer/Resources/Regions/HomoSapiens/ucscHg19Alu.bed.gz'
GenerateIndex - Indexing FASTA Records!
GenerateIndex - encountered unexpected line from bedtools, skipping line. Line:index file /bin/AEI/RNAEditingIndexer/Resources/Genomes/HomoSapiens/ucscHg19Genome.fa.fai not found, generating...
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.lang.String.substring(String.java:1969)
at java.lang.String.subSequence(String.java:2003)
at java.util.regex.Pattern.split(Pattern.java:1216)
at java.lang.String.split(String.java:2380)
at java.lang.String.split(String.java:2422)
at EditingIndexJavaUtils.BEDGenomeIndexer.GenerateIndex(BEDGenomeIndexer.java:54)
at EditingIndexJavaUtils.BEDGenomeIndexer.main(BEDGenomeIndexer.java:117)
at EditingIndexJavaUtils.EditingIndexBEDUtils.main(EditingIndexBEDUtils.java:30)
[2019-10-30 15:56:56,934] EIPipelineManger ERROR Process: Editing_Index_PiplineMDAMB361.; Going To Error Step: Step-1 Failed To Run On /data/bams/MDAMB361.bam!
[2019-10-30 15:56:56,954] general_functions WARNING GGPSResources.general_functions.remove_files Failed To Remove 30-10-2019-15.cnf
[2019-10-30 15:56:56,954] general_functions WARNING GGPSResources.general_functions.remove_files Failed To Remove 30-10-2019-15.cnf

I imagine the first issue might be driving a lot of the subsequent issues. My guess is that the .bed file might have "chr1" for chromosome 1 while the .bam uses "1".
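The naming guess can be checked directly. The following is a minimal sketch, assuming the BAM header has been dumped to text (for example with `samtools view -H`) and the BED file has been decompressed to text. All function names are hypothetical and are not part of RNAEditingIndexer.

```python
def uses_chr_prefix(names):
    """True if every name starts with 'chr', False if none do, None if mixed or empty."""
    flags = {name.startswith("chr") for name in names}
    return flags.pop() if len(flags) == 1 else None


def contigs_from_sam_header(header_text):
    """Collect the SN: values from the @SQ lines of a SAM/BAM header."""
    contigs = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    contigs.append(field[3:])
    return contigs


def contigs_from_bed(bed_text):
    """Collect distinct first-column values, skipping comment/track/browser lines."""
    seen = []
    for line in bed_text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom = line.split("\t")[0]
        if chrom not in seen:
            seen.append(chrom)
    return seen


def naming_mismatch(header_text, bed_text):
    """True when one file consistently uses 'chr' prefixes and the other never does."""
    bam_style = uses_chr_prefix(contigs_from_sam_header(header_text))
    bed_style = uses_chr_prefix(contigs_from_bed(bed_text))
    return bam_style is not None and bed_style is not None and bam_style != bed_style
```

If `naming_mismatch` returns `True`, the region file and the alignments refer to chromosomes by different names, which would be consistent with zero records being read for the Alu regions.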

Thanks for your help.

shalomhillelroth commented 4 years ago

Hi,

Sorry for the delayed answer. This appears to be the driving exception: 'Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded'.

Do you have any memory (RAM) limits set for your Docker image?

Only the best,

mjsteinbaugh commented 4 years ago

@cottrellka I'm attempting to debug the RNAEditingIndexer Docker image also. Do you have a public image? See related issue #8 for what I'm seeing on my end with Docker.

cottrellka commented 4 years ago

@shalomhillelroth I didn't adjust the memory limits of the Docker image. This is my first time using Docker. A quick web search suggests that by default there are no memory limits (other than those of the system).
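The memory actually visible to a container can be measured from inside it instead of being inferred from defaults. To my understanding, Docker Desktop on macOS runs containers inside a Linux VM with its own memory setting, so the effective ceiling may be lower than the host's RAM even when no per-container limit is set. The sketch below reads the standard Linux cgroup and `/proc/meminfo` files. The function names are hypothetical, and it assumes Python is available in the container.

```python
from pathlib import Path

# Standard locations of the container memory limit on Linux.
CGROUP_LIMIT_FILES = (
    "/sys/fs/cgroup/memory.max",                    # cgroup v2
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",  # cgroup v1
)


def parse_cgroup_limit(raw):
    """Return the limit in bytes, or None when the value means 'no limit'."""
    raw = raw.strip()
    if raw == "max":  # cgroup v2 spelling of "unlimited"
        return None
    value = int(raw)
    # cgroup v1 reports a huge number (close to 2**63) when no limit is set.
    return None if value >= 1 << 60 else value


def parse_mem_total_bytes(meminfo_text):
    """Return MemTotal from /proc/meminfo content in bytes, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) * 1024  # the value is reported in KiB
    return None


def report():
    """Gather the cgroup limit and total memory visible to this process."""
    limit = None
    for name in CGROUP_LIMIT_FILES:
        path = Path(name)
        if path.exists():
            limit = parse_cgroup_limit(path.read_text())
            break
    meminfo = Path("/proc/meminfo")
    total = parse_mem_total_bytes(meminfo.read_text()) if meminfo.exists() else None
    return {"cgroup_limit_bytes": limit, "mem_total_bytes": total}
```

Calling `report()` inside the running container shows both numbers. A `mem_total_bytes` well below 16 GB would point to the VM setting as the source of the limit, not to a per-container limit.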

cottrellka commented 4 years ago

@mjsteinbaugh I didn't set it up to be public. I also know very little about Docker so I'm not sure how to go about that.

shalomhillelroth commented 4 years ago

Hi,

Sorry for the late reply. What is your system configuration? More specifically, how much RAM does your system have in total? (We might need to work around this problem by reducing the size of the genome FASTA.)

Best, Shalom Hillel Roth

cottrellka commented 4 years ago

@shalomhillelroth I am on a Mac (macOS High Sierra, 10.13.6), processor 2.8 GHz Intel Core i7, 16 GB RAM.