bioinfo-biols / CIRI-cookbook

Document for CIRI-series software
https://ciri-cookbook.readthedocs.io/en/latest/index.html

MemoryError #21

Open Biocard opened 2 years ago

Biocard commented 2 years ago

My machine has 256 GB of memory; however, I got the following message:

[Tue 2022-06-21 07:11:12] [INFO ] Input reads: SRR9856190_1_val_1.fq.gz,SRR9856190_2_val_2.fq.gz
[Tue 2022-06-21 07:11:12] [INFO ] Library type: unstranded
[Tue 2022-06-21 07:11:12] [INFO ] Output directory: /home/biodata/GSE135055_CIRCRNA/SRR9856190_out, Output prefix: SRR9856190
[Tue 2022-06-21 07:11:12] [INFO ] Config: GRCh38 Loaded
[Tue 2022-06-21 07:11:12] [INFO ] 40 CPU cores availble, using 35
[Tue 2022-06-21 07:11:12] [INFO ] Align RNA-seq reads to reference genome ..
[Tue 2022-06-21 08:02:43] [INFO ] Estimate gene abundance ..
[Tue 2022-06-21 08:11:28] [INFO ] No circRNA information provided, run CIRI2 for junction site prediction ..
[Tue 2022-06-21 08:11:28] [INFO ] Running BWA-mem mapping candidate reads ..
[Tue 2022-06-21 08:24:53] [INFO ] Running CIRI2 for circRNA detection ..
[Tue 2022-06-21 09:10:32] [INFO ] Extract circular sequence
[Tue 2022-06-21 09:10:55] [100% ] [##################################################]
[Tue 2022-06-21 09:10:55] [INFO ] Building circular index ..
[Tue 2022-06-21 09:12:55] [INFO ] De novo alignment for circular RNAs ..
[Tue 2022-06-21 09:56:22] [INFO ] Detecting reads containing Back-splicing signals
[Tue 2022-06-21 09:56:56] [INFO ] Detecting FSJ reads from genome alignment file
Traceback (most recent call last):
  File "/home/justin/miniconda2/bin/CIRIquant", line 10, in <module>
    sys.exit(main())
  File "/home/justin/miniconda2/lib/python2.7/site-packages/CIRIquant/main.py", line 183, in main
    out_file = circ.proc(log_file, thread, bed_file, hisat_bam, rnaser_file, reads, outdir, prefix, anchor, lib_type)
  File "/home/justin/miniconda2/lib/python2.7/site-packages/CIRIquant/circ.py", line 656, in proc
    bsj_reads, fsj_reads = proc_genome_bam(hisat_bam, thread, circ_info, cand_bsj, anchor, circ_dir)
  File "/home/justin/miniconda2/lib/python2.7/site-packages/CIRIquant/circ.py", line 434, in proc_genome_bam
    tmp = job.get()
  File "/home/justin/miniconda2/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
MemoryError

How can I solve this problem?

Kevinzjy commented 2 years ago

Hi @Biocard, it might be caused by an enormous number of aligned reads being processed when counting FSJ reads. Does the number of FSJ reads matter for your analysis? Maybe I could add an option to disable the FSJ counting feature.
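For readers puzzled by why the traceback ends in `pool.py` rather than in the counting code: an exception raised inside a pool worker is stored on the result object and re-raised in the parent when `job.get()` is called. The sketch below only illustrates that mechanism. It is not CIRIquant's code; `count_fsj_reads` and `run_jobs` are hypothetical names, the `MemoryError` is simulated, and a `ThreadPool` stands in for the process pool to keep the example self-contained.

```python
from multiprocessing.pool import ThreadPool


def count_fsj_reads(region):
    """Hypothetical stand-in worker: simulates running out of memory."""
    raise MemoryError("simulated: too many reads held in memory for %s" % region)


def run_jobs(regions):
    """Submit one job per region and collect results or failure messages."""
    pool = ThreadPool(2)
    try:
        jobs = [pool.apply_async(count_fsj_reads, (r,)) for r in regions]
        results = []
        for job in jobs:
            try:
                # The worker's exception is re-raised here, in the caller.
                results.append(job.get())
            except MemoryError as err:
                results.append("failed: %s" % err)
        return results
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    for line in run_jobs(["chr1", "chr2"]):
        print(line)
```

The point is that the `MemoryError` originates in a worker, so the memory pressure comes from what each worker holds, not from the `get()` call itself.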

Biocard commented 2 years ago

Thanks for your help! I only need the final GTF file for quantitative analysis, so the FSJ count does not matter for my analysis.

Biocard commented 2 years ago

Hi @Kevinzjy , how could I add an option to disable the FSJ counting feature? It takes too long to count.

Kevinzjy commented 2 years ago

Commit bioinfo-biols/CIRIquant@b29976193ddd2205281ac4b2880fa19082bae67b includes the "--no-fsj" option to disable FSJ counting. Could you give it a try? You will have to clone the CIRIquant repository and install it from source. If it works fine, I'll add it to the next released version.
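Once that commit is installed, a run with FSJ counting disabled could be assembled roughly as follows. This is a sketch under assumptions: `build_ciriquant_cmd` is a hypothetical helper, the config path `GRCh38.yml` is a placeholder, and the read files, output directory, prefix, and thread count are copied from the log above. The script only prints the command; it does not run CIRIquant.

```python
def build_ciriquant_cmd(read1, read2, config, outdir, prefix, threads, skip_fsj=True):
    """Hypothetical helper: build the CIRIquant argument list."""
    cmd = [
        "CIRIquant",
        "-t", str(threads),   # number of threads
        "-1", read1,          # first mate of the paired-end reads
        "-2", read2,          # second mate of the paired-end reads
        "--config", config,   # YAML config (placeholder path below)
        "-o", outdir,         # output directory
        "-p", prefix,         # output prefix
    ]
    if skip_fsj:
        # Option introduced in the commit referenced above.
        cmd.append("--no-fsj")
    return cmd


if __name__ == "__main__":
    cmd = build_ciriquant_cmd(
        read1="SRR9856190_1_val_1.fq.gz",
        read2="SRR9856190_2_val_2.fq.gz",
        config="GRCh38.yml",  # assumption: replace with your own config file
        outdir="/home/biodata/GSE135055_CIRCRNA/SRR9856190_out",
        prefix="SRR9856190",
        threads=35,
    )
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.check_call(cmd)` to launch the run.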

Biocard commented 2 years ago

Hi @Kevinzjy! Thanks for your help! My CIRIquant now generates GTF files properly!

[Fri 2022-06-24 08:50:11] [INFO ] Input reads: SRR98561_1_val_1.fq.gz,SRR98561_2_val_2.fq.gz
[Fri 2022-06-24 08:50:11] [INFO ] Library type: unstranded
[Fri 2022-06-24 08:50:11] [INFO ] Output directory: /home/biodata/GSE135055_CIRCRNA/SRR98561_out, Output prefix: SRR9856191
[Fri 2022-06-24 08:50:11] [INFO ] Config: GRCh38 Loaded
[Fri 2022-06-24 08:50:11] [INFO ] 40 CPU cores availble, using 35
[Fri 2022-06-24 08:50:11] [INFO ] Align RNA-seq reads to reference genome ..
[Fri 2022-06-24 09:39:03] [INFO ] Estimate gene abundance ..
[Fri 2022-06-24 09:47:26] [INFO ] No circRNA information provided, run CIRI2 for junction site prediction ..
[Fri 2022-06-24 09:47:26] [INFO ] Running BWA-mem mapping candidate reads ..
[Fri 2022-06-24 10:00:24] [INFO ] Running CIRI2 for circRNA detection ..
[Fri 2022-06-24 10:30:09] [INFO ] Skipping FSJ reads extraction
[Fri 2022-06-24 10:30:09] [INFO ] Extract circular sequence
[Fri 2022-06-24 10:30:23] [100% ] [##################################################]
[Fri 2022-06-24 10:30:23] [INFO ] Building circular index ..
[Fri 2022-06-24 10:32:36] [INFO ] De novo alignment for circular RNAs ..
[Fri 2022-06-24 11:16:01] [INFO ] Detecting reads containing Back-splicing signals
[Fri 2022-06-24 11:16:29] [INFO ] Detecting FSJ reads from genome alignment file
[Fri 2022-06-24 11:23:45] [INFO ] Merge bsj and fsj results
[Fri 2022-06-24 11:23:45] [INFO ] Loading annotation gtf ..
[Fri 2022-06-24 11:24:00] [INFO ] Output circRNA expression values
[Fri 2022-06-24 11:24:18] [INFO ] circRNA Expression profile: SRR9856191.gtf
[Fri 2022-06-24 11:24:18] [INFO ] Finished!

Kevinzjy commented 2 years ago

That's great!