bioinfo-biols / CIRIquant

circular RNA quantification tools
https://sourceforge.net/projects/ciri/files/CIRIquant
MIT License

Issue with using find_circ prediction results for CIRIquant #30

Open prisca399 opened 3 years ago

prisca399 commented 3 years ago

I have predicted circRNAs in my samples using find_circ and am using the CIRIquant command to requantify them. This has been mostly successful so far; however, lately I have been running into the following error for certain sample files:

[Tue 2021-06-15 06:03:18] [INFO ] De novo alignment for circular RNAs ..
Error reading _rstarts[] array: 592600, 596640
Error: Encountered internal HISAT2 exception (#1)
Command: /gpfs/ycga/project/ycga/ygc/pio2/conda_envs/ciriquant/bin/hisat2-align-l --wrapper basic-0 -p 20 --dta -q -x /home/pio2/scratch60/2021_sharma/data/tmp/GC_tmp/CIRIquant_findcirc_quant/circ/103658-001-165_find_circ_index --read-lengths 151,150,136,141,138,133,143,135,139,137,140,131,127,122,134,125,142,132,128,120,116,126,130,119,123,118,117,114,113,111,129,112,108,124,121,106,109,110,149,115,105,107,104,102,100,101,103,95,81,97,90,71,92,99,98,96,94,88,89,87,61,63,31,62,58,56,47,46,41,32,86,80,68,35,147,144,93,84,82,69,67,66,51,45,39,38,34,33,148,91,79,77,54,53,30,145,85,75,57,37,36,83,76,74,59,52,43,40,70,65,64,60,49,42,146,78,73,72,50,48,44,55 -1 /tmp/6579.inpipe1 -2 /tmp/6579.inpipe2
(ERR): hisat2-align exited with value 1
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
[Tue 2021-06-15 06:03:24] [INFO ] Detecting reads containing Back-splicing signals

This is the command I am running:

CIRIquant -t 20 -1 103658-001-165_merged_R1_trimmed_paired.fastq.gz -2 103658-001-165_merged_R2_trimmed_paired.fastq.gz --config sharma_ciriquant_config.yml -o CIRIquant_findcirc_quant -p 103658-001-165_find_circ --circ findcirc_final_output/103658-001-165_CIRCULAR_splice_sites.bed -e CIRIquant_findcirc_quant/logs/103658-001-165_find_circ.log -l 2 --tool find_circ --bam hisat_bam_files/103658-001-165_hisat_align.bam

Do you have any insights into why this error is occurring and what I may do to address it? I can confirm that I was able to run find_circ on this data set (to generate the CIRCULAR_splice_sites.bed file) without error.

Thank you!

Kevinzjy commented 3 years ago

Hi @prisca399, could you check the maximum distance between the start and end coordinates of the input back-splice junctions (BSJs)?

I noticed that hisat2 died with "Error reading _rstarts[] array: 592600, 596640", which is an abnormally large distance for a circRNA locus. So filtering out circRNAs larger than 200,000 bp (as CIRI2 suggests) may resolve this error.
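
For anyone wanting to try this check, here is a minimal Python sketch that reports the largest BSJ span in a BED file and separates out entries above a threshold. It assumes the file is tab-separated with chrom, start and end in the first three columns, as in standard BED; the function names (`bsj_span`, `filter_bed`, `write_filtered`) are made up for illustration and are not part of CIRIquant or find_circ. Whether this filter actually resolves the HISAT2 error is not confirmed in this thread.

```python
MAX_SPAN = 200_000  # threshold suggested above; adjust as needed


def bsj_span(line):
    """Return end - start for a BED record, or None if the line is not a record."""
    if not line.strip() or line.startswith(("#", "track", "browser")):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return None
    try:
        # Assumption: column 2 is start, column 3 is end (standard BED layout).
        return int(fields[2]) - int(fields[1])
    except ValueError:
        return None


def filter_bed(lines, max_span=MAX_SPAN):
    """Return (kept, dropped, largest_span) for an iterable of BED lines.

    Header, comment and blank lines are passed through in `kept` unchanged.
    """
    kept, dropped, largest = [], [], 0
    for line in lines:
        span = bsj_span(line)
        if span is None:
            kept.append(line)
            continue
        largest = max(largest, span)
        if span <= max_span:
            kept.append(line)
        else:
            dropped.append(line)
    return kept, dropped, largest


def write_filtered(in_path, out_path, max_span=MAX_SPAN):
    """Read in_path, write records within max_span to out_path, return a summary."""
    with open(in_path) as handle:
        kept, dropped, largest = filter_bed(handle, max_span)
    with open(out_path, "w") as handle:
        handle.writelines(kept)
    return len(dropped), largest
```

Calling `write_filtered("103658-001-165_CIRCULAR_splice_sites.bed", "filtered.bed")` would return the number of dropped entries and the largest span seen (the output filename here is arbitrary); the filtered file could then be passed to `--circ` in place of the original.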