ndaniel / fusioncatcher

Finder of Somatic Fusion Genes in RNA-seq data
GNU General Public License v3.0

Number of junctions to be inserted on the fly is larger than the limitSjdbInsertNsj=2000000 #132

Closed: heathergeiger closed this issue 5 years ago

heathergeiger commented 5 years ago

I'm getting the following error for a few of my samples.

I'm running FusionCatcher installed via the conda instructions, since I had trouble getting the Ensembl data to be read properly with the normal installation. So the version is 1.00, not 1.10.

ERROR: Workflow execution failed at step 402 while executing:

   STAR \
   --twopass1readsN -1 \
   --twopassMode Basic \
   --genomeSAindexNbases 14 \
   --sjdbOverhang 91 \
   --alignIntronMax 2486488 \
   --outFilterMatchNmin 16 \
   --outFilterMatchNminOverLread 0.2667 \
   --outFilterScoreMinOverLread 0.2667 \
   --alignSplicedMateMapLminOverLmate 0.2667 \
   --genomeDir /gpfs/internal/analysis/NYGC/Project_NYGC_14113_B01_AnalysisOnly/Sample_30/compbio/analysis/fusioncatcher_anaconda/gene-gene_split_star.fa.0_star/ \
   --runThreadN 4 \
   --seedSearchStartLmax 16 \
   --alignSJoverhangMin 16 \
   --alignSJstitchMismatchNmax 5 -1 5 5 \
   --outSJfilterOverhangMin 10 10 10 10 \
   --outSJfilterCountUniqueMin 1 1 1 1 \
   --outSJfilterCountTotalMin 1 1 1 1 \
   --outSJfilterDistToOtherSJmin 0 0 0 0 \
   --outSJfilterIntronMaxVsReadN 2486488 2486488 2486488 \
   --limitOutSAMoneReadBytes 100000000 \
   --scoreGapNoncan -4 \
   --scoreGapATAC -4 \
   --limitSjdbInsertNsj 2000000 \
   --readFilesIn /gpfs/internal/analysis/NYGC/Project_NYGC_14113_B01_AnalysisOnly/Sample_30/compbio/analysis/fusioncatcher_anaconda/reads_gene-gene_no-str_fixed.fq \
   --outFileNamePrefix /gpfs/internal/analysis/NYGC/Project_NYGC_14113_B01_AnalysisOnly/Sample_30/compbio/analysis/fusioncatcher_anaconda/gene-gene_split_star.fa.0_star-results/

Executing second time the same step/command in order to capture error messages (i.e. STDERR)...

Fatal LIMIT error: the number of junctions to be inserted on the fly =2071179 is larger than the limitSjdbInsertNsj=2000000
Fatal LIMIT error: the number of junctions to be inserted on the fly =2071179 is larger than the limitSjdbInsertNsj=2000000
SOLUTION: re-run with at least --limitSjdbInsertNsj 2071179

Aug 07 18:33:54 ...... FATAL ERROR, exiting

Other info:

Directory gene-gene_split_star.fa.0_star contains chrLength.txt, etc. as well as SA and SAindex. SA and SAindex are each over 1G in size.

Input FASTQ size to this step (reads_gene-gene_no-str_fixed.fq) is 1.6G.

Looking at gene-gene_split_star.fa.0_star-results/Log.progress.out, it includes the message "Finished 1st pass mapping". So the failure must have occurred during the 2nd pass.
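The check described above can be scripted when several samples need triaging. This is only a minimal sketch: the function names are hypothetical, and the only thing taken from the actual log is the marker string "Finished 1st pass mapping". It says nothing about the rest of the Log.progress.out layout.

```python
from pathlib import Path

# Marker string quoted from Log.progress.out in this report.
FIRST_PASS_MARKER = "Finished 1st pass mapping"


def first_pass_finished(progress_log_text: str) -> bool:
    """Return True if the log text shows that STAR's 1st pass completed."""
    return FIRST_PASS_MARKER in progress_log_text


def check_progress_file(path: str) -> bool:
    """Read a Log.progress.out file and report whether the 1st pass completed.

    Hypothetical helper: `path` is whatever results directory your run used.
    """
    text = Path(path).read_text(errors="replace")
    return first_pass_finished(text)
```

If this returns True for a failed sample, the failure happened after the first pass, which matches the junction-insertion error seen here.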

heathergeiger commented 5 years ago

For the record, I had two other samples that failed this way, just with different values for "the number of junctions to be inserted on the fly =" instead of 2071179. The values were never more than 2500000, though. If the limitSjdbInsertNsj parameter is hard-coded by FusionCatcher (which it seems it must be, since the STAR default is only 1,000,000, not 2,000,000), it seems it would only need to be increased slightly to accommodate these samples.
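To compare the failing samples, the required junction count can be pulled out of the STAR error text and turned into a candidate value for --limitSjdbInsertNsj. The sketch below is only an illustration: the function names, the 25% headroom, and the rounding step are assumptions, and this is not how FusionCatcher 1.10 resolved the problem.

```python
import re

# Pattern built from the error line quoted in this report.
LIMIT_ERROR = re.compile(
    r"the number of junctions to be inserted on the fly =(\d+) "
    r"is larger than the limitSjdbInsertNsj=(\d+)"
)


def parse_limit_error(stderr_text):
    """Return (junctions_needed, current_limit), or None if no error is found."""
    match = LIMIT_ERROR.search(stderr_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def suggest_limit(needed, headroom_percent=25, step=100_000):
    """Add headroom to `needed` and round up to a multiple of `step`.

    The headroom and step values are arbitrary assumptions, not STAR advice.
    Integer arithmetic is used so the rounding is exact.
    """
    scaled = needed * (100 + headroom_percent)
    return -(-scaled // (100 * step)) * step  # ceiling division


if __name__ == "__main__":
    message = (
        "Fatal LIMIT error: the number of junctions to be inserted on the fly "
        "=2071179 is larger than the limitSjdbInsertNsj=2000000"
    )
    needed, current = parse_limit_error(message)
    print(needed, current)        # 2071179 2000000
    print(suggest_limit(needed))  # 2600000
```

With these assumed settings the suggestion for this sample is 2600000, which would also cover the other failing samples, since none of them needed more than 2500000.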

heathergeiger commented 5 years ago

I see this was solved in 1.10 (https://github.com/ndaniel/fusioncatcher/issues/111). Closing this issue.