nf-core / scrnaseq

A single-cell RNAseq pipeline for 10X genomics data
https://nf-co.re/scrnaseq
MIT License

2-pass STAR alignment needs to increase limitSjdbInsertNsj #176

Closed: tomsing1 closed this issue 1 year ago

tomsing1 commented 1 year ago

Description of the bug

I was running the workflow on snRNA-seq data with STARsolo as the aligner and the human reference genome (Gencode v42) as the index. The second pass of the alignment fails because the number of junctions to insert exceeds the limitSjdbInsertNsj limit, so that option needs to be increased (see the log below). Is it possible to pass arguments through to the STAR aligner so that I can modify this setting?

Command error:
    STAR --genomeDir star --readFilesIn 190409-B5-A_Broad_S1_L003_R2_001.fastq.gz,190409-B5-A_Broad_S1_L004_R2_001.fastq.gz 190409-B5-A_Broad_S1_L003_R1_001.fastq.gz,190409-B5-A_Broad_S1_L004_R1_001.fastq.gz --runThreadN 16 --outFileNamePrefix 190409-B5-A_Broad. --soloCBwhitelist /dev/fd/63 --soloType CB_UMI_Simple --soloUMIlen 12 --sjdbGTFfile GRCh38.primary_assembly.genome_genes.gtf --outSAMattrRGline ID:190409-B5-A_Broad SM:190409-B5-A_Broad --readFilesCommand zcat --runDirPerm All_RWX --outWigType bedGraph --twopassMode Basic --outSAMtype BAM SortedByCoordinate
    STAR version: 2.7.10a   compiled: 2022-01-14T18:50:00-05:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
  Nov 02 05:57:38 ..... started STAR run
  Nov 02 05:57:40 ..... loading genome
  Nov 02 05:58:49 ..... processing annotations GTF
  Nov 02 05:59:11 ..... inserting junctions into the genome indices
  Nov 02 06:00:47 ..... started 1st pass mapping
  Nov 02 07:09:10 ..... finished 1st pass mapping
  Fatal LIMIT error: the number of junctions to be inserted on the fly =1590539 is larger than the limitSjdbInsertNsj=1000000
  Fatal LIMIT error: the number of junctions to be inserted on the fly =1590539 is larger than the limitSjdbInsertNsj=1000000
  SOLUTION: re-run with at least --limitSjdbInsertNsj 1590539
  Nov 02 07:09:14 ...... FATAL ERROR, exiting

Command used and terminal output

No response

Relevant files

execution_log.txt

System information

No response

tomsing1 commented 1 year ago

Hmm, I think this could be handled with a custom config file, as outlined in the documentation for the nf-core/rnaseq workflow. Is that the way to go?

apeltzer commented 1 year ago

Yes, that would work and is the preferred way (at least for now). If the option turns out to be needed frequently, we can also add it as a "regular" pipeline option that users can specify.
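
For illustration, a minimal sketch of such a custom config follows. It rests on several assumptions that are not confirmed in this thread: that the STARsolo alignment process is named STAR_ALIGN in the pipeline version in use, that the pipeline passes its extra STAR flags through the nf-core `ext.args` convention, and that the trailing flags in the failing command above are the pipeline defaults (an override replaces them rather than appending, so they are repeated here). The file name `custom.config` and the value 2000000 are hypothetical; any value of at least 1590539 should satisfy the error message.

```groovy
// custom.config (hypothetical file name)
process {
    // Assumption: the alignment process is called STAR_ALIGN in this pipeline version.
    // Check the process name shown in the Nextflow log and adjust the selector if needed.
    withName: 'STAR_ALIGN' {
        // Setting ext.args here replaces whatever the pipeline sets by default,
        // so the flags seen in the failing command are repeated (assumed to be the defaults).
        ext.args = [
            '--readFilesCommand zcat',
            '--runDirPerm All_RWX',
            '--outWigType bedGraph',
            '--twopassMode Basic',
            '--outSAMtype BAM SortedByCoordinate',
            // Raised above the 1590539 junctions reported in the error; illustrative value.
            '--limitSjdbInsertNsj 2000000'
        ].join(' ')
    }
}
```

The file would then be supplied on the command line with Nextflow's `-c` option, for example `nextflow run nf-core/scrnaseq -c custom.config` followed by the usual pipeline parameters.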

tomsing1 commented 1 year ago

Many thanks for the quick confirmation, @apeltzer