sdparekh / zUMIs

zUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
GNU General Public License v3.0

Error in alldt[[i]][[2]] : subscript out of bounds #377

Open MohamedAbdalfatah opened 8 months ago

MohamedAbdalfatah commented 8 months ago

Describe the bug

Hi, I'm running zUMIs on Smart-seq data. This is not the first time I have worked with it: I had two projects before and I'm using the same scripts. I got an error that seems to be an R error after mapping and quantification, and I don't know how to debug or solve it.

This is the log.error file

The following have been reloaded with a version change:
  1) GCC/10.2.0 => GCC/11.2.0
  2) GCCcore/10.2.0 => GCCcore/11.2.0
  3) HDF5/1.10.7-gompi-2020b => HDF5/1.12.1-gompi-2021b
  4) OpenMPI/4.0.5-GCC-10.2.0 => OpenMPI/4.1.1-GCC-11.2.0
  5) PMIx/3.1.5-GCCcore-10.2.0 => PMIx/4.1.0-GCCcore-11.2.0
  6) Szip/2.1.1-GCCcore-10.2.0 => Szip/2.1.1-GCCcore-11.2.0
  7) UCX/1.9.0-GCCcore-10.2.0 => UCX/1.11.2-GCCcore-11.2.0
  8) XZ/5.2.5-GCCcore-10.2.0 => XZ/5.2.5-GCCcore-11.2.0
  9) binutils/2.35-GCCcore-10.2.0 => binutils/2.37-GCCcore-11.2.0
 10) gompi/2020b => gompi/2021b
 11) hwloc/2.2.0-GCCcore-10.2.0 => hwloc/2.5.0-GCCcore-11.2.0
 12) libevent/2.1.12-GCCcore-10.2.0 => libevent/2.1.12-GCCcore-11.2.0
 13) libpciaccess/0.16-GCCcore-10.2.0 => libpciaccess/0.16-GCCcore-11.2.0
 14) libxml2/2.9.10-GCCcore-10.2.0 => libxml2/2.9.10-GCCcore-11.2.0
 15) numactl/2.0.13-GCCcore-10.2.0 => numactl/2.0.14-GCCcore-11.2.0
 16) zlib/1.2.11-GCCcore-10.2.0 => zlib/1.2.11-GCCcore-11.2.0

Warning message:
`as_quosure()` requires an explicit environment as of rlang 0.3.0.
Please supply `env`.
This warning is displayed once per session.
[bam_sort_core] merging from 120 files and 20 in-memory blocks...
Error in alldt[[i]][[2]] : subscript out of bounds
Calls: bindList
In addition: Warning message:
In parallel::mclapply(mapList, function(tt) { :
  scheduled core 1 did not deliver a result, all values of the job will be affected
Execution halted
Loading required package: yaml
Loading required package: Matrix
Error in gzfile(file, "rb") : cannot open the connection
Calls: rds_to_loom -> readRDS -> gzfile
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open compressed file 'outs/zUMIs_output/expression/SCMARATOCOV_01.dgecounts.rds', probable reason 'No such file or directory'
Execution halted
Error in gzfile(file, "rb") : cannot open the connection
Calls: readRDS -> gzfile
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open compressed file 'outs/zUMIs_output/expression/SCMARATOCOV_01.dgecounts.rds', probable reason 'No such file or directory'
Execution halted

This is the log output file

Using miniconda environment for zUMIs!
 note: internal executables will be used instead of those specified in the YAML file!

 You provided these parameters:
 YAML file:    zUMIs_SMARTseq2_generated.yaml
 zUMIs directory:               software/zUMIs
 STAR executable                STAR
 samtools executable            samtools
 pigz executable                pigz
 Rscript executable             Rscript
 RAM limit:   0
 zUMIs version 2.9.7e

Sat Oct 21 13:03:04 CEST 2023

Filtering...
Sat Oct 21 14:56:23 CEST 2023
[1] " reads were assigned to barcodes that do not correspond to intact cells."
Mapping...
[1] "2023-10-21 14:56:26 CEST"
Oct 21 14:56:34 ..... started STAR run
Oct 21 14:56:36 ..... loading genome
Oct 21 14:56:34 ..... started STAR run
Oct 21 14:56:36 ..... loading genome
Oct 21 14:56:34 ..... started STAR run
Oct 21 14:56:36 ..... loading genome
Oct 21 14:58:05 ..... processing annotations GTF
Oct 21 14:58:05 ..... processing annotations GTF
Oct 21 14:58:05 ..... processing annotations GTF
Oct 21 14:58:23 ..... inserting junctions into the genome indices
Oct 21 14:58:24 ..... inserting junctions into the genome indices
Oct 21 14:58:24 ..... inserting junctions into the genome indices
Oct 21 15:05:26 ..... started 1st pass mapping
Oct 21 15:05:26 ..... started 1st pass mapping
Oct 21 15:05:27 ..... started 1st pass mapping
Oct 21 16:32:24 ..... finished 1st pass mapping
Oct 21 16:32:24 ..... inserting junctions into the genome indices
Oct 21 16:34:13 ..... started mapping
Oct 21 16:53:20 ..... finished 1st pass mapping
Oct 21 16:53:21 ..... inserting junctions into the genome indices
Oct 21 16:55:07 ..... started mapping
Oct 21 17:09:04 ..... finished 1st pass mapping
Oct 21 17:09:05 ..... inserting junctions into the genome indices
Oct 21 17:10:59 ..... started mapping
Oct 21 18:59:17 ..... finished mapping
Oct 21 18:59:18 ..... finished successfully
Oct 21 19:33:59 ..... finished mapping
Oct 21 19:34:01 ..... finished successfully
Oct 21 19:52:23 ..... finished mapping
Oct 21 19:52:24 ..... finished successfully
Sat Oct 21 19:58:56 CEST 2023
Counting...
[1] "2023-10-21 19:59:08 CEST"
[1] "4.5e+08 Reads per chunk"
[1] "Loading reference annotation from:"
[1] "outs/SCMARATOCOV_01.final_annot.gtf"
[1] "Annotation loaded!"
[1] "Assigning reads to features (ex)"

        ==========     _____ _    _ ____  _____  ______          _____
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 1.32.4

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                           P SCMARATOCOV_01.filtered.tagged.Aligned.out ... ||
||                                                                            ||
||              Annotation : R data.frame                                     ||
||      Assignment details : <input_file>.featureCounts.bam                   ||
||                      (Note that files are saved to the output directory)   ||
||                                                                            ||
||      Dir for temp files : .                                                ||
||                 Threads : 20                                               ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||      Multimapping reads : counted                                          ||
||     Multiple alignments : primary alignment only                           ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : not counted                                      ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file .Rsubread_UserProvidedAnnotation_pid179832 ...        ||
||    Features : 358361                                                       ||
||    Meta-features : 62710                                                   ||
||    Chromosomes/contigs : 47                                                ||
||                                                                            ||
|| Process BAM file SCMARATOCOV_01.filtered.tagged.Aligned.out.bam...         ||
||    Paired-end reads are included.                                          ||
||    Assign alignments (paired-end) to features...                           ||
||    Total alignments : 921485000                                            ||
||    Successfully assigned alignments : 539338430 (58.5%)                    ||
||    Running time : 8.22 minutes                                             ||
||                                                                            ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

[1] "Assigning reads to features (in)"

        ==========     _____ _    _ ____  _____  ______          _____
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 1.32.4

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                           P SCMARATOCOV_01.filtered.tagged.Aligned.out ... ||
||                                                                            ||
||              Annotation : R data.frame                                     ||
||      Assignment details : <input_file>.featureCounts.bam                   ||
||                      (Note that files are saved to the output directory)   ||
||                                                                            ||
||      Dir for temp files : .                                                ||
||                 Threads : 20                                               ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||      Multimapping reads : counted                                          ||
||     Multiple alignments : primary alignment only                           ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : not counted                                      ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file .Rsubread_UserProvidedAnnotation_pid179832 ...        ||
||    Features : 238917                                                       ||
||    Meta-features : 28537                                                   ||
||    Chromosomes/contigs : 33                                                ||
||                                                                            ||
|| Process BAM file SCMARATOCOV_01.filtered.tagged.Aligned.out.bam.ex.fea ... ||
||    Paired-end reads are included.                                          ||
||    Assign alignments (paired-end) to features...                           ||
||    Total alignments : 921485000                                            ||
||    Successfully assigned alignments : 100342923 (10.9%)                    ||
||    Running time : 8.32 minutes                                             ||
||                                                                            ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

[1] "2023-10-21 20:17:12 CEST"
[1] "Coordinate sorting final bam file..."
[1] "2023-10-21 20:53:48 CEST"
[1] "Here are the detected subsampling options:"
[1] "Automatic downsampling"
[1] "Working on barcode chunk 1 out of 3"
[1] "Processing 85 barcodes in this chunk..."
[1] "Working on barcode chunk 2 out of 3"
[1] "Processing 279 barcodes in this chunk..."
Sat Oct 21 21:16:33 CEST 2023
[1] "loomR found"
Sat Oct 21 21:16:36 CEST 2023
Descriptive statistics...
[1] "I am loading useful packages for plotting..."
[1] "2023-10-21 21:16:36 CEST"
Sat Oct 21 21:16:43 CEST 2023

To Reproduce

This is the YAML file

project: SCMARATOCOV_01
sequence_files:
  file1:
    name: /fastq_dir/combined_fastqs/reads_for_zUMIs.R1.fastq.gz
    base_definition: cDNA(1-100)
  file2:
    name: fastq_dir/combined_fastqs/reads_for_zUMIs.R2.fastq.gz
    base_definition: cDNA(1-100)
  file3:
    name: fastq_dir/combined_fastqs/reads_for_zUMIs.index.fastq.gz
    base_definition: BC(1-8)
reference:
  STAR_index: references/human_star_ref
  GTF_file: references/Homo_sapiens.GRCh38.109.gtf
  additional_STAR_params: ''
  additional_files: ~
out_dir: /outs
num_threads: 20
mem_limit: 0
filter_cutoffs:
  BC_filter:
    num_bases: 1
    phred: 20
  UMI_filter:
    num_bases: 1
    phred: 20
barcodes:
  barcode_num: ~
  barcode_file: fastq_dir/combined_fastqs/reads_for_zUMIs.expected_barcodes.txt
  automatic: no
  BarcodeBinning: 0
  nReadsperCell: 100
counting_opts:
  introns: yes
  downsampling: '0'
  strand: 0
  Ham_Dist: 0
  velocyto: no
  primaryHit: yes
  twoPass: yes
make_stats: yes
which_Stage: Filtering
Rscript_exec: Rscript
STAR_exec: STAR
pigz_exec: pigz
samtools_exec: samtools

Could you please help me with this? Best, Mo

cziegenhain commented 8 months ago

Hi Mo,

Sorry to hear that you are encountering issues! To be honest, I am not 100% sure where this one comes from; however, I have seen some odd interactions of R dependencies within Slurm cluster runs. In any case, the precise error seems to relate to an unexpected shape of the data when combining the count output from several of the chunks. You could modify your RAM limit to force zUMIs to chunk up the data in a different way and see how that goes, e.g. mem_limit: 50
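
For anyone who wants to script that change, here is a minimal sketch of the edit. It assumes a YAML file with a top-level mem_limit key, like the one posted above. The temporary file name and its trimmed contents are hypothetical stand-ins for the real config, and the sed pattern is just one way to make the change, not something zUMIs provides.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the real YAML; only a few top-level keys are shown.
workdir="$(mktemp -d)"
yaml="$workdir/zUMIs_example.yaml"
cat > "$yaml" <<'EOF'
project: SCMARATOCOV_01
num_threads: 20
mem_limit: 0
which_Stage: Filtering
EOF

# Replace the value of the top-level mem_limit key with 50.
# -i.bak edits in place and keeps the original as <file>.bak (GNU and BSD sed).
sed -i.bak -E 's/^mem_limit:.*/mem_limit: 50/' "$yaml"

# Show the edited line so the change can be checked before rerunning zUMIs.
grep '^mem_limit:' "$yaml"
```

After the edit, zUMIs would be rerun with the modified YAML in the usual way; the backup file keeps the previous setting in case the run needs to be compared or reverted.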

MohamedAbdalfatah commented 8 months ago

Thank you, it is working with mem_limit: 50. Best

kvn95ss commented 5 months ago

Just curious, what was the fastq.gz file size?