sdparekh / zUMIs

zUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
GNU General Public License v3.0

BUG: segmentation fault during STAR mapping #363

Open LJK1991 opened 1 year ago

LJK1991 commented 1 year ago

Describe the bug I am trying to run zUMIs on a large cluster with SLURM. Anaconda is not allowed there, so I cannot use the built-in conda environment. Instead, we install the required software in our $HOME folder and load the software modules available on the server. I am running a small dataset (SRR6750041) that I have used before and now use as test data. STAR seems to hit a segmentation fault during the mapping stage, and I am not sure why. Have you seen this behaviour before? I cannot find anything similar on the STAR GitHub. The only STAR log I can find in the zUMIs output is Log.final.out, which is an empty file; there does not seem to be any other STAR log file. One other oddity: the genome index was generated with the same STAR that is loaded here, so I am not sure why the "STAR genome version is not the same" warning pops up.
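One way to look into the version warning is to print the version stored in the index next to the version of the loaded binary. The sketch below assumes that the index directory contains a `genomeParameters.txt` with a `versionGenome` line; the function names `genome_version` and `compare_star_versions` are hypothetical helpers, not part of zUMIs or STAR. It may also be that STAR records a genome format version there rather than the release that built the index, in which case the two strings could differ even for a fresh index; that is an assumption to verify, not something confirmed here.

```shell
#!/bin/bash
# Sketch: compare the running STAR version with the version recorded in an index.
# Assumption: the index directory holds a genomeParameters.txt with a
# "versionGenome" line; adjust if your STAR release writes something different.

# Print the second field of the versionGenome line, or nothing if it is absent.
genome_version() {
    local params_file="$1/genomeParameters.txt"
    [ -r "$params_file" ] || return 1
    awk '$1 == "versionGenome" { print $2; exit }' "$params_file"
}

# Report both versions side by side; arguments are index dir and STAR executable.
# Returns 0 only when the two version strings are identical.
compare_star_versions() {
    local index_dir="$1" star_exec="${2:-STAR}"
    local idx_ver bin_ver
    idx_ver="$(genome_version "$index_dir")" || { echo "no genomeParameters.txt in $index_dir"; return 2; }
    bin_ver="$("$star_exec" --version 2>/dev/null)" || { echo "cannot run $star_exec"; return 3; }
    echo "index: $idx_ver  binary: $bin_ver"
    [ "$idx_ver" = "$bin_ver" ]
}
```

With the modules from the sbatch script loaded, this could be called as `compare_star_versions /gpfs/work4/0/emse0442/data/genomes/mouse/GRCm39_zUMI STAR`.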

Kind regards, Lucas

To Reproduce The sbatch.sh

#!/bin/bash
#Set job requirements
#SBATCH -n 1
#SBATCH -t 2:00:00
#SBATCH -p thin
#SBATCH --mem=64G
#SBATCH --job-name=zUMI
#SBATCH --cpus-per-task=32

module load 2022
module load STAR/2.7.10a-GCC-11.3.0
module load SAMtools/1.15.1-GCC-11.3.0
module load HarfBuzz/4.2.1-GCCcore-11.3.0
module load FriBidi/1.0.12-GCCcore-11.3.0
module load R/4.2.1-foss-2022a

PROJECT="/gpfs/work4/0/emse0442/"

srun $HOME/scripts/zUMI.sh -y $HOME/metafiles/Rosenberg_zUMI.yaml

The YAML

project: Rosenberg_small
sequence_files:
  file1:
    name: /gpfs/work4/0/emse0442/data/Rosenberg/newfiles/SRR6750041_1.fastq.gz
    base_definition: cDNA(1-66)
  file2:
    name: /gpfs/work4/0/emse0442/data/Rosenberg/newfiles/SRR6750041_2.fastq.gz
    base_definition:
    - BC(11-18,49-56,87-94)
    - UMI(1-10)
reference:
  STAR_index: /gpfs/work4/0/emse0442/data/genomes/mouse/GRCm39_zUMI
  GTF_file: /gpfs/work4/0/emse0442/data/genomes/mouse/GRCm39/genomic.gtf
  additional_STAR_params: ''
  additional_files: ~
out_dir: /gpfs/work4/0/emse0442/data/Rosenberg/zUMI/test/
num_threads: 32
mem_limit: 64
filter_cutoffs:
  BC_filter:
    num_bases: 2
    phred: 20
  UMI_filter:
    num_bases: 1
    phred: 20
barcodes:
  barcode_num: 100
  barcode_file: /gpfs/home3/lucask/metafiles/allBC_Whitelist.txt
  automatic: no
  BarcodeBinning: 0
  nReadsperCell: 100
counting_opts:
  introns: yes
  downsampling: '0'
  strand: 0
  Ham_Dist: 0
  velocyto: no
  primaryHit: yes
  twoPass: no
make_stats: yes
which_Stage: Filtering
Rscript_exec: Rscript
STAR_exec: STAR
pigz_exec: pigz
samtools_exec: samtools
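Because the conda environment is not used here, the four executables named at the end of the YAML have to resolve through the loaded modules or `$PATH`. A minimal sketch for checking that inside the job, before zUMIs starts; `missing_execs` is a hypothetical helper name, and the check only tests that a command can be found, not that it is a compatible version:

```shell
#!/bin/bash
# Sketch: report which of the given executables cannot be found on PATH.
missing_execs() {
    local missing=() exe
    for exe in "$@"; do
        command -v "$exe" >/dev/null 2>&1 || missing+=("$exe")
    done
    # Print one missing name per line; return non-zero if any are missing.
    if [ "${#missing[@]}" -gt 0 ]; then
        printf '%s\n' "${missing[@]}"
        return 1
    fi
}
```

Placed after the `module load` lines, a call such as `missing_execs Rscript STAR pigz samtools || exit 1` would stop the job early if one of the tools is not available.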

Screenshots The stdout and stderr are readable here: slurm-2742904.txt

Desktop (please complete the following information):

NAME="Rocky Linux"
VERSION="8.7 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.7"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.7 (Green Obsidian)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-8"
ROCKY_SUPPORT_PRODUCT_VERSION="8.7"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.7"
LSB Version:    :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
Distributor ID: Rocky
Description:    Rocky Linux release 8.7 (Green Obsidian)
Release:        8.7
Codename:       GreenObsidian
