ncbi / sra-tools


disk-limit exeeded! fasterq-dump quit with error code 3 #921

Closed · brycemash closed this issue 6 months ago

brycemash commented 6 months ago

I am getting an error saying the disk limit is exceeded, but I have plenty of space both where I set the temp directory and in the cwd. Any help debugging this one would be greatly appreciated!

-bash:uger-r7-c001:/broad/dunnlab 1006 $ df -h
Filesystem                          Size  Used Avail Use% Mounted on
sodium-nfs:/ifs/data/broad/dunnlab   10T  8.1T  2.0T  81% /broad/dunnlab

(base) -bash:login04:~ 1118 $ cat SR62.e42628060
2024-03-26T04:20:16 prefetch.3.0.1: Current preference is set to retrieve SRA Normalized Format files with full base quality scores.
2024-03-26T04:20:16 prefetch.3.0.1: 1) Downloading 'SRR15808962'...
2024-03-26T04:20:16 prefetch.3.0.1: SRA Normalized Format file is being retrieved, if this is different from your preference, it may be due to current file availability.
2024-03-26T04:20:16 prefetch.3.0.1:  Downloading via HTTPS...
2024-03-26T04:42:20 prefetch.3.0.1:  HTTPS download succeed
2024-03-26T04:42:20 prefetch.3.0.1:   verifying 'SRR15808962'...
2024-03-26T04:44:48 prefetch.3.0.1:  'SRR15808962' is valid
2024-03-26T04:44:48 prefetch.3.0.1: 1) 'SRR15808962' was downloaded successfully
2024-03-26T04:44:48 prefetch.3.0.1: 'SRR15808962' has 0 unresolved dependencies
cursor-cache : 5,242,880 bytes
buf-size     : 1,048,576 bytes
mem-limit    : 52,428,800 bytes
threads      : 3
scratch-path : '/broad/dunnlab/BLM/meningioma_scRNA/Raleigh/fastq/SRR15808962/tmp/fasterq.tmp.ugerbm-c003.broadinstitute.org.32430/'
total ram    : 3,246,062,792,704 bytes
output-format: FASTQ split file
check-mode   : on
output-file  : '/broad/dunnlab/BLM/meningioma_scRNA/Raleigh/fastq/SRR15808962/SRR15808962.fastq'
output-dir   : '/broad/dunnlab/BLM/meningioma_scRNA/Raleigh/fastq/SRR15808962'
output       : '/broad/dunnlab/BLM/meningioma_scRNA/Raleigh/fastq/SRR15808962/SRR15808962.fastq'
append-mode  : 'NO'
stdout-mode  : 'NO'
seq-defline  : '@$ac.$si $sn length=$rl'
qual-defline  : '+$ac.$si $sn length=$rl'
only-unaligned : 'NO'
only-aligned   : 'NO'
accession     : 'SRR15808962'
accession-path: 'SRR15808962'
est. output          : 357,418,972,728 bytes
disk-limit input     : 250 bytes
disk-limit (OS)      : 2,100,904,853,504 bytes
disk-limit-tmp (OS)  : 2,100,904,853,504 bytes
out/tmp on same fs   : 'NO'

SRR15808962 is local
... has a size of 39,588,576,966 bytes
... is cSRA without alignments
... SEQ has NAME column = YES
... SEQ has SPOT_GROUP column = YES
... uses 'SEQUENCE' as sequence-table
SEQ.first_row = 1
SEQ.row_count = 337,825,116
SEQ.spot_count = 337,825,116
SEQ.total_base_count = 104,725,785,960
SEQ.bio_base_count = 51,011,592,516
SEQ.avg_name_len = 37
SEQ.avg_spot_group_len = 8
SEQ.avg_bio_reads_per_spot = 1
SEQ.avg_tech_reads_per_spot = 2
ALIGN.first_row = 0
ALIGN.row_count = 0
ALIGN.spot_count = 0
ALIGN.total_base_count = 0
ALIGN.bio_base_count = 0

disk-limit exeeded!
fasterq-dump quit with error code 3
# /broad/dunnlab/BLM/bash_scripts/SRR15808961_fasterq.sh
#!/bin/bash
cd /broad/dunnlab/BLM/meningioma_scRNA/Raleigh/fastq
mkdir SRR15808961 
cd SRR15808961

apptainer exec -B /broad/dunnlab/ \
      /broad/dunnlab/Docker_Container_Location/sra-tools.simg \
      prefetch SRR15808961 -X 9999999999999 -O /broad/dunnlab/BLM/meningioma_scRNA/Raleigh/fastq/SRR15808961

cd /broad/dunnlab/BLM/meningioma_scRNA/Raleigh/fastq/SRR15808961

apptainer exec -B /broad/dunnlab/ \
        /broad/dunnlab/Docker_Container_Location/sra-tools.simg \
        fasterq-dump \
        --threads 3 \
        --progress \
        --split-files \
        -t /broad/dunnlab/BLM/meningioma_scRNA/Raleigh/fastq/SRR15808961/tmp \
        -O /broad/dunnlab/BLM/meningioma_scRNA/Raleigh/fastq/SRR15808961  \
        --include-technical \
        --disk-limit 250GB \
        -x \
        SRR15808961

### to run the above .sh file ###
chmod +x /broad/dunnlab/BLM/bash_scripts/SRR15808961_fasterq.sh

qsub -N "SR61" \
  -l h_vmem=150G \
  -l h_rt=30:00:00 \
  -pe smp 3 \
  -binding linear:3 \
  "/broad/dunnlab/BLM/bash_scripts/SRR15808961_fasterq.sh"
wraetz commented 6 months ago

The problem stems from your disk-limit option, "--disk-limit 250GB". The tool interprets it as 250 bytes, as can be seen in your first captured output:

disk-limit input     : 250 bytes

Just omit it. It looks like the tool picks up the limits of your real drives correctly:

disk-limit (OS)      : 2,100,904,853,504 bytes
disk-limit-tmp (OS)  : 2,100,904,853,504 bytes

If these are not the correct numbers, turn off the size check with "--size-check off".
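
For anyone landing here with the same error, a minimal sketch of the corrected second step, assuming the same container and paths as in the script above, is simply the original fasterq-dump call with the --disk-limit option dropped (the --size-check option comes from the reply above and is only needed if the OS-reported limits are wrong):

apptainer exec -B /broad/dunnlab/ \
        /broad/dunnlab/Docker_Container_Location/sra-tools.simg \
        fasterq-dump \
        --threads 3 \
        --progress \
        --split-files \
        -t /broad/dunnlab/BLM/meningioma_scRNA/Raleigh/fastq/SRR15808961/tmp \
        -O /broad/dunnlab/BLM/meningioma_scRNA/Raleigh/fastq/SRR15808961 \
        --include-technical \
        -x \
        SRR15808961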

brycemash commented 6 months ago

Thank you!