Open esrice opened 3 weeks ago
Hi,
thanks for reporting.
Have you validated this by running gzip -cdf 3M-february-2018.txt on your file manually? The -f flag of gzip should already deal with non-compressed files.
Oh, weird. As you predicted, running that command manually works just fine. So I don't understand why the same command appears to fail inside the pipeline leaving it with an empty whitelist, or why my attempted fix (see PR) of only running gzip if the filename ends in ".gz" prevents this from happening. Do you have any ideas?
Can you try running it inside the cellranger container? Maybe it has a different version of gzip...
Ah yup that's the problem:
$ gzip -cdf /mnt/pixstor/data/esrbhb/3M-february-2018.txt # this works
$ singularity exec -B /mnt https://depot.galaxyproject.org/singularity/star:2.7.10b--h9ee0642_0 gzip -cdf /mnt/pixstor/data/esrbhb/3M-february-2018.txt
gzip: invalid magic
My system gzip is v1.9 but the container gzip is BusyBox v1.32.1.
ok, then your PR should fix this. Many thanks for checking!
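For anyone who hits the same "gzip: invalid magic" error: the workaround discussed above (only calling gzip when the file name ends in ".gz") could look roughly like the sketch below. The helper name cat_whitelist is made up for illustration, and this is not the code from the PR. It only assumes that gzip supports -c and -d, which both GNU gzip and the BusyBox applet do.

```shell
# Hypothetical helper (not taken from the pipeline or the PR):
# print a whitelist file to stdout, decompressing it only when
# the file name ends in ".gz". This avoids relying on `gzip -f`
# passing plain-text input through, which BusyBox gzip does not do.
cat_whitelist() {
    case "$1" in
        *.gz) gzip -cd "$1" ;;  # compressed: decompress to stdout
        *)    cat "$1" ;;       # anything else: pass through unchanged
    esac
}
```

In a bash script it could then stand in for the direct gzip call, e.g. --soloCBwhitelist <(cat_whitelist 3M-february-2018.txt). Checking the extension is a simplification: a gzipped file without a ".gz" suffix would be passed through still compressed.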
Description of the bug
I specified an un-gzipped whitelist file with the --barcode_whitelist parameter. In the STAR_ALIGN step, it tries to unzip this file, which causes gzip to crash, which causes the step to fail. This is the offending line of .command.sh:
I will try to fix and submit a PR in the next day or two.
Command used and terminal output
$ nextflow run nf-core/scrnaseq \
    -profile singularity \
    --input ../samples.csv \
    --fasta: ../../ref/bGalGal1b_modified.fa \
    --gtf: ../../ref/bGalGal1b_modified_filtered.gtf \
    --protocol 10XV3 \
    --aligner star \
    --outdir out \
    --barcode_whitelist /mnt/pixstor/data/esrbhb/3M-february-2018.txt \
    --save_reference
ERROR ~ Error executing process > 'NFCORE_SCRNASEQ:SCRNASEQ:STARSOLO:STAR_ALIGN (D2)'
Caused by: Process NFCORE_SCRNASEQ:SCRNASEQ:STARSOLO:STAR_ALIGN (D2) terminated with an error exit status (104)
Command executed:
STAR \
    --genomeDir star \
    --readFilesIn D2_S2_L001_R2_001.fastq.gz D2_S2_L001_R1_001.fastq.gz \
    --runThreadN 16 \
    --outFileNamePrefix D2. \
    --soloCBwhitelist <(gzip -cdf 3M-february-2018.txt) \
    --soloType CB_UMI_Simple \
    --soloFeatures Gene \
    --soloUMIlen 12 \
    \
    --sjdbGTFfile bGalGal1b_modified_genes.gtf \
    --outSAMattrRGline ID:D2 'SM:D2' \
    \
    --readFilesCommand zcat --runDirPerm All_RWX --outWigType bedGraph --twopassMode Basic --outSAMtype BAM SortedByCoordinate \

if [ -f D2.Unmapped.out.mate1 ]; then
    mv D2.Unmapped.out.mate1 D2.unmapped_1.fastq
    gzip D2.unmapped_1.fastq
fi
if [ -f D2.Unmapped.out.mate2 ]; then
    mv D2.Unmapped.out.mate2 D2.unmapped_2.fastq
    gzip D2.unmapped_2.fastq
fi
if [ -d D2.Solo.out ]; then
    # Backslashes still need to be escaped (https://github.com/nextflow-io/nextflow/issues/67)
fi
cat <<-END_VERSIONS > versions.yml
"NFCORE_SCRNASEQ:SCRNASEQ:STARSOLO:STAR_ALIGN":
    star: $(STAR --version | sed -e "s/STAR_//g")
END_VERSIONS
Command exit status: 104
Command output: (empty)
Command error:
INFO: Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
gzip: invalid magic
EXITING because of FATAL ERROR: CB whitelist file /dev/fd/63 is empty. SOLUTION: provide non-empty whitelist.
Oct 11 07:18:04 ...... FATAL ERROR, exiting
Work dir: /mnt/pixstor/warrenwc-lab/users/edward/nxf_work/1a/6d24d1b3b8d570f7e134a16d877d51
Tip: you can replicate the issue by changing to the process work dir and entering the command
bash .command.run
-- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting
-- Check '.nextflow.log' file for details
-[nf-core/scrnaseq] Pipeline completed with errors-
WARN: Killing running tasks (1)
Relevant files
No response
System information