nf-core / scrnaseq

A single-cell RNAseq pipeline for 10X genomics data
https://nf-co.re/scrnaseq
MIT License

Samplesheet error in v2.2.0 when more than 3 fields are present #211

Closed davidecarlson closed 1 year ago

davidecarlson commented 1 year ago

Description of the bug

In v2.2.0 of the pipeline, the INPUT_CHECK:SAMPLESHEET_CHECK step will fail if the sample sheet has more than three fields (i.e., more than just the sample, fastq_1, and fastq_2 fields).

The same samplesheet works with v2.1.0. In addition, if I run cut -d "," -f1,2,3 to create a samplesheet with only the first three fields and use it with v2.2.0, the INPUT_CHECK:SAMPLESHEET_CHECK step completes successfully.
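For anyone who prefers a CSV-aware version of this workaround, the trimming step can be sketched in Python. This is only an illustration: the function name keep_first_columns and the example rows (including the extra expected_cells column) are hypothetical and not taken from the attached samplesheet.

```python
import csv
import io


def keep_first_columns(src, dst, n=3):
    """Copy CSV rows from src to dst, keeping only the first n fields."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow(row[:n])


# Hypothetical samplesheet with one extra column beyond sample, fastq_1, fastq_2.
example = io.StringIO(
    "sample,fastq_1,fastq_2,expected_cells\n"
    "S1,S1_R1.fastq.gz,S1_R2.fastq.gz,5000\n"
)
trimmed = io.StringIO()
keep_first_columns(example, trimmed)
print(trimmed.getvalue())
```

Unlike cut, the csv module handles quoted fields that contain commas, which may matter for samplesheets exported from a spreadsheet.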

Command used and terminal output

nextflow run nf-core/scrnaseq -r 2.2.0 -profile seawulf --input /gpfs/projects/GenomicsCore/fastqs/Dada-03-23/Micheli02020/samplesheet/samplesheet.csv --outdir ./Dada-0323-Micheli2020 --aligner cellranger --fasta /gpfs/software/cellranger-7.1.0/refdata-gex-GRCh38-2020-A/fasta/genome.fa --gtf /gpfs/software/cellranger-7.1.0/refdata-gex-GRCh38-2020-A/genes/genes.gtf --cellranger_index /gpfs/software/cellranger-7.1.0/refdata-gex-GRCh38-2020-A

...

executor >  slurm (2)
[20/fab4cc] process > NFCORE_SCRNASEQ:SCRNASEQ:INPUT_CHECK:SAMPLESHEET_CHECK (samplesheet.csv) [  0%] 0 of 1
[-        ] process > NFCORE_SCRNASEQ:SCRNASEQ:FASTQC_CHECK:FASTQC                             -
[2a/0bcc18] process > NFCORE_SCRNASEQ:SCRNASEQ:GTF_GENE_FILTER (genome.fa)                     [  0%] 0 of 1
[-        ] process > NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGER_ALIGN:CELLRANGER_COUNT               -
[-        ] process > NFCORE_SCRNASEQ:SCRNASEQ:MTX_CONVERSION:MTX_TO_H5AD                      -
[-        ] process > NFCORE_SCRNASEQ:SCRNASEQ:MTX_CONVERSION:CONCAT_H5AD                      -
[-        ] process > NFCORE_SCRNASEQ:SCRNASEQ:MTX_CONVERSION:MTX_TO_SEURAT                    -
[-        ] process > NFCORE_SCRNASEQ:SCRNASEQ:CUSTOM_DUMPSOFTWAREVERSIONS                     -
[-        ] process > NFCORE_SCRNASEQ:SCRNASEQ:MULTIQC                                         -
WARN: There's no process matching config selector: NFCORE_SCRNASEQ:SCRNASEQ:SCRNASEQ_ALEVIN:ALEVINQC
Error executing process > 'NFCORE_SCRNASEQ:SCRNASEQ:INPUT_CHECK:SAMPLESHEET_CHECK (samplesheet.csv)'

Caused by:
  Process `NFCORE_SCRNASEQ:SCRNASEQ:INPUT_CHECK:SAMPLESHEET_CHECK (samplesheet.csv)` terminated with an error exit status (1)

Command executed:

  check_samplesheet.py \
      samplesheet.csv \
      samplesheet.valid.csv

executor >  slurm (2)
[20/fab4cc] process > NFCORE_SCRNASEQ:SCRNASEQ:INPUT_CHECK:SAMPLESHEET_CHECK (samplesheet.csv) [100%] 1 of 1, failed: 1 ✘

Relevant files

Attaching my .nextflow.log file and the samplesheet.

nextflow.log samplesheet.csv

System information

Nextflow version: 22.10.7.5853
Hardware: HPC
Executor: Slurm
Container: Singularity
OS: Rocky Linux
nf-core/scrnaseq version: 2.2.0

fmalmeida commented 1 year ago

Hello, it seems that the new version introduced a small check in the Python script, but that check was using the wrong variable. Could you try the new branch I created to see if it solves the problem for you?
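The thread does not show the faulty check itself. Purely as an illustration, and not the actual code in check_samplesheet.py, a header check that tolerates extra fields might look like the sketch below. The names REQUIRED_COLUMNS, validate_header, and strict_header_ok are hypothetical; the required column names come from the bug description above.

```python
# Required fields named in the bug report; everything else is treated as optional.
REQUIRED_COLUMNS = ["sample", "fastq_1", "fastq_2"]


def validate_header(header):
    """Return the header if every required column is present, else raise ValueError."""
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError("Missing required columns: " + ", ".join(missing))
    return header


def strict_header_ok(header):
    """An over-strict check, for contrast: it rejects any header with extra fields."""
    return header == REQUIRED_COLUMNS


# Hypothetical header with one extra field.
header = ["sample", "fastq_1", "fastq_2", "expected_cells"]
print(validate_header(header))
print(strict_header_ok(header))
```

The first function checks that the required columns are a subset of the header, so additional fields pass through; the second shows the kind of equality comparison that would fail as soon as a fourth field is present.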

😄

davidecarlson commented 1 year ago

Thanks for your response!

I can confirm that the sample sheet check is now passing when I use the new branch.

Appreciate the help! Best, Dave

fmalmeida commented 1 year ago

Awesome. I will wrap up the PR so it can be merged.

fmalmeida commented 1 year ago

Merged to dev. I will now close the ticket. Please re-open it or create a new one if anything related to it appears.