nf-core / scrnaseq

A single-cell RNAseq pipeline for 10X genomics data
https://nf-co.re/scrnaseq
MIT License

Cell Ranger ARC null path value #389

Open ChristopherBarrington opened 3 weeks ago

ChristopherBarrington commented 3 weeks ago

Description of the bug

Similar to #374 perhaps, I get a Path value cannot be null error when running --aligner cellrangerarc.

My sample sheet is:

sample,fastq_1,fastq_2,fastq_barcode,sample_type
NestinGFP8w_3,inputs/fastq/20241018_LH00442_0061_A22NY2YLT3/fastq/AHM4688A49_S204_L008_R1_001.fastq.gz,inputs/fastq/20241018_LH00442_0061_A22NY2YLT3/fastq/AHM4688A49_S204_L008_R2_001.fastq.gz,,gex
NestinGFP8w_3,inputs/fastq/20241025_LH00442_0062_A22NTH3LT3/fastq/AHM4688A49_S22_L005_R1_001.fastq.gz,inputs/fastq/20241025_LH00442_0062_A22NTH3LT3/fastq/AHM4688A49_S22_L005_R2_001.fastq.gz,,gex
NestinGFP8w_3,inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L001_R1_001.fastq.gz,inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L001_R2_001.fastq.gz,inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L001_I2_001.fastq.gz,atac
NestinGFP8w_3,inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L002_R1_001.fastq.gz,inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L002_R2_001.fastq.gz,inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L002_I2_001.fastq.gz,atac

(It also confuses me that the entry with sample_type: gex includes the ATAC libraries too.)

An example of an entry in ch_fastq is below:

[DUMP] [
    {
        "id": "NestinGFP8w_3",
        "expected_cells": "",
        "seq_center": "",
        "sample_type": "gex",
        "feature_type": "",
        "single_end": "false"
    },
    [
        "/inputs/fastq/20241018_LH00442_0061_A22NY2YLT3/fastq/AHM4688A49_S204_L008_R1_001.fastq.gz",
        "/inputs/fastq/20241018_LH00442_0061_A22NY2YLT3/fastq/AHM4688A49_S204_L008_R2_001.fastq.gz",
        "/inputs/fastq/20241025_LH00442_0062_A22NTH3LT3/fastq/AHM4688A49_S22_L005_R1_001.fastq.gz",
        "/inputs/fastq/20241025_LH00442_0062_A22NTH3LT3/fastq/AHM4688A49_S22_L005_R2_001.fastq.gz",
        "/inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L001_R1_001.fastq.gz",
        "/inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L001_R2_001.fastq.gz",
        "/inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L002_R1_001.fastq.gz",
        "/inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L002_R2_001.fastq.gz"
    ]
]

Since (I think) my channel has too many files and a structure that is not expected by the CELLRANGERARC_COUNT module, I suspect I have misconfigured something?
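To make the observation above concrete, here is a minimal Python sketch of how grouping samplesheet rows by sample name alone would fold the gex and atac FASTQs into one entry, as in the dump, while grouping by sample and sample_type keeps them apart. This is not the pipeline's code: group_rows is a hypothetical helper, and the shortened file names stand in for the real paths.

```python
import csv
import io
from collections import defaultdict

# Shortened, hypothetical file names stand in for the real FASTQ paths.
SAMPLESHEET = """\
sample,fastq_1,fastq_2,fastq_barcode,sample_type
NestinGFP8w_3,gex_L008_R1.fastq.gz,gex_L008_R2.fastq.gz,,gex
NestinGFP8w_3,gex_L005_R1.fastq.gz,gex_L005_R2.fastq.gz,,gex
NestinGFP8w_3,atac_L001_R1.fastq.gz,atac_L001_R2.fastq.gz,atac_L001_I2.fastq.gz,atac
NestinGFP8w_3,atac_L002_R1.fastq.gz,atac_L002_R2.fastq.gz,atac_L002_I2.fastq.gz,atac
"""


def group_rows(rows, key_fields):
    """Collect fastq_1/fastq_2 paths per group, keyed by the given columns."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        groups[key].extend([row["fastq_1"], row["fastq_2"]])
    return dict(groups)


rows = list(csv.DictReader(io.StringIO(SAMPLESHEET)))

# Keyed on sample only: every library type ends up in a single entry.
by_sample = group_rows(rows, ["sample"])
# Keyed on sample and sample_type: gex and atac stay separate.
by_sample_and_type = group_rows(rows, ["sample", "sample_type"])

print(len(by_sample[("NestinGFP8w_3",)]))
for key, files in by_sample_and_type.items():
    print(key, len(files))
```

With the four rows above, the sample-only grouping yields a single entry of eight files, which matches the shape of the dumped channel element; the two-column grouping yields four files each for gex and atac. Neither grouping picks up the fastq_barcode column, which is also absent from the dump.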

Sorry for being so vague; I am at a loss as to what to try next.

Command used and terminal output

nextflow run nf-core/scrnaseq \
-revision 2.7.1 \
-config custom.config \
-profile crick \
-resume \
--input $sample_sheet \
--outdir outputs \
--aligner cellrangerarc \
--cellranger_index $index
-[nf-core/scrnaseq] Pipeline completed with errors-
WARN: Input tuple does not match tuple declaration in process `NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGERARC_ALIGN:CELLRANGERARC_COUNT` -- offending value: [[id:NestinGFP8w_3, expected_cells:null, seq_center:null, sample_type:gex, feature_type:null, single_end:false], [/inputs/fastq/20241018_LH00442_0061_A22NY2YLT3/fastq/AHM4688A49_S204_L008_R1_001.fastq.gz, /inputs/fastq/20241018_LH00442_0061_A22NY2YLT3/fastq/AHM4688A49_S204_L008_R2_001.fastq.gz, /inputs/fastq/20241025_LH00442_0062_A22NTH3LT3/fastq/AHM4688A49_S22_L005_R1_001.fastq.gz, /inputs/fastq/20241025_LH00442_0062_A22NTH3LT3/fastq/AHM4688A49_S22_L005_R2_001.fastq.gz, /inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L001_R1_001.fastq.gz, /inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L001_R2_001.fastq.gz, /inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L002_R1_001.fastq.gz, /inputs/fastq/20241008_LH00442_0059_A223NY3LT1/fastq/AHM4688A50_S3_L002_R2_001.fastq.gz]]
ERROR ~ Error executing process > 'NFCORE_SCRNASEQ:SCRNASEQ:CELLRANGERARC_ALIGN:CELLRANGERARC_COUNT (1)'

Caused by:
  Path value cannot be null

Relevant files

No response

System information

RPSeaman commented 2 weeks ago

I also just came across this issue, except at the FASTQC step. A couple of additional observations that may be helpful:

This warning was produced almost immediately upon launch:

WARN: Samplesheet warnings:
    The samplesheet contains following unchecked field(s): [fastq_barcode]
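
A warning of this kind can be reproduced by comparing the samplesheet header against a set of checked column names. The sketch below is only an illustration: ASSUMED_CHECKED is a guess built from the FASTQ columns plus the keys visible in the dumped meta map earlier in this thread, and the pipeline's real schema may differ.

```python
import csv
import io

# Assumption: the checked columns are the FASTQ columns plus the keys
# visible in the dumped meta map; the pipeline's real schema may differ.
ASSUMED_CHECKED = {
    "sample",
    "fastq_1",
    "fastq_2",
    "sample_type",
    "expected_cells",
    "seq_center",
    "feature_type",
}


def unchecked_fields(samplesheet_text):
    """Return header columns that are not in the assumed checked set."""
    header = next(csv.reader(io.StringIO(samplesheet_text)))
    return [column for column in header if column not in ASSUMED_CHECKED]


header_line = "sample,fastq_1,fastq_2,fastq_barcode,sample_type\n"
print(unchecked_fields(header_line))
```

Under that assumption, fastq_barcode is the only column reported for the header used in this issue, which matches the warning.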

The error looks the same, except that it is raised by FASTQC instead of CELLRANGERARC_COUNT:

ERROR ~ Error executing process > 'NFCORE_SCRNASEQ:SCRNASEQ:FASTQC_CHECK:FASTQC (4)'

Caused by:
  Path value cannot be null

My sample sheet looks the same, except that my reads use the alternative naming option for atac:

sample,R1,R3,R2,atac

instead of

sample,R1,R2,I2,atac

Thanks.