Errors thrown out in NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:DESEQ2_DIFFERENTIAL

CrazyHsu commented 11 months ago

Description of the bug

Hello, I am trying to find DEGs between different treatments within two tissues using the nf-core/differentialabundance pipeline, but I get errors with this command: `nextflow run nf-core/differentialabundance -r 1.4.0 --input samplesheet.csv --contrasts sample_contrast_file.csv --matrix star_salmon/salmon.merged.gene_counts.tsv --transcript_length_matrix star_salmon/salmon.merged.transcript_lengths.tsv --gtf Zea_mays.gtf --outdir deg_analysis -profile rnaseq,docker`. How can I fix this? Thanks!

A snapshot of the error follows, and a full .nextflow.log is attached in the Relevant files section:

Dec-06 02:23:58.547 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
Dec-06 02:23:58.547 [Task submitter] INFO  nextflow.Session - [f1/103041] Submitted process > NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:DESEQ2_DIFFERENTIAL ([id:MP_ck_pld3_knl2, variable:treatment, reference:ck_MP, target:mt_MP_pld3_knl2, blocking:])
Dec-06 02:23:58.549 [Task monitor] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:DESEQ2_DIFFERENTIAL ([id:MP_ck_pld3_mtl_knl2, variable:treatment, reference:ck_MP, target:mt_MP_pld3_mtl_knl2, blocking:]); work-dir=/data2/work/4b/71c2b327e7e3899e712551b109b818
  error [nextflow.exception.ProcessFailedException]: Process `NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:DESEQ2_DIFFERENTIAL ([id:MP_ck_pld3_mtl_knl2, variable:treatment, reference:ck_MP, target:mt_MP_pld3_mtl_knl2, blocking:])` terminated with an error exit status (1)
Dec-06 02:23:58.574 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:DESEQ2_DIFFERENTIAL ([id:MP_ck_pld3_mtl_knl2, variable:treatment, reference:ck_MP, target:mt_MP_pld3_mtl_knl2, blocking:])'

Caused by:
  Process `NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:DESEQ2_DIFFERENTIAL ([id:MP_ck_pld3_mtl_knl2, variable:treatment, reference:ck_MP, target:mt_MP_pld3_mtl_knl2, blocking:])` terminated with an error exit status (1)

Command executed [/home/crazyhsu/.nextflow/assets/nf-core/differentialabundance/./workflows/../modules/nf-core/deseq2/differential/templates/deseq_de.R]:

......

  converting counts to integer mode
  Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
    duplicate 'row.names' are not allowed
  Calls: read_delim_flexible -> read.delim -> read.table
  Execution halted

Work dir:
  /data2/work/4b/71c2b327e7e3899e712551b109b818

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
Dec-06 02:23:58.589 [Task monitor] INFO  nextflow.Session - Execution cancelled -- Finishing pending tasks before exit
Dec-06 02:23:58.610 [main] DEBUG nextflow.Session - Session await > all processes finished
Dec-06 02:23:58.642 [Actor Thread 30] DEBUG nextflow.sort.BigSort - Sort completed -- entries: 2; slices: 1; internal sort time: 0.024 s; external sort time: 0.002 s; total time: 0.026 s

......

My sample file is as follows:

sample	fastq_1	fastq_2	treatment
ck_anther_1	ck-1mm-anther-1_R1.fq.gz	ck-1mm-anther-1_R2.fq.gz	ck_anther
ck_anther_2	ck-1mm-anther-2_R1.fq.gz	ck-1mm-anther-2_R2.fq.gz	ck_anther
ck_MP_1	ck-mature-pollen-1_R1.fq.gz	ck-mature-pollen-1_R2.fq.gz	ck_MP
ck_MP_2	ck-mature-pollen-2_R1.fq.gz	ck-mature-pollen-2_R2.fq.gz	ck_MP
mt_anther_knl2_1	knl2-1mm-anther-1_R1.fq.gz	knl2-1mm-anther-1_R2.fq.gz	mt_anther_knl2
mt_anther_knl2_2	knl2-1mm-anther-2_R1.fq.gz	knl2-1mm-anther-2_R2.fq.gz	mt_anther_knl2
mt_anther_pld3_knl2_1	pld3-knl2-1mm-anther-1_R1.fq.gz	pld3-knl2-1mm-anther-1_R2.fq.gz	mt_anther_pld3_knl2
mt_anther_pld3_knl2_2	pld3-knl2-1mm-anther-2_R1.fq.gz	pld3-knl2-1mm-anther-2_R2.fq.gz	mt_anther_pld3_knl2
mt_anther_mtl_knl2_1	mtl-knl2-1mm-anther-1_R1.fq.gz	mtl-knl2-1mm-anther-1_R2.fq.gz	mt_anther_mtl_knl2
mt_anther_mtl_knl2_2	mtl-knl2-1mm-anther-2_R1.fq.gz	mtl-knl2-1mm-anther-2_R2.fq.gz	mt_anther_mtl_knl2
mt_MP_knl2_1	knl2-mature-pollen-1_R1.fq.gz	knl2-mature-pollen-1_R2.fq.gz	mt_MP_knl2
mt_MP_knl2_2	knl2-mature-pollen-2_R1.fq.gz	knl2-mature-pollen-2_R2.fq.gz	mt_MP_knl2
mt_MP_pld3_1	pld3-mature-pollen-1_R1.fq.gz	pld3-mature-pollen-1_R2.fq.gz	mt_MP_pld3
mt_MP_pld3_2	pld3-mature-pollen-2_R1.fq.gz	pld3-mature-pollen-2_R2.fq.gz	mt_MP_pld3
mt_MP_mtl_1	mtl-mature-pollen-1_R1.fq.gz	mtl-mature-pollen-1_R2.fq.gz	mt_MP_mtl
mt_MP_mtl_2	mtl-mature-pollen-2_R1.fq.gz	mtl-mature-pollen-2_R2.fq.gz	mt_MP_mtl
mt_MP_pld3_knl2_1	pld3-knl2-mature-pollen-1_R1.fq.gz	pld3-knl2-mature-pollen-1_R2.fq.gz	mt_MP_pld3_knl2
mt_MP_pld3_knl2_2	pld3-knl2-mature-pollen-2_R1.fq.gz	pld3-knl2-mature-pollen-2_R2.fq.gz	mt_MP_pld3_knl2
mt_MP_mtl_knl2_1	mtl-knl2-mature-pollen-1_R1.fq.gz	mtl-knl2-mature-pollen-1_R2.fq.gz	mt_MP_mtl_knl2
mt_MP_mtl_knl2_2	mtl-knl2-mature-pollen-2_R1.fq.gz	mtl-knl2-mature-pollen-2_R2.fq.gz	mt_MP_mtl_knl2
mt_MP_pld3_mtl_knl2_1	pld3-mtl-knl2-mature-pollen-1_R1.fq.gz	pld3-mtl-knl2-mature-pollen-1_R2.fq.gz	mt_MP_pld3_mtl_knl2
mt_MP_pld3_mtl_knl2_2	pld3-mtl-knl2-mature-pollen-2_R1.fq.gz	pld3-mtl-knl2-mature-pollen-2_R2.fq.gz	mt_MP_pld3_mtl_knl2

My contrast file is as follows:

id	variable	reference	target
anther_ck_knl2	treatment	ck_anther	mt_anther_knl2
anther_ck_mtl_knl2	treatment	ck_anther	mt_anther_mtl_knl2
anther_ck_pld3_knl2	treatment	ck_anther	mt_anther_pld3_knl2
MP_ck_knl2	treatment	ck_MP	mt_MP_knl2
MP_ck_pld3	treatment	ck_MP	mt_MP_pld3
MP_ck_mtl	treatment	ck_MP	mt_MP_mtl
MP_ck_pld3_knl2	treatment	ck_MP	mt_MP_pld3_knl2
MP_ck_mtl_knl2	treatment	ck_MP	mt_MP_mtl_knl2
MP_ck_pld3_mtl_knl2	treatment	ck_MP	mt_MP_pld3_mtl_knl2

The first 10 lines of my `salmon.merged.gene_counts.tsv`, generated using the `nf-core/rnaseq` pipeline, are as follows:

gene_id	gene_name	ck_anther_1	ck_anther_2	ck_MP_1	ck_MP_2	mt_anther_knl2_1	mt_anther_knl2_2	mt_anther_mtl_knl2_1	mt_anther_mtl_knl2_2	mt_anther_pld3_knl2_1	mt_anther_pld3_knl2_2	mt_MP_knl2_1	mt_MP_knl2_2	mt_MP_mtl_1	mt_MP_mtl_2	mt_MP_mtl_knl2_1	mt_MP_mtl_knl2_2	mt_MP_pld3_1	mt_MP_pld3_2	mt_MP_pld3_knl2_1	mt_MP_pld3_knl2_2	mt_MP_pld3_mtl_knl2_1	mt_MP_pld3_mtl_knl2_2
ENSRNA049437471	tRNA-Asn	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
ENSRNA049437473	tRNA-Thr	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
ENSRNA049437518	tRNA-Asn	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
ENSRNA049437607	tRNA-Met	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
ENSRNA049437614	tRNA-Gly	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
ENSRNA049437658	tRNA-Ala	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
ENSRNA049437881	tRNA-Ser	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
ENSRNA049437912	tRNA-Pro	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
ENSRNA049437967	tRNA-Lys	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0

Command used and terminal output

nextflow run nf-core/differentialabundance -r 1.4.0 --input samplesheet.csv --contrasts sample_contrast_file.csv --matrix star_salmon/salmon.merged.gene_counts.tsv --transcript_length_matrix star_salmon/salmon.merged.transcript_lengths.tsv --gtf Zea_mays.gtf --outdir deg_analysis -profile rnaseq,docker

Relevant files

A full .nextflow.log is attached here. nextflow.log

System information

No response

CrazyHsu commented 11 months ago

@pinin4fjords Hi, Manning. Can you help me figure out what problem I'm facing? Any help would be highly appreciated! Thanks.

pinin4fjords commented 11 months ago

The string `converting counts to integer mode` tells me that the count matrix was read correctly, so the error is coming from https://github.com/nf-core/differentialabundance/blob/a3d664c12c4050bae2acc83b1c636dcc3546b9a5/modules/nf-core/deseq2/differential/templates/deseq_de.R#L347.

Since the count matrix and the gene length matrix are read in exactly the same way, this suggests that the identifiers in your gene length matrix differ from those in the count matrix. Please check that your gene lengths file has the same values in its first two columns (gene_id, gene_name) as the count matrix.
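
One way to run that check outside the pipeline is sketched below in Python. This is only an illustration, not part of the pipeline: the function names `read_id_columns` and `compare_identifiers` are made up here, and it assumes both files are tab-separated with a header row and the identifiers in the first two columns. It also reports repeated values in the first column of the lengths file, since R's `read.table` raises `duplicate 'row.names' are not allowed` when the column used for row names contains repeats.

```python
import csv
from collections import Counter


def read_id_columns(path, n_cols=2):
    """Return the first n_cols fields of every data row in a tab-separated file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row
        return [tuple(row[:n_cols]) for row in reader if row]


def compare_identifiers(counts_path, lengths_path):
    """Compare the identifier columns of a count matrix and a length matrix.

    Hypothetical helper: reports first-column values that occur more than
    once in the lengths file, plus identifier pairs found in only one file.
    """
    counts_ids = read_id_columns(counts_path)
    lengths_ids = read_id_columns(lengths_path)

    first_column = Counter(row[0] for row in lengths_ids)
    return {
        "duplicated_in_lengths": sorted(k for k, n in first_column.items() if n > 1),
        "only_in_counts": sorted(set(counts_ids) - set(lengths_ids)),
        "only_in_lengths": sorted(set(lengths_ids) - set(counts_ids)),
    }
```

Calling `compare_identifiers("star_salmon/salmon.merged.gene_counts.tsv", "star_salmon/salmon.merged.transcript_lengths.tsv")` with the paths from the command above should return three empty lists if the two files agree on their identifiers; anything else points at the rows to inspect.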

CrazyHsu commented 11 months ago

Hi @pinin4fjords, thanks for your quick reply. I have now set --transcript_length_matrix to star_salmon/salmon.merged.gene_lengths.tsv instead of star_salmon/salmon.merged.transcript_lengths.tsv, and everything works. Thank you! :smile:
