Error in dmDSdata(counts = counts, samples = coldata)

Riczury commented 1 month ago

Operating System

Windows 10

Other Linux

No response

Workflow Version

v1.2.1

Workflow Execution

EPI2ME Desktop (Local)

Other workflow execution

No response

EPI2ME Version

v5.1.14

CLI command run

epi2me-labs/wf-transcriptomes v1.2.1.

Workflow Execution - CLI Execution Profile

standard (default)

What happened?

I want to do a differential expression analysis but I can't because it gives me this error.

Relevant log output

N E X T F L O W  ~  version 23.04.2
Launching `/mnt/c/Users/juanj/epi2melabs/workflows/epi2me-labs/wf-transcriptomes/main.nf` [romantic_brown] DSL2 - revision: 3ecc85aafe
||||||||||   _____ ____ ___ ____  __  __ _____      _       _
||||||||||  | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||       |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||       | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||  |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||  wf-transcriptomes v1.2.1
--------------------------------------------------------------------------------
Core Nextflow options
  runName             : romantic_brown
  containerEngine     : docker
  launchDir           : /mnt/c/Users/juanj/epi2melabs/instances/wf-transcriptomes_01J3GV6PZZKAF5QEH57S1PFWVA
  workDir             : /mnt/c/Users/juanj/epi2melabs/instances/wf-transcriptomes_01J3GV6PZZKAF5QEH57S1PFWVA/work
  projectDir          : /mnt/c/Users/juanj/epi2melabs/workflows/epi2me-labs/wf-transcriptomes
  userName            : epi2mewsl
  profile             : standard
  configFiles         : /mnt/c/Users/juanj/epi2melabs/workflows/epi2me-labs/wf-transcriptomes/nextflow.config
Input Options
  fastq               : /mnt/c/Users/juanj/Desktop/maestria ric/Mezclados/RNAseq
  transcriptome_source: precomputed
  ref_transcriptome   : /mnt/c/Users/juanj/Desktop/maestria ric/RNAseq_transcriptome.fas
  ref_annotation      : /mnt/c/Users/juanj/Desktop/maestria ric/genomic.gff3
  direct_rna          : true
Output Options
  out_dir             : /mnt/c/Users/juanj/epi2melabs/instances/wf-transcriptomes_01J3GV6PZZKAF5QEH57S1PFWVA/output
Sample Options
  sample_sheet        : /mnt/c/Users/juanj/Desktop/maestria ric/Sample_sheet.csv
Gene Fusion Detection Options
  jaffal_dir          : /mnt/c/home/epi2melabs/JAFFA
Differential Expression Options
  de_analysis         : true
!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-transcriptomes for your analysis please cite:
* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x
--------------------------------------------------------------------------------
This is epi2me-labs/wf-transcriptomes v1.2.1.
--------------------------------------------------------------------------------
Reference Transcriptome provided will be used for differential expression.
Searching input for [.fastq, .fastq.gz, .fq, .fq.gz] files.
[db/e0eb07] Submitted process > pipeline:preprocess_ref_annotation
[43/a84148] Submitted process > validate_sample_sheet
[8c/a62576] Submitted process > pipeline:differential_expression:checkSampleSheetCondition
[fd/79b1fb] Submitted process > pipeline:getVersions
[92/d1ed9f] Submitted process > pipeline:preprocess_ref_transcriptome (1)
[04/2b849d] Submitted process > pipeline:getParams
[19/8da516] Submitted process > fastcat (1)
[be/d85c19] Submitted process > fastcat (2)
[3e/195bee] Submitted process > pipeline:differential_expression:build_minimap_index_transcriptome (1)
[d5/7cf2be] Submitted process > fastcat (3)
[ac/8a3202] Submitted process > pipeline:collectFastqIngressResultsInDir (1)
[aa/fd18b6] Submitted process > pipeline:collectFastqIngressResultsInDir (2)
[a0/dbf0b7] Submitted process > fastcat (4)
[f4/a5b383] Submitted process > pipeline:check_annotation_strand
[ee/343391] Submitted process > pipeline:differential_expression:map_transcriptome (1)
[2b/2b122a] Submitted process > pipeline:collectFastqIngressResultsInDir (3)
[10/3411da] Submitted process > pipeline:differential_expression:map_transcriptome (2)
[a7/68c692] Submitted process > pipeline:collectFastqIngressResultsInDir (4)
[6e/348561] Submitted process > pipeline:differential_expression:map_transcriptome (3)
[0d/a3f942] Submitted process > pipeline:differential_expression:map_transcriptome (4)
[9b/806209] Submitted process > pipeline:differential_expression:count_transcripts (1)
[65/7ffc8e] Submitted process > pipeline:differential_expression:count_transcripts (2)
[ec/3b9c68] Submitted process > pipeline:differential_expression:count_transcripts (3)
[d8/32209f] Submitted process > pipeline:differential_expression:count_transcripts (4)
[d5/d1bbae] Submitted process > pipeline:differential_expression:mergeCounts
[fa/2d037a] Submitted process > pipeline:differential_expression:mergeTPM
[a5/4b904d] Submitted process > pipeline:differential_expression:deAnalysis
ERROR ~ Error executing process > 'pipeline:differential_expression:deAnalysis'
Caused by:
  Process `pipeline:differential_expression:deAnalysis` terminated with an error exit status (1)
Command executed:
  mkdir merged
  mkdir de_analysis
  de_analysis.R annotation.gtf 3 1 10 3
Command exit status:
  1
Command output:
  Loading counts, conditions and parameters.
  Checking annotation file type.
  Annotation file type is gtf.
  Checking annotation file for presence of transcript_id versions.
  Annotation file transcript_ids do not include versions so also strip versions from the counts df.
  Loading annotation database.
  Filtering counts using DRIMSeq.
Command error:
  Loading counts, conditions and parameters.
  Checking annotation file type.
  Annotation file type is gtf.
  Checking annotation file for presence of transcript_id versions.
  Annotation file transcript_ids do not include versions so also strip versions from the counts df.
  Warning message:
  closing unused connection 3 (annotation.gtf) 
  Loading annotation database.
  Import genomic features from the file as a GRanges object ... OK
  Prepare the 'metadata' data frame ... OK
  Make the TxDb object ... OK
  Warning message:
  In .local(con, format, text, ...) :
    gff-version directive indicates version is 3, not 2
  Filtering counts using DRIMSeq.
  Error in dmDSdata(counts = counts, samples = coldata) : 
    mode(counts) %in% "numeric" is not TRUE
  Calls: dmDSdata -> stopifnot
  Execution halted
Work dir:
  /mnt/c/Users/juanj/epi2melabs/instances/wf-transcriptomes_01J3GV6PZZKAF5QEH57S1PFWVA/work/a5/4b904ddecf39decedf9000e952c1f9
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
 -- Check '/mnt/c/Users/juanj/epi2melabs/instances/wf-transcriptomes_01J3GV6PZZKAF5QEH57S1PFWVA/nextflow.log' file for details

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

no

Other demo data information

No response

sarahjeeeze commented 1 month ago

Hi, thanks for reporting this. We will look into it. In the meantime, you may like to try the workflow again with paths for the sample sheet and reference annotation files that do not contain a space. For example, /mnt/c/Users/juanj/Desktop/maestria ric/Sample_sheet.csv currently contains a space in "maestria ric".

We do try to make all of our workflows work for paths where there is a space but it can still be the source of some errors.

Riczury commented 1 month ago

Thank you very much, I just did what you recommended but the same error keeps coming up.

sarahjeeeze commented 1 month ago

Would you mind sharing your sample sheet (Sample_sheet.csv)? Also, if you click 'open folder' in the app for the workflow, it should open a folder where you will find the directory /work/a5/4b904ddecf39decedf9000e952c1f9. In its de_analysis folder there should be a counts.tsv file. Does it contain counts? If possible, could you share that too?

Riczury commented 1 month ago

Sure, here is a picture. I tried to upload the files, but it says that it doesn't accept that format. [image: image_2024-08-01_104120912] And this is the sample sheet: Samplesheet.csv. I'll be happy to provide more information if needed to resolve the issue.

sarahjeeeze commented 1 month ago

Hmm, that all looks fine. Would you be able to share your input RNAseq_transcriptome.fas and genomic.gff3? You can put them in a zip folder to share via GitHub. If not, copy and paste just a small section of each file, or confirm that the references used in each match.

sarahjeeeze commented 1 month ago

Also, if you open any of those transcript_counts.tsv files, do they contain counts against the records?
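A quick way to answer this without opening each file by hand is sketched below in Python. It assumes each counts file is tab-separated with a header row, an identifier in the first column and counts in the remaining columns. That layout and the function name `summarise_counts` are assumptions for illustration, not taken from the workflow. An empty table or non-numeric entries would be consistent with the `mode(counts) %in% "numeric" is not TRUE` failure in the log, although this has not been confirmed as the cause.

```python
import csv
import sys
from pathlib import Path


def summarise_counts(path):
    """Return (n_rows, n_non_numeric) for a tab-separated counts table.

    Assumes a header row, an identifier in the first column and count
    values in every remaining column (an assumed layout, not confirmed
    against the workflow's actual output).
    """
    path = Path(path)
    if path.stat().st_size == 0:
        return 0, 0  # completely empty file
    n_rows = 0
    n_non_numeric = 0
    with path.open(newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row
        for row in reader:
            if not row:
                continue  # ignore blank lines
            n_rows += 1
            for value in row[1:]:
                try:
                    float(value)
                except ValueError:
                    n_non_numeric += 1  # e.g. "NA", "" or stray text
    return n_rows, n_non_numeric


if __name__ == "__main__":
    # Pass one or more counts files on the command line.
    for name in sys.argv[1:]:
        rows, bad = summarise_counts(name)
        print(f"{name}: {rows} data rows, {bad} non-numeric values")
```

A result of zero data rows means the file holds no counts at all, which would point to a problem upstream of the differential expression step.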

Riczury commented 1 month ago

Sure, here you go. Also, I can't open the transcript_counts.tsv files; they are empty. Anotacion_y_Transcriptoma_ref.zip

sarahjeeeze commented 1 month ago

Hey, so in precomputed mode the references used in the gff need to match up with the references in the ref_annotation (the first column). I didn't find, for example, RNAseq_batch_1.2.1 in the reference annotation.
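A mismatch like this can be checked with a short script. The Python sketch below rests on two assumptions: that the identifiers to compare are the first token of each FASTA header in the transcriptome file, and that the annotation stores them as `transcript_id` attributes in column 9 (GTF or GFF3 style). The function names are hypothetical and the workflow may match records differently.

```python
import re
import sys

# Matches GTF style (transcript_id "X";) and GFF3 style (transcript_id=X).
# The \b keeps keys such as orig_transcript_id from matching.
TRANSCRIPT_ID = re.compile(r'\btranscript_id[ =]"?([^";]+)"?')


def fasta_ids(lines):
    """Yield the first whitespace-delimited token of each FASTA header."""
    for line in lines:
        if line.startswith(">"):
            header = line[1:].split()
            if header:
                yield header[0]


def annotation_transcript_ids(lines):
    """Collect transcript_id values from column 9 of GTF/GFF lines."""
    ids = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip directives and comments
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 9:
            continue  # not a feature line
        ids.update(TRANSCRIPT_ID.findall(fields[8]))
    return ids


def missing_from_annotation(fasta_lines, annotation_lines):
    """Return FASTA IDs with no matching transcript_id, sorted."""
    known = annotation_transcript_ids(annotation_lines)
    return sorted(set(fasta_ids(fasta_lines)) - known)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: script.py <transcriptome FASTA> <annotation GTF/GFF>
    with open(sys.argv[1]) as fasta, open(sys.argv[2]) as annotation:
        for name in missing_from_annotation(fasta, annotation):
            print(name)
```

Any name printed is present in the transcriptome but absent from the annotation under these assumptions.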

Riczury commented 1 month ago

I didn't do it. I tried to do it, but this happened.

N E X T F L O W  ~  version 23.04.2
Launching `/mnt/c/Users/juanj/epi2melabs/workflows/epi2me-labs/wf-transcriptomes/main.nf` [DE1] DSL2 - revision: 3ecc85aafe
||||||||||   _____ ____ ___ ____  __  __ _____      _       _
||||||||||  | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||       |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||       | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||  |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||  wf-transcriptomes v1.2.1
--------------------------------------------------------------------------------
Core Nextflow options
  runName             : DE_1
  containerEngine     : docker
  launchDir           : /mnt/c/Users/juanj/epi2melabs/instances/wf-transcriptomes_01J4S7H51523WSG4XGFD1EJHX1
  workDir             : /mnt/c/Users/juanj/epi2melabs/instances/wf-transcriptomes_01J4S7H51523WSG4XGFD1EJHX1/work
  projectDir          : /mnt/c/Users/juanj/epi2melabs/workflows/epi2me-labs/wf-transcriptomes
  userName            : epi2mewsl
  profile             : standard
  configFiles         : /mnt/c/Users/juanj/epi2melabs/workflows/epi2me-labs/wf-transcriptomes/nextflow.config
Input Options
  fastq               : /mnt/c/Users/juanj/Desktop/maestri_ric/Mezclados/cDNA
  ref_genome          : /mnt/c/Users/juanj/Desktop/maestri_ric/ncbi_dataset/ncbi_dataset/data/GCA_026122715.1/GCA_026122715.1_ASM2612271v1_genomic.fna
  ref_annotation      : /mnt/c/Users/juanj/Desktop/maestri_ric/anotacion.gff2
  direct_rna          : true
Output Options
  out_dir             : /mnt/c/Users/juanj/epi2melabs/instances/wf-transcriptomes_01J4S7H51523WSG4XGFD1EJHX1/output
Gene Fusion Detection Options
  jaffal_dir          : /mnt/c/home/epi2melabs/JAFFA
!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-transcriptomes for your analysis please cite:
* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x
--------------------------------------------------------------------------------
This is epi2me-labs/wf-transcriptomes v1.2.1.
--------------------------------------------------------------------------------
Searching input for [.fastq, .fastq.gz, .fq, .fq.gz] files.
Doing reference based transcript analysis
[32/812557] Submitted process > fastcat (1)
[e2/707924] Submitted process > pipeline:getVersions
[cf/263520] Submitted process > pipeline:getParams
[d7/5d6235] Submitted process > pipeline:preprocess_ref_annotation
[9a/c6ea05] Submitted process > pipeline:build_minimap_index (1)
[c4/812bb8] Submitted process > pipeline:collectFastqIngressResultsInDir (1)
[14/9d3fc5] Submitted process > pipeline:reference_assembly:map_reads (1)
[02/254241] Submitted process > pipeline:split_bam (1)
[cc/aaaa42] Submitted process > pipeline:assemble_transcripts (1)
ERROR ~ Error executing process > 'pipeline:assemble_transcripts (1)'
Caused by:
  Process pipeline:assemble_transcripts (1) terminated with an error exit status (1)
Command executed:
  stringtie --rf -G amended.anotacion.gff2 -L -v -p 4 --conservative -o cDNA_batch_1.gff -l cDNA_batch_1 cDNA_batch_1.bam 2>/dev/null
Command exit status:
  1
Command output:
  (empty)
Work dir:
  /mnt/c/Users/juanj/epi2melabs/instances/wf-transcriptomes_01J4S7H51523WSG4XGFD1EJHX1/work/cc/aaaa42547138f24a2a1fea2ffe6fa9
Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out
 -- Check '/mnt/c/Users/juanj/epi2melabs/instances/wf-transcriptomes_01J4S7H51523WSG4XGFD1EJHX1/nextflow.log' file for details

sarahjeeeze commented 1 month ago

Sorry, I misunderstood what you have tried. The workflow has only really been set up to work with, and tested on, the ref_annotation and ref_transcriptome files from NCBI and Ensembl. Is your ref_transcriptome a custom file? And where did you get the ref_annotation file from?

Riczury commented 1 month ago

Hi, I used all the transcriptome sequences I obtained and a reference genome I downloaded from NCBI to run the transcriptomes workflow, so that I could then use the precomputed mode. From that run I obtained the ref_transcriptome file. Then I tried to do the analysis using the sequences, the ref_annotation file I downloaded from NCBI, the ref_transcriptome I obtained and the sample sheet.

sarahjeeeze commented 3 weeks ago

Could you link me to the files so I can try them?

Riczury commented 3 weeks ago

Sure: https://drive.google.com/drive/folders/17cSn6-CHiHt-QeuhOZSAQlTcyPp-UHvg?usp=drive_link. Those are the documents that I used to do the analysis.
