nf-core / differentialabundance

Differential abundance analysis for feature/observation matrices from platforms such as RNA-seq
https://nf-co.re/differentialabundance
MIT License

Null device 1 OR error in nsub parameter #155

Closed alexmascension closed 10 months ago

alexmascension commented 1 year ago

Description of the bug

Hi, I'm running your pipeline on the output of the nf-core/smrnaseq pipeline, which I used for miRNA detection. Currently, I'm stuck at the NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:DESEQ2_DIFFERENTIAL step.

The error reports two things:

Saving results for Normal_Full_vs_High_fat_Full ...
  null device 
            1

I guess this is some problem with saving the file, but I don't know exactly what it is. The output path does not contain any unusual characters, and running with -profile docker,test works fine.

The second problem may be related to

Error in vst(dds, blind = opt$vs_blind, nsub = opt$vst_nsub) : 
    less than 'nsub' rows with mean normalized count > 5, 
    it is recommended to use varianceStabilizingTransformation directly
  Execution halted

I guess this is because the counts file is short (it contains ~1300 entries/genes). However, I've tried lowering --deseq_vst_nsub (down to 10) and it still gives the same error.

Command used and terminal output

COMMAND

nextflow run nf-core/differentialabundance \
     -profile docker \
     --input "$PATH_DIR/samplesheet_merged_differentialabundance.csv" \
     --contrasts "$PATH_DIR/contrasts.csv" \
     --matrix "$PATH_DIR/mature_counts_transposed.csv" \
     --gtf "$PWD/data/gtfs/mmu.gtf" \
     --features_name_col gene_id \
     --outdir "$PATH_DIR/differentialabundance/mature" \
     -resume

ERROR (CUT TO RELEVANT PART)
Command exit status:
  1

Command output:
  Saving results for Normal_Full_vs_High_fat_Full ...
  null device 
            1 

Command error:
      colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
      colWeightedMeans, colWeightedMedians, colWeightedSds,
      colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
      rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
      rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
      rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
      rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
      rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
      rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
      rowWeightedSds, rowWeightedVars

  Loading required package: Biobase
  Welcome to Bioconductor

      Vignettes contain introductory material; view with
      'browseVignettes()'. To cite Bioconductor, see
      'citation("Biobase")', and for packages 'citation("pkgname")'.

  Attaching package: ‘Biobase’

  The following object is masked from ‘package:MatrixGenerics’:

      rowMedians

  The following objects are masked from ‘package:matrixStats’:

      anyMissing, rowMedians

  converting counts to integer mode
  estimating size factors
  estimating dispersions
  gene-wise dispersion estimates: 1 workers
  mean-dispersion relationship
  final dispersion estimates, fitting model and testing: 1 workers
  -- replacing outliers and refitting for 312 genes
  -- DESeq argument 'minReplicatesForReplace' = 7 
  -- original counts are preserved in counts(dds)
  estimating dispersions
  fitting model and testing
  using 'ashr' for LFC shrinkage. If used in published research, please cite:
      Stephens, M. (2016) False discovery rates: a new deal. Biostatistics, 18:2.
      https://doi.org/10.1093/biostatistics/kxw041
  Saving results for Normal_Full_vs_High_fat_Full ...
  null device 
            1 
  Error in vst(dds, blind = opt$vs_blind, nsub = opt$vst_nsub) : 
    less than 'nsub' rows with mean normalized count > 5, 
    it is recommended to use varianceStabilizingTransformation directly
  Execution halted

Relevant files

contrasts.csv
samplesheet_merged_differentialabundance.csv
mature_counts_transposed.csv
mmu.zip
nextflow.log

System information

Nextflow version: 23.04.1.5866
Hardware: Desktop
Executor: local
Container engine: Docker
OS: Ubuntu 22.04
Version of nf-core/smrnaseq: v2.2.1-gf7022ab
Version of nf-core/differentialabundance: v1.2.0

alexmascension commented 1 year ago

Hi! Any news on this? Thanks!

pinin4fjords commented 1 year ago

Sorry for the slow response, it's been a complicated summer for me.

First thing to say is that we have not optimised this workflow for use with small RNAs, and other people have had similar issues, related I believe to the sparsity of these matrices. The error message here is probably bang-on, in that the internal filtering that happens here is excluding all your rows due to the low counts.
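
A rough way to check whether low counts explain the error is to count how many features have a mean count above 5, which is the threshold named in the message. The sketch below is only an approximation: it works on raw counts, whereas vst applies the check to normalized counts. It also assumes the matrix is a plain comma-separated file with a header row, the feature ID in the first column, and one numeric column per sample. The function name count_rows_above is made up for illustration.

```shell
# Count features whose mean RAW count across samples exceeds a threshold.
# Assumptions (not confirmed by the thread): comma-separated file, one
# header row, feature ID in column 1, every other column is a sample count.
count_rows_above() {
    local matrix="$1" threshold="${2:-5}"
    awk -F',' -v t="$threshold" '
        NR > 1 && NF > 1 {
            sum = 0
            for (i = 2; i <= NF; i++) sum += $i   # add up the sample columns
            if (sum / (NF - 1) > t) n++           # mean over samples
        }
        END { print n + 0 }                       # prints 0 if nothing passed
    ' "$matrix"
}

# Example call (path taken from the command in this issue):
# count_rows_above "$PATH_DIR/mature_counts_transposed.csv" 5
```

If the number printed is smaller than the nsub value in effect (DESeq2's own default for vst is 1000), the check reported in the error would be expected to fail.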

You'll probably need to switch deseq2_vs_method to rlog to make this work at all, but I lack the small RNA experience to comment on the likely results.
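
For reference, a minimal sketch of that re-run. It assumes the parameter is passed on the command line as --deseq2_vs_method (the comment above names it without the leading dashes) and reuses every other argument from the original command unchanged. The wrapper function rerun_with_rlog and the NEXTFLOW override variable are hypothetical, added only so the command can be dry-run with echo.

```shell
# Re-run the pipeline with rlog instead of vst for variance stabilisation.
# PATH_DIR must be set as in the original command. NEXTFLOW is a
# hypothetical override for the launcher, e.g. NEXTFLOW=echo for a dry run.
rerun_with_rlog() {
    "${NEXTFLOW:-nextflow}" run nf-core/differentialabundance \
        -profile docker \
        --input "$PATH_DIR/samplesheet_merged_differentialabundance.csv" \
        --contrasts "$PATH_DIR/contrasts.csv" \
        --matrix "$PATH_DIR/mature_counts_transposed.csv" \
        --gtf "$PWD/data/gtfs/mmu.gtf" \
        --features_name_col gene_id \
        --deseq2_vs_method rlog \
        --outdir "$PATH_DIR/differentialabundance/mature" \
        -resume
}

# Dry run: print the command line instead of launching Nextflow.
# NEXTFLOW=echo rerun_with_rlog
```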

alexmascension commented 1 year ago

Nice! I'll give it a try to see what we can do about it. Thanks!

pinin4fjords commented 10 months ago

Closing for now; feel free to reopen with any follow-up.