nf-core / rnasplice

rnasplice is a bioinformatics pipeline for RNA-seq alternative splicing analysis
https://nf-co.re/rnasplice
MIT License

`NFCORE_RNASPLICE:RNASPLICE:DRIMSEQ_DEXSEQ_DTU_SALMON:DEXSEQ_DTU (1)` #137

Closed SergioManzano10 closed 5 months ago

SergioManzano10 commented 5 months ago

Description of the bug

I am trying to run the rnasplice pipeline for a DEU analysis. However, I get this error, which refers to a DTU step that I don't want to run yet (I will do it later).

Therefore, I don't understand what the pipeline is trying to do or how I could solve it.

Thank you.

Command used and terminal output

ERROR ~ Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:DRIMSEQ_DEXSEQ_DTU_SALMON:DEXSEQ_DTU (1)'

Caused by:
  Process `NFCORE_RNASPLICE:RNASPLICE:DRIMSEQ_DEXSEQ_DTU_SALMON:DEXSEQ_DTU (1)` terminated with an error exit status (1)

Command executed:

  run_dexseq_dtu.R samples.tsv Contrast_sheet_rnasplice.csv counts.tsv 10

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNASPLICE:RNASPLICE:DRIMSEQ_DEXSEQ_DTU_SALMON:DEXSEQ_DTU":
      r-base: $(echo $(R --version 2>&1) | sed 's/^.*R version //; s/ .*$//')
      bioconductor-dexseq:  $(Rscript -e "library(DEXSeq); cat(as.character(packageVersion('DEXSeq')))")
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
      anyMissing, rowMedians

  Attaching package: 'MatrixGenerics'

  The following objects are masked from 'package:matrixStats':

      colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
      colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
      colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
      colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
      colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
      colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
      colWeightedMeans, colWeightedMedians, colWeightedSds,
      colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
      rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
      rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
      rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
      rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
      rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
      rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
      rowWeightedSds, rowWeightedVars

  The following object is masked from 'package:Biobase':

      rowMedians

  Loading required package: GenomicRanges
  Loading required package: stats4
  Loading required package: S4Vectors

  Attaching package: 'S4Vectors'

  The following object is masked from 'package:base':

      expand.grid

  Loading required package: IRanges
  Loading required package: GenomeInfoDb
  Loading required package: DESeq2
  Loading required package: AnnotationDbi
  Loading required package: RColorBrewer
  converting counts to integer mode
  Warning message:
  In DESeqDataSet(rse, design, ignoreRank = TRUE) :
    some variables in design formula are characters, converting to factors
  Error in `$<-.data.frame`(`*tmp*`, "dispersion", value = NA) :
    replacement has 1 row, data has 0
  Calls: mapply ... colData<- -> makeBigModelFrame -> $<- -> $<-.data.frame
  Execution halted

Relevant files

No response

System information

No response

jma1991 commented 5 months ago

Hi @SergioManzano10

Please provide your command, nextflow config, and log file of your run.

SergioManzano10 commented 5 months ago

I have also attached two screenshots of the contrast sheet and the sample sheet (the sample sheet also contains control samples that are not shown in the image).

Command: cmd="nextflow run nf-core/rnasplice --input $samples --contrasts $contrasts --outdir $outdir --fasta $genome --gtf $gtf --miso_genes ENSG00000185379.21 --edger_exon -profile singularity -c nextflow.config -resume"

Nextflow config:

params {
  config_profile_description = 'bioinfo config'
  config_profile_contact = 'Sergio Manzano sergiomanzano@vhio.net'
  config_profile_url = "tobecopiedingithub"
}
singularity {
  enabled = true
  autoMounts = true
  cacheDir="cache/"
}
executor {
  name = "slurm"
  queueSize = 12
}

process {
  executor = 'slurm'
  queue    = { task.time <= 5.h && task.memory <= 10.GB ? 'short' : (task.memory <= 70.GB ? 'long' : 'highmem') }

  withName: 'SAMTOOLS_SORT' {
    memory = 250.GB
    cpus   = 24
  }
}

params {
  max_memory = 175.GB
  max_cpus = 24
  max_time = 240.h
}

Log file: identical to the terminal output in the bug report above.

[Screenshots attached: contrast_sheet, sample_sheet]

jma1991 commented 5 months ago

Is that your full nextflow.config file?

SergioManzano10 commented 5 months ago

No, sorry. The global one is this:

// Global default params, used in configs
params {

    // Input options
    input                      = null
    contrasts                  = null
    source                     = 'fastq'

    // References
    genome                     = null
    transcript_fasta           = null
    gtf_extra_attributes       = 'gene_name'
    gtf_group_features         = 'gene_id'
    gencode                    = false
    save_reference             = false
    igenomes_base              = 's3://ngi-igenomes/igenomes'
    igenomes_ignore            = false

    // Trimming
    clip_r1                    = null
    clip_r2                    = null
    three_prime_clip_r1        = null
    three_prime_clip_r2        = null
    trim_nextseq               = null
    save_trimmed               = false
    skip_trimming              = false
    skip_trimgalore_fastqc     = false
    min_trimmed_reads          = 10000

    // Alignment
    aligner                    = 'star_salmon'
    pseudo_aligner             = 'salmon'
    bam_csi_index              = false
    seq_center                 = null
    salmon_quant_libtype       = null
    star_ignore_sjdbgtf        = false
    skip_alignment             = false
    save_unaligned             = false
    save_align_intermeds       = false
    save_merged_fastq          = false

    // QC
    skip_bigwig                = true
    skip_fastqc                = false

    // rMATs
    rmats                      = true
    rmats_splice_diff_cutoff   = 0.0001
    rmats_paired_stats         = true
    rmats_read_len             = 40
    rmats_novel_splice_site    = false
    rmats_min_intron_len       = 50
    rmats_max_exon_len         = 500

    // DEXSeq DEU
    dexseq_exon                = true
    save_dexseq_annotation     = false
    gff_dexseq                 = null
    alignment_quality          = 10
    aggregation                = true
    save_dexseq_plot           = true
    n_dexseq_plot              = 10

    // edgeR DEU
    edger_exon                 = true
    save_edger_plot            = true
    n_edger_plot               = 10

    // DEXSeq DTU
    dexseq_dtu                 = true
    dtu_txi                    = 'dtuScaledTPM'

    // Miso
    sashimi_plot               = true
    miso_genes                 = 'ENSG00000004961, ENSG00000005302, ENSG00000147403'
    miso_genes_file            = null
    miso_read_len              = 75
    fig_width                  = 7
    fig_height                 = 7

    // DRIMSeq Filtering
    min_samps_feature_expr     =  2
    min_samps_feature_prop     =  2
    min_samps_gene_expr        =  4
    min_feature_expr           =  10
    min_feature_prop           =  0.1
    min_gene_expr              =  10

    // SUPPA options
    suppa                      = true
    suppa_per_local_event      = true
    suppa_per_isoform          = true
    suppa_tpm                  = null

    // SUPPA Generate events options
    generateevents_pool_genes  = true
    generateevents_event_type  = 'SE SS MX RI FL'
    generateevents_boundary    = 'S'
    generateevents_threshold   = 10
    generateevents_exon_length = 100
    psiperevent_total_filter   = 0

    // SUPPA Diffsplice options
    diffsplice_local_event     = true
    diffsplice_isoform         = true
    diffsplice_method          = 'empirical'
    diffsplice_area            = 1000
    diffsplice_lower_bound     = 0
    diffsplice_gene_correction = true
    diffsplice_paired          = true
    diffsplice_alpha           = 0.05
    diffsplice_median          = false
    diffsplice_tpm_threshold   = 0
    diffsplice_nan_threshold   = 0

    // SUPPA Cluster options
    clusterevents_local_event  = true
    clusterevents_isoform      = true
    clusterevents_sigthreshold = null
    clusterevents_dpsithreshold= 0.05
    clusterevents_eps          = 0.05
    clusterevents_metric       = 'euclidean'
    clusterevents_separation   = null
    clusterevents_min_pts      = 20
    clusterevents_method       = 'DBSCAN'

    // MultiQC options
    multiqc_config             = null
    multiqc_title              = null
    multiqc_logo               = null
    max_multiqc_email_size     = '25.MB'
    multiqc_methods_description = null

    // Boilerplate options
    outdir                     = null
    publish_dir_mode           = 'copy'
    email                      = null
    email_on_fail              = null
    plaintext_email            = false
    monochrome_logs            = false
    hook_url                   = null
    help                       = false
    version                    = false

    // Config options
    config_profile_name        = null
    config_profile_description = null
    custom_config_version      = 'master'
    custom_config_base         = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}"
    config_profile_contact     = null
    config_profile_url         = null

    // Max resource options
    // Defaults only, expecting to be overwritten
    max_memory                 = '128.GB'
    max_cpus                   = 16
    max_time                   = '240.h'

    // Schema validation default options
    validationFailUnrecognisedParams = false
    validationLenientMode            = false
    validationSchemaIgnoreParams     = 'genomes,igenomes_base'
    validationShowHiddenParams       = false
    validate_params                  = true

}

// Load base.config by default for all pipelines
includeConfig 'conf/base.config'

// Load nf-core custom profiles from different Institutions
try {
    includeConfig "${params.custom_config_base}/nfcore_custom.config"
} catch (Exception e) {
    System.err.println("WARNING: Could not load nf-core/config profiles: ${params.custom_config_base}/nfcore_custom.config")
}

// Load nf-core/rnasplice custom profiles from different institutions.
// Warning: Uncomment only if a pipeline-specific institutional config already exists on nf-core/configs!
// try {
//   includeConfig "${params.custom_config_base}/pipeline/rnasplice.config"
// } catch (Exception e) {
//   System.err.println("WARNING: Could not load nf-core/config/rnasplice profiles: ${params.custom_config_base}/pipeline/rnasplice.config")
// }

profiles {
    debug {
        dumpHashes             = true
        process.beforeScript   = 'echo $HOSTNAME'
        cleanup                = false
    }
    conda {
        conda.enabled          = true
        docker.enabled         = false
        singularity.enabled    = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
        apptainer.enabled      = false
    }
    mamba {
        conda.enabled          = true
        conda.useMamba         = true
        docker.enabled         = false
        singularity.enabled    = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
        apptainer.enabled      = false
    }
    docker {
        docker.enabled         = true
        docker.userEmulation   = true
        conda.enabled          = false
        singularity.enabled    = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
        apptainer.enabled      = false
    }
    arm {
        docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64'
    }
    singularity {
        singularity.enabled    = true
        singularity.autoMounts = true
        conda.enabled          = false
        docker.enabled         = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
        apptainer.enabled      = false
    }
    podman {
        podman.enabled         = true
        conda.enabled          = false
        docker.enabled         = false
        singularity.enabled    = false
        shifter.enabled        = false
        charliecloud.enabled   = false
        apptainer.enabled      = false
    }
    shifter {
        shifter.enabled        = true
        conda.enabled          = false
        docker.enabled         = false
        singularity.enabled    = false
        podman.enabled         = false
        charliecloud.enabled   = false
        apptainer.enabled      = false
    }
    charliecloud {
        charliecloud.enabled   = true
        conda.enabled          = false
        docker.enabled         = false
        singularity.enabled    = false
        podman.enabled         = false
        shifter.enabled        = false
        apptainer.enabled      = false
    }
    apptainer {
        apptainer.enabled      = true
        apptainer.autoMounts   = true
        conda.enabled          = false
        docker.enabled         = false
        singularity.enabled    = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
    }
    gitpod {
        executor.name          = 'local'
        executor.cpus          = 4
        executor.memory        = 8.GB
    }
    test                   { includeConfig 'conf/test.config'                   }
    test_full              { includeConfig 'conf/test_full.config'              }
    test_edger             { includeConfig 'conf/test_edger.config'             }
    test_rmats             { includeConfig 'conf/test_rmats.config'             }
    test_dexseq            { includeConfig 'conf/test_dexseq.config'            }
    test_suppa             { includeConfig 'conf/test_suppa.config'             }
    test_fastq             { includeConfig 'conf/test_fastq.config'             }
    test_genome_bam        { includeConfig 'conf/test_genome_bam.config'        }
    test_transcriptome_bam { includeConfig 'conf/test_transcriptome_bam.config' }
    test_salmon_results    { includeConfig 'conf/test_salmon_results.config'    }
}

// Set default registry for Apptainer, Docker, Podman and Singularity independent of -profile
// Will not be used unless Apptainer / Docker / Podman / Singularity are enabled
// Set to your registry if you have a mirror of containers
apptainer.registry   = 'quay.io'
docker.registry      = 'quay.io'
podman.registry      = 'quay.io'
singularity.registry = 'quay.io'

// Nextflow plugins
plugins {
    id 'nf-validation' // Validation of pipeline parameters and creation of an input channel from a sample sheet
}

// Load igenomes.config if required
if (!params.igenomes_ignore) {
    includeConfig 'conf/igenomes.config'
} else {
    params.genomes = [:]
}

// Export these variables to prevent local Python/R libraries from conflicting with those in the container
// The JULIA depot path has been adjusted to a fixed path `/usr/local/share/julia` that needs to be used for packages in the container.
// See https://apeltzer.github.io/post/03-julia-lang-nextflow/ for details on that. Once we have a common agreement on where to keep Julia packages, this is adjustable.

env {
    PYTHONNOUSERSITE = 1
    R_PROFILE_USER   = "/.Rprofile"
    R_ENVIRON_USER   = "/.Renviron"
    JULIA_DEPOT_PATH = "/usr/local/share/julia"
}

// Capture exit codes from upstream processes when piping
process.shell = ['/bin/bash', '-euo', 'pipefail']

def trace_timestamp = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')
timeline {
    enabled = true
    file    = "${params.outdir}/pipeline_info/execution_timeline_${trace_timestamp}.html"
}
report {
    enabled = true
    file    = "${params.outdir}/pipeline_info/execution_report_${trace_timestamp}.html"
}
trace {
    enabled = true
    file    = "${params.outdir}/pipeline_info/execution_trace_${trace_timestamp}.txt"
}
dag {
    enabled = true
    file    = "${params.outdir}/pipeline_info/pipeline_dag_${trace_timestamp}.html"
}

manifest {
    name            = 'nf-core/rnasplice'
    author          = """Ben Southgate, James Ashmore"""
    homePage        = 'https://github.com/nf-core/rnasplice'
    description     = """Alternative splicing analysis using RNA-seq."""
    mainScript      = 'main.nf'
    nextflowVersion = '!>=23.04.0'
    version         = '1.0.1'
    doi             = '10.5281/zenodo.8424632'
}

// Load modules.config for DSL2 module specific options
includeConfig 'conf/modules.config'

// Function to ensure that resource requirements don't go beyond
// a maximum limit
def check_max(obj, type) {
    if (type == 'memory') {
        try {
            if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1)
                return params.max_memory as nextflow.util.MemoryUnit
            else
                return obj
        } catch (all) {
            println "   ### ERROR ###   Max memory '${params.max_memory}' is not valid! Using default value: $obj"
            return obj
        }
    } else if (type == 'time') {
        try {
            if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1)
                return params.max_time as nextflow.util.Duration
            else
                return obj
        } catch (all) {
            println "   ### ERROR ###   Max time '${params.max_time}' is not valid! Using default value: $obj"
            return obj
        }
    } else if (type == 'cpus') {
        try {
            return Math.min( obj, params.max_cpus as int )
        } catch (all) {
            println "   ### ERROR ###   Max cpus '${params.max_cpus}' is not valid! Using default value: $obj"
            return obj
        }
    }
}

jma1991 commented 5 months ago

By default, the workflow executes all available downstream analysis tools. In the complete nextflow.config file, each tool is controlled by a parameter that determines whether it runs (e.g., rmats = true). Your command line argument --edger_exon sets edger_exon to true, which is already the default. However, because every other setting still comes from nextflow.config, all the other tools run as well, including the DEXSeq DTU step that failed. To run only the tool you’re interested in, set the parameters for all the other tools to false in the config file.
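
A minimal sketch of what such an override could look like, using only the on/off parameters that appear in the global nextflow.config quoted above. It assumes the goal is to keep just the edgeR DEU analysis and that the overrides are placed in the custom config passed with -c; both are assumptions, not something confirmed in this thread.

```groovy
// Hypothetical overrides: keep edgeR DEU and switch the other downstream tools off.
// Parameter names are taken from the global nextflow.config quoted in this thread.
params {
    edger_exon   = true   // edgeR DEU (kept)
    dexseq_exon  = false  // DEXSeq DEU; leave true if that analysis is also wanted
    dexseq_dtu   = false  // DEXSeq DTU, the step that failed in the original report
    rmats        = false  // rMATS
    suppa        = false  // SUPPA
    sashimi_plot = false  // MISO sashimi plots; leave true if the --miso_genes plots are wanted
}
```

Since these are ordinary pipeline parameters, they can also be set on the command line in the same way as --edger_exon in the command above (for example --rmats false).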

SergioManzano10 commented 5 months ago

Okay, thank you!

However, I am now facing a new problem that, as far as I can see, has not been solved previously:

ERROR ~ Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:DRIMSEQ_DEXSEQ_DTU_SALMON:DRIMSEQ_FILTER (1)'

Caused by:
  Process `NFCORE_RNASPLICE:RNASPLICE:DRIMSEQ_DEXSEQ_DTU_SALMON:DRIMSEQ_FILTER (1)` terminated with an error exit status (1)

Command executed:

  run_drimseq_filter.R salmon.merged.txi.dtu.rds tximport.tx2gene.tsv prueba_sample_sheet_rnasplice.csv \
      4 \
      2 \
      2 \
      10 \
      0.1 \
      10

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNASPLICE:RNASPLICE:DRIMSEQ_DEXSEQ_DTU_SALMON:DRIMSEQ_FILTER":
      r-base: $(echo $(R --version 2>&1) | sed 's/^.*R version //; s/ .*$//')
      bioconductor-drimseq: $(Rscript -e "library(DRIMSeq); cat(as.character(packageVersion('DRIMSeq')))")
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  WARNING: Skipping mount /usr/local/var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container

  Attaching package: 'DRIMSeq'

  The following object is masked from 'package:base':

      proportions

  Error in .local(x, ...) :
    min_samps_gene_expr >= 0 && min_samps_gene_expr <= ncol(x@counts) is not TRUE
  Calls: <Anonymous> -> <Anonymous> -> .local -> stopifnot
  Execution halted

jma1991 commented 5 months ago

This new problem looks like a duplicate of #126. I will close this issue, as your original problem has been solved. Please monitor the other thread for updates.
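
Read literally, the assertion in the DRIMSEQ_FILTER error stops the script when min_samps_gene_expr (4 in the command shown) is larger than the number of sample columns in the count matrix. The thread does not say how many samples prueba_sample_sheet_rnasplice.csv contained, and the actual resolution is tracked in #126. Purely as a sketch, assuming a hypothetical run with only three samples, the DRIMSeq filtering thresholds from the global config could be lowered like this:

```groovy
// Hypothetical values for an assumed three-sample run; not a confirmed fix for #126.
// Parameter names come from the "DRIMSeq Filtering" section of the global nextflow.config.
params {
    min_samps_gene_expr    = 3  // must not exceed the number of samples (default above: 4)
    min_samps_feature_expr = 2  // default above: 2
    min_samps_feature_prop = 2  // default above: 2
}
```

If the count matrix has no sample columns at all, for example because sample names do not match between files, lowering these thresholds only hides the underlying problem, so the sample sheet is worth checking first.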