pandoc conversion failed in differential abundance test run

Christina002128 commented 9 months ago

Description of the bug

When running the differential abundance pipeline test run, the pipeline fails on the RMARKDOWNNOTEBOOK process citing a pandoc conversion failure.

Has anyone encountered this error before? Note that I'm running the pipeline on Apple Silicon (ARM), which is known to have some compatibility problems with Docker.

Command used and terminal output

nextflow run nf-core/differentialabundance -r 1.4.0 -profile test,docker --outdir output -resume

  error [nextflow.exception.ProcessFailedException]: Process `NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:RMARKDOWNNOTEBOOK (SRP254919)` terminated with an error exit status (1)
Dec-20 11:44:43.460 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:RMARKDOWNNOTEBOOK (SRP254919)'

Caused by:
  Process `NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:RMARKDOWNNOTEBOOK (SRP254919)` terminated with an error exit status (1)

Command executed:

  # Dump .params.yml heredoc (section will be empty if parametrization is disabled)
  cat <<"END_PARAMS_SECTION" > ./.params.yml
  cpus: 2
  artifact_dir: artifacts
  input_dir: ./
  meta:
    id: SRP254919
  study_name: SRP254919
  study_type: rnaseq
  study_abundance_type: counts
  report_file: /Users/zhangxiaoyan/.nextflow/assets/nf-core/differentialabundance/assets/differentialabundance_report.Rmd
  report_title: null
  report_author: null
  report_description: null
  report_scree: true
  observations_type: sample
  observations_id_col: sample
  observations_name_col: sample
  features: Mus_musculus.anno.feature_metadata.tsv
  features_type: gene
  features_id_col: gene_id
  features_name_col: gene_name
  features_metadata_cols: gene_id,gene_name,gene_biotype
  features_log2_assays: null
  features_gtf_feature_type: transcript
  features_gtf_table_first_field: gene_id
  filtering_min_samples: 1
  filtering_min_abundance: 10
  filtering_min_proportion: null
  filtering_grouping_var: null
  exploratory_main_variable: contrasts
  exploratory_clustering_method: ward.D2
  exploratory_cor_method: spearman
  exploratory_n_features: 500
  exploratory_whisker_distance: 1.5
  exploratory_mad_threshold: -5
  exploratory_assay_names: raw,normalised,variance_stabilised
  exploratory_final_assay: variance_stabilised
  exploratory_palette_name: Set1
  differential_file_suffix: .deseq2.results.tsv
  differential_feature_id_column: gene_id
  differential_feature_name_column: gene_name
  differential_fc_column: log2FoldChange
  differential_pval_column: pvalue
  differential_qval_column: padj
  differential_min_fold_change: 2
  differential_max_pval: 1
  differential_max_qval: 0.05
  differential_foldchanges_logged: true
  differential_palette_name: Set1
  differential_subset_to_contrast_samples: false
  deseq2_test: Wald
  deseq2_fit_type: parametric
  deseq2_sf_type: ratio
  deseq2_min_replicates_for_replace: 7
  deseq2_use_t: false
  deseq2_lfc_threshold: 0
  deseq2_alt_hypothesis: greaterAbs
  deseq2_independent_filtering: true
  deseq2_p_adjust_method: BH
  deseq2_alpha: 0.1
  deseq2_minmu: 0.5
  deseq2_vs_method: vst
  deseq2_shrink_lfc: true
  deseq2_cores: 1
  deseq2_vs_blind: true
  deseq2_vst_nsub: 500
  gsea_run: true
  gsea_nperm: 1000
  gsea_permute: phenotype
  gsea_scoring_scheme: weighted
  gsea_metric: Signal2Noise
  gsea_sort: real
  gsea_order: descending
  gsea_set_max: 500
  gsea_set_min: 15
  gsea_norm: meandiv
  gsea_rnd_type: no_balance
  gsea_make_sets: true
  gsea_median: false
  gsea_num: 100
  gsea_plot_top_x: 20
  gsea_rnd_seed: timestamp
  gsea_save_rnd_lists: false
  gsea_zip_report: false
  gsea_gene_sets: https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/gene_set_analysis/mh.all.v2022.1.Mm.symbols.gmt
  observations: SRP254919.samplesheet.sample_metadata.tsv
  raw_matrix: SRP254919.salmon.merged.gene_counts.top1000cov.assay.tsv
  normalised_matrix: all.normalised_counts.tsv
  variance_stabilised_matrix: all.vst.tsv
  contrasts_file: SRP254919.contrasts.contrasts_file.tsv
  versions_file: software_versions.yml
  logo: nf-core-differentialabundance_logo_light.png
  css: nf-core_style.css
  citations: CITATIONS.md
  END_PARAMS_SECTION

  # Create output directory
  mkdir artifacts

  # Set parallelism for BLAS/MKL etc. to avoid over-booking of resources
  export MKL_NUM_THREADS="2"
  export OPENBLAS_NUM_THREADS="2"
  export OMP_NUM_THREADS="2"

  # Work around  https://github.com/rstudio/rmarkdown/issues/1508
  # If the symbolic link is not replaced by a physical file
  # output- and temporary files will be written to the original directory.
  mv "differentialabundance_report.Rmd" "differentialabundance_report.Rmd.orig"
  cp -L "differentialabundance_report.Rmd.orig" "SRP254919.Rmd"

  # Render notebook
  Rscript - <<EOF
      params = yaml::read_yaml('.params.yml')

      # Instead of rendering with params, produce a version of the R
      # markdown with param definitions set, so the notebook itself can
      # be reused
      rmd_content <- readLines('SRP254919.Rmd')

      # Extract YAML content between the first two '---'
      start_idx <- which(rmd_content == "---")[1]
      end_idx <- which(rmd_content == "---")[2]
      rmd_yaml_content <- paste(rmd_content[(start_idx+1):(end_idx-1)], collapse = "\n")
      rmd_params <- yaml::yaml.load(rmd_yaml_content)

      # Override the params
      rmd_params[['params']] <- modifyList(rmd_params[['params']], params)

      # Recursive function to add 'value' to list elements, except for top-level
      add_value_recursively <- function(lst, is_top_level = FALSE) {
          if (!is.list(lst)) {
              return(lst)
          }

          lst <- lapply(lst, add_value_recursively)
          if (!is_top_level) {
              lst <- list(value = lst)
          }
          return(lst)
      }

      # Reformat nested lists under 'params' to have a 'value' key recursively
      rmd_params[['params']] <- add_value_recursively(rmd_params[['params']], is_top_level = TRUE)

      # Convert back to YAML string
      updated_yaml_content <- as.character(yaml::as.yaml(rmd_params))

      # Remove the old YAML content
      rmd_content <- rmd_content[-((start_idx+1):(end_idx-1))]

      # Insert the updated YAML content at the right position
      rmd_content <- append(rmd_content, values = unlist(strsplit(updated_yaml_content, split = "\n")), after = start_idx)

      writeLines(rmd_content, 'SRP254919.parameterised.Rmd')

      # Render based on the updated file
      rmarkdown::render('SRP254919.parameterised.Rmd', output_file='SRP254919.html', envir = new.env())
      writeLines(capture.output(sessionInfo()), "session_info.log")
  EOF

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:RMARKDOWNNOTEBOOK":
      rmarkdown: $(Rscript -e "cat(paste(packageVersion('rmarkdown'), collapse='.'))")
  END_VERSIONS

Command exit status:
  1

Command output:
  22/70 [unnamed-chunk-10]
  23/70                   
  24/70 [unnamed-chunk-11]
  25/70                   
  26/70 [unnamed-chunk-12]
  27/70                   
  28/70 [unnamed-chunk-13]
  29/70                   
  30/70 [unnamed-chunk-14]
  31/70                   
  32/70 [unnamed-chunk-15]
  33/70                   
  34/70 [unnamed-chunk-16]
  35/70                   
  36/70 [unnamed-chunk-17]
  37/70                   
  38/70 [unnamed-chunk-18]
  39/70                   
  40/70 [unnamed-chunk-19]
  41/70                   
  42/70 [unnamed-chunk-20]
  43/70                   
  44/70 [unnamed-chunk-21]
  45/70                   
  46/70 [unnamed-chunk-22]
  47/70                   
  48/70 [unnamed-chunk-23]
  49/70                   
  50/70 [unnamed-chunk-24]
  51/70                   
  52/70 [unnamed-chunk-25]
  53/70                   
  54/70 [unnamed-chunk-26]
  55/70                   
  56/70 [unnamed-chunk-27]
  57/70                   
  58/70 [unnamed-chunk-28]
  59/70                   
  60/70 [unnamed-chunk-29]
  61/70                   
  62/70 [unnamed-chunk-30]
  63/70                   
  64/70 [unnamed-chunk-31]
  65/70                   
  66/70 [unnamed-chunk-32]
  67/70                   
  68/70 [unnamed-chunk-33]
  69/70                   
  70/70 [unnamed-chunk-34]
  /usr/local/bin/pandoc +RTS -K512m -RTS SRP254919.parameterised.knit.md --to html4 --from markdown+autolink_bare_uris+tex_math_single_backslash --output SRP254919.html --lua-filter /usr/local/lib/R/library/rmarkdown/rmarkdown/lua/pagebreak.lua --lua-filter /usr/local/lib/R/library/rmarkdown/rmarkdown/lua/latex-div.lua --self-contained --variable bs3=TRUE --section-divs --table-of-contents --toc-depth 4 --variable toc_float=1 --variable toc_selectors=h1,h2,h3,h4 --variable toc_collapsed=1 --variable toc_smooth_scroll=1 --variable toc_print=1 --template /usr/local/lib/R/library/rmarkdown/rmd/h/default.html --highlight-style pygments --variable theme=bootstrap --mathjax --variable 'mathjax-url=https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --include-in-header /tmp/Rtmp5Yk0o5/rmarkdown-strca58864e30.html 

Command error:
  26/70 [unnamed-chunk-12]
  27/70                   
  28/70 [unnamed-chunk-13]
  29/70                   
  30/70 [unnamed-chunk-14]
  31/70                   
  32/70 [unnamed-chunk-15]
  33/70                   
  34/70 [unnamed-chunk-16]
  35/70                   
  36/70 [unnamed-chunk-17]
  37/70                   
  38/70 [unnamed-chunk-18]
  39/70                   
  40/70 [unnamed-chunk-19]
  41/70                   
  42/70 [unnamed-chunk-20]
  43/70                   
  44/70 [unnamed-chunk-21]
  45/70                   
  46/70 [unnamed-chunk-22]
  47/70                   
  48/70 [unnamed-chunk-23]
  49/70                   
  50/70 [unnamed-chunk-24]
  51/70                   
  52/70 [unnamed-chunk-25]
  53/70                   
  54/70 [unnamed-chunk-26]
  55/70                   
  56/70 [unnamed-chunk-27]
  57/70                   
  58/70 [unnamed-chunk-28]
  59/70                   
  60/70 [unnamed-chunk-29]
  61/70                   
  62/70 [unnamed-chunk-30]
  63/70                   
  64/70 [unnamed-chunk-31]
  65/70                   
  66/70 [unnamed-chunk-32]
  67/70                   
  68/70 [unnamed-chunk-33]
  69/70                   
  70/70 [unnamed-chunk-34]
  output file: SRP254919.parameterised.knit.md

  /usr/local/bin/pandoc +RTS -K512m -RTS SRP254919.parameterised.knit.md --to html4 --from markdown+autolink_bare_uris+tex_math_single_backslash --output SRP254919.html --lua-filter /usr/local/lib/R/library/rmarkdown/rmarkdown/lua/pagebreak.lua --lua-filter /usr/local/lib/R/library/rmarkdown/rmarkdown/lua/latex-div.lua --self-contained --variable bs3=TRUE --section-divs --table-of-contents --toc-depth 4 --variable toc_float=1 --variable toc_selectors=h1,h2,h3,h4 --variable toc_collapsed=1 --variable toc_smooth_scroll=1 --variable toc_print=1 --template /usr/local/lib/R/library/rmarkdown/rmd/h/default.html --highlight-style pygments --variable theme=bootstrap --mathjax --variable 'mathjax-url=https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --include-in-header /tmp/Rtmp5Yk0o5/rmarkdown-strca58864e30.html 
  Error: pandoc document conversion failed with error 9
  Execution halted

Work dir:
  /Users/zhangxiaoyan/Documents/MB_data/work/2a/486002eca9878cc99ca5044031e760

Relevant files

nextflow.log

System information

Version: N E X T F L O W version 23.10.0 build 5889 created 15-10-2023 15:07 UTC (16:07 BST) cite doi:10.1038/nbt.3820 http://nextflow.io
Hardware:

                    'c.          xxxMacBook-Pro.local 
                 ,xNMM.          -------------------------------------------- 
               .OMMMMo           OS: macOS 12.3 21E230 arm64 
               OMMM0,            Host: MacBookPro17,1 
     .;loddo:' loolloddol;.      Kernel: 21.4.0 
   cKMMMMMMMMMMNWMMMMMMMMMM0:    Uptime: 19 hours, 21 mins 
 .KMMMMMMMMMMMMMMMMMMMMMMMWd.    Packages: 34 (brew) 
 XMMMMMMMMMMMMMMMMMMMMMMMX.      Shell: zsh 5.8 
;MMMMMMMMMMMMMMMMMMMMMMMM:       Resolution: 1440x900 
:MMMMMMMMMMMMMMMMMMMMMMMM:       DE: Aqua 
.MMMMMMMMMMMMMMMMMMMMMMMMX.      WM: Quartz Compositor 
 kMMMMMMMMMMMMMMMMMMMMMMMMWd.    WM Theme: Blue (Light) 
 .XMMMMMMMMMMMMMMMMMMMMMMMMMMk   Terminal: Apple_Terminal 
  .XMMMMMMMMMMMMMMMMMMMMMMMMK.   Terminal Font: SFMono-Regular 
    kMMMMMMMMMMMMMMMMMMMMMMd     CPU: Apple M1 
     ;KMMMMMMMWXXWMMMMMMMk.      GPU: Apple M1 
       .cooc,.    .,coo:.        Memory: 1921MiB / 16384MiB

Executor: local
Container engine: docker
OS: macOS 12.3 21E230 arm64
nf-core/differentialabundance version: 1.4.0

WackerO commented 9 months ago

Hey Christina, that error seems to indicate insufficient memory. Could you try increasing the memory of the RMARKDOWNNOTEBOOK process? You can do that by saving the code below to a notebook.config file and then passing that file to the pipeline with -c, like so:

nextflow run nf-core/differentialabundance -r 1.4.0 -profile test,docker --outdir output -resume -c path/to/notebook.config

This is the file content:

process {
    withName: RMARKDOWNNOTEBOOK {
        memory = { 16.GB * task.attempt }
    }
}

I suppose this is limited by the memory available on your computer, which I think is 16 GiB (around 17 GB)? If this trick does not work, the dataset you are trying to analyze may simply be too big to run on your Mac...
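
As a rough sketch of the steps in this comment, the shell snippet below writes the override to notebook.config and prints the command to run. The Docker check is an addition that is not from this thread: it assumes that with Docker Desktop on macOS the containers run inside a VM, so the memory they can actually use is whatever `docker info` reports rather than the host's total RAM. The snippet only prints the nextflow command; it does not run it.

```shell
# Write the override from the comment above to a config file.
# The quoted heredoc delimiter stops the shell from expanding anything inside.
config_file="notebook.config"   # file name taken from the comment above
cat > "$config_file" <<'EOF'
process {
    withName: RMARKDOWNNOTEBOOK {
        memory = { 16.GB * task.attempt }
    }
}
EOF

# Assumption (not from the thread): containers can only use the memory that
# the Docker daemon itself sees, which `docker info` reports in bytes.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    docker_bytes="$(docker info --format '{{.MemTotal}}')"
    echo "Memory visible to Docker: $(( docker_bytes / 1024 / 1024 )) MiB"
else
    echo "Docker not reachable; skipping memory check"
fi

# Print (rather than run) the command from the thread with the config attached.
echo "nextflow run nf-core/differentialabundance -r 1.4.0 -profile test,docker --outdir output -resume -c $config_file"
```

If the reported figure is well below the 16.GB requested, raising the memory allotted to Docker may be needed before the override has any effect.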

pinin4fjords commented 8 months ago

@WackerO is correct: memory will be the issue here, though pandoc is unhelpfully cryptic. See similar issues like this one.

Closing the issue for now, feel free to reopen if increasing resources doesn't help.
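
To check whether a rerun failed the same way, the task's work directory (the "Work dir" path in the report above) can be inspected. The helper below is a hypothetical sketch, not part of the pipeline: `summarise_task_dir` is a name made up for illustration. It relies only on the `.exitcode` and `.command.err` files that Nextflow writes into each task directory.

```shell
# Hypothetical helper: summarise a failed Nextflow task directory.
summarise_task_dir() {
    dir="$1"

    # .exitcode holds the task's exit status once the task has finished.
    if [ -f "$dir/.exitcode" ]; then
        echo "exit status: $(cat "$dir/.exitcode")"
    else
        echo "exit status: unknown"
    fi

    # .command.err holds the task's stderr, which is where rmarkdown
    # reports a failed pandoc conversion.
    if grep -q 'pandoc document conversion failed' "$dir/.command.err" 2>/dev/null; then
        echo "pandoc failure: yes"
    else
        echo "pandoc failure: no"
    fi
}
```

For the run reported above, it would be called as `summarise_task_dir /Users/zhangxiaoyan/Documents/MB_data/work/2a/486002eca9878cc99ca5044031e760`.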

ehuang172 commented 1 month ago

Hello! I am having the same error as above but with a slight difference: Fontconfig error: No writable cache directories

I still end up with the same "Error: pandoc document conversion failed with error 9"

I'm running on an Amazon EC2 instance (m4.4xlarge, 16 vCPU 64 GiB memory) so memory doesn't seem to be the issue.

pinin4fjords commented 1 month ago

(The above was discussed on Slack; it was still a memory error.)
