counts_summary.Rmd stalls at code chunk 10

ute-hoffmann commented 7 months ago

Description of the bug

The pipeline fails at counts_summary.Rmd, more specifically at unnamed-chunk-10 (Number of barcodes per gene, per sample). The input does not use fitness calculations and does not assign genes to the investigated "sgRNAs" (in this case barcodes). I assume that this breaks "group_by(sample, Gene)". Previously, the pipeline worked on almost exactly this data set, so I am not sure what the actual reason might be.

Command used and terminal output

nextflow run ../../../pipelines/nf-core-crispriscreen/ -profile singularity --input "input/samplesheet_Rubisco.csv" --fasta "input/2023-05-23_Rubisco_barcodes.fa" --outdir "output" --five_prime_adapter GTCTAGAatcgccgaaagtaattcaactccattaa...TCTAGATGCTTACTAGTTACCGCGGCCA --error_rate 0.2 --filter_mapq=1 --max_cpus 5 --max_memory 12GB --run_mageck false --gene_fitness false -resume

(starting at relevant part of output, skipping the nf-core options etc.):
-[nf-core/crispriscreen] Pipeline completed with errors-
Error executing process > 'NFCORE_CRISPRISCREEN:CRISPRISCREEN:RMARKDOWNNOTEBOOK (counts_summary)'

Caused by:
  Process `NFCORE_CRISPRISCREEN:CRISPRISCREEN:RMARKDOWNNOTEBOOK (counts_summary)` terminated with an error exit status (1)

Command executed:

  # Dump .params.yml heredoc (section will be empty if parametrization is disabled)
  cat <<"END_PARAMS_SECTION" > ./.params.yml
  cpus: 2
  artifact_dir: artifacts
  input_dir: ./
  meta:
    id: counts_summary
  END_PARAMS_SECTION

  # Create output directory
  mkdir artifacts

  # Set parallelism for BLAS/MKL etc. to avoid over-booking of resources
  export MKL_NUM_THREADS="2"
  export OPENBLAS_NUM_THREADS="2"
  export OMP_NUM_THREADS="2"

  # Work around  https://github.com/rstudio/rmarkdown/issues/1508
  # If the symbolic link is not replaced by a physical file
  # output- and temporary files will be written to the original directory.
  mv "counts_summary.Rmd" "counts_summary.Rmd.orig"
  cp -L "counts_summary.Rmd.orig" "counts_summary.Rmd"

  # Render notebook
  Rscript - <<EOF
      params = yaml::read_yaml('.params.yml')
      rmarkdown::render('counts_summary.Rmd', params=params, envir=new.env())
      writeLines(capture.output(sessionInfo()), "session_info.log")
  EOF

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_CRISPRISCREEN:CRISPRISCREEN:RMARKDOWNNOTEBOOK":
      rmarkdown: $(Rscript -e "cat(paste(packageVersion('rmarkdown'), collapse='.'))")
  END_VERSIONS

Command exit status:
  1

Command output:
  1/38                   
  2/38 [setup]           
  3/38                   
  4/38 [unnamed-chunk-1] 
  5/38                   
  6/38 [unnamed-chunk-2] 
  7/38                   
  8/38 [unnamed-chunk-3] 
  9/38                   
  10/38 [unnamed-chunk-4] 
  11/38                   
  12/38 [unnamed-chunk-5] 
  13/38                   
  14/38 [unnamed-chunk-6] 
  15/38                   
  16/38 [unnamed-chunk-7] 
  17/38                   
  18/38 [unnamed-chunk-8] 
  19/38                   
  20/38 [unnamed-chunk-9] 
  21/38                   
  22/38 [unnamed-chunk-10]

Command error:

  processing file: counts_summary.Rmd
  1/38                   
  2/38 [setup]           
  3/38                   
  4/38 [unnamed-chunk-1] 
  5/38                   
  6/38 [unnamed-chunk-2] 
  7/38                   
  8/38 [unnamed-chunk-3] 
  9/38                   
  10/38 [unnamed-chunk-4] 
  11/38                   
  12/38 [unnamed-chunk-5] 
  13/38                   
  14/38 [unnamed-chunk-6] 
  15/38                   
  16/38 [unnamed-chunk-7] 
  17/38                   
  18/38 [unnamed-chunk-8] 
  19/38                   
  20/38 [unnamed-chunk-9] 
  21/38                   
  22/38 [unnamed-chunk-10]

  Quitting from lines 192-207 [unnamed-chunk-10] (counts_summary.Rmd)
  Error in `grid.Call.graphics()`:
  ! non-finite location and/or size for viewport
  Backtrace:
    1. rmarkdown::render("counts_summary.Rmd", params = params, envir = new.env())
   38. grid:::drawGTree(x)
   40. grid:::preDraw.gTree(x)
   41. grid:::pushvpgp(x)
   43. grid:::pushgrobvp.viewport(x$vp)
       ...
   45. base::lapply(vps, push.vp, recording)
   47. grid:::push.vp.vpStack(X[[i]], ...)
   48. base::lapply(vp, push.vp, recording)
   50. grid:::push.vp.viewport(X[[i]], ...)
   51. grid:::grid.Call.graphics(C_setviewport, vp, TRUE)
  Execution halted

Relevant files

No response

System information

No response

m-jahn commented 7 months ago

Hey @ute-hoffmann, thanks for reporting. Is there a way to reproduce this bug with toy data? Alternatively, can you look into the bug? Maybe a quick fix would be to add the missing gene-sgRNA relationship by creating a column with pseudo-genes, but I'm not sure; it was just a quick thought.

ute-hoffmann commented 7 months ago

Yesterday I wanted to check whether this error also occurs with the included test data, but then my computer crashed. I will run it today and keep you updated.

ute-hoffmann commented 7 months ago

The problem is not specific to my data. The pipeline also crashes with the included example data when using this command:

nextflow run ./ -profile singularity --input "assets/samplesheet.csv" --fasta "assets/library.fasta" --outdir "results" --max_cpus 5 --max_memory 12GB --run_mageck false --gene_fitness false

Output:

-[nf-core/crispriscreen] Pipeline completed with errors-
Error executing process > 'NFCORE_CRISPRISCREEN:CRISPRISCREEN:RMARKDOWNNOTEBOOK (counts_summary)'

Caused by:
  Process `NFCORE_CRISPRISCREEN:CRISPRISCREEN:RMARKDOWNNOTEBOOK (counts_summary)` terminated with an error exit status (1)

Command executed:

  # Dump .params.yml heredoc (section will be empty if parametrization is disabled)
  cat <<"END_PARAMS_SECTION" > ./.params.yml
  cpus: 2
  artifact_dir: artifacts
  input_dir: ./
  meta:
    id: counts_summary
  END_PARAMS_SECTION

  # Create output directory
  mkdir artifacts

  # Set parallelism for BLAS/MKL etc. to avoid over-booking of resources
  export MKL_NUM_THREADS="2"
  export OPENBLAS_NUM_THREADS="2"
  export OMP_NUM_THREADS="2"

  # Work around  https://github.com/rstudio/rmarkdown/issues/1508
  # If the symbolic link is not replaced by a physical file
  # output- and temporary files will be written to the original directory.
  mv "counts_summary.Rmd" "counts_summary.Rmd.orig"
  cp -L "counts_summary.Rmd.orig" "counts_summary.Rmd"

  # Render notebook
  Rscript - <<EOF
      params = yaml::read_yaml('.params.yml')
      rmarkdown::render('counts_summary.Rmd', params=params, envir=new.env())
      writeLines(capture.output(sessionInfo()), "session_info.log")
  EOF

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_CRISPRISCREEN:CRISPRISCREEN:RMARKDOWNNOTEBOOK":
      rmarkdown: $(Rscript -e "cat(paste(packageVersion('rmarkdown'), collapse='.'))")
  END_VERSIONS

Command exit status:
  1

Command output:
  1/38                   
  2/38 [setup]           
  3/38                   
  4/38 [unnamed-chunk-1] 
  5/38                   
  6/38 [unnamed-chunk-2] 
  7/38                   
  8/38 [unnamed-chunk-3] 
  9/38                   
  10/38 [unnamed-chunk-4] 
  11/38                   
  12/38 [unnamed-chunk-5] 
  13/38                   
  14/38 [unnamed-chunk-6] 
  15/38                   
  16/38 [unnamed-chunk-7] 
  17/38                   
  18/38 [unnamed-chunk-8] 
  19/38                   
  20/38 [unnamed-chunk-9] 
  21/38                   
  22/38 [unnamed-chunk-10]

Command error:

  processing file: counts_summary.Rmd
  1/38                   
  2/38 [setup]           
  3/38                   
  4/38 [unnamed-chunk-1] 
  5/38                   
  6/38 [unnamed-chunk-2] 
  7/38                   
  8/38 [unnamed-chunk-3] 
  9/38                   
  10/38 [unnamed-chunk-4] 
  11/38                   
  12/38 [unnamed-chunk-5] 
  13/38                   
  14/38 [unnamed-chunk-6] 
  15/38                   
  16/38 [unnamed-chunk-7] 
  17/38                   
  18/38 [unnamed-chunk-8] 
  19/38                   
  20/38 [unnamed-chunk-9] 
  21/38                   
  22/38 [unnamed-chunk-10]

  Quitting from lines 192-207 [unnamed-chunk-10] (counts_summary.Rmd)
  Error in `grid.Call.graphics()`:
  ! non-finite location and/or size for viewport
  Backtrace:
    1. rmarkdown::render("counts_summary.Rmd", params = params, envir = new.env())
   38. grid:::drawGTree(x)
   40. grid:::preDraw.gTree(x)
   41. grid:::pushvpgp(x)
   43. grid:::pushgrobvp.viewport(x$vp)
       ...
   45. base::lapply(vps, push.vp, recording)
   47. grid:::push.vp.vpStack(X[[i]], ...)
   48. base::lapply(vp, push.vp, recording)
   50. grid:::push.vp.viewport(X[[i]], ...)
   51. grid:::grid.Call.graphics(C_setviewport, vp, TRUE)
  Execution halted

Work dir:
  /home/ute/Documents/Manuscripts/new_nf/nf-core-crispriscreen/work/76/843dc9780b0f24314e40478aaa62ed

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

ute-hoffmann commented 7 months ago

I think I managed to fix everything; see the pull request.

An update in ggplot2 was to blame. From v3.5.0 on (https://ggplot2.tidyverse.org/news/index.html#ggplot2-350), legend.position seems to work differently than before. From the changelog:

Providing a numeric vector to theme(legend.position) has been deprecated. To set the default legend position inside the plot use theme(legend.position = "inside", legend.position.inside = c(...)) instead.

Easy fix (see commit): change legend.position=0 to legend.position="none".
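
For anyone hitting the same error elsewhere, here is a minimal sketch of the two legend settings mentioned above. The toy data frame and its column names are made up for illustration and are not taken from counts_summary.Rmd; the "inside" variant needs ggplot2 >= 3.5.0.

```r
library(ggplot2)

# Toy data; column names are illustrative only, not from the pipeline
df <- data.frame(x = 1:3, y = c(2, 4, 8), grp = c("a", "b", "c"))

p <- ggplot(df, aes(x, y, colour = grp)) + geom_point()

# Hide the legend entirely (the replacement for legend.position = 0)
p_none <- p + theme(legend.position = "none")

# Place the legend inside the panel using the ggplot2 >= 3.5.0 syntax;
# the coordinates are relative to the panel (0 = left/bottom, 1 = right/top)
p_inside <- p + theme(
  legend.position = "inside",
  legend.position.inside = c(0.9, 0.2)
)

# Report the installed version to check which syntax applies
print(packageVersion("ggplot2"))
```

Printing p_none or p_inside renders the plot; the coordinates c(0.9, 0.2) are an arbitrary choice.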

Further changes: fig.width = figwidth and fig.height = figheight do not seem to be valid settings for the fig.width and fig.height chunk options. I replaced them with out.width="100%" and out.height="100%", hoping that this matches the intended output.
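
As a rough sketch of how these knitr chunk options differ: fig.width and fig.height take numbers (the size of the graphics device in inches), while out.width and out.height take strings that scale the figure in the rendered document. The values 7 and 5 below are arbitrary examples, not the ones used in the notebook.

```r
library(knitr)

# Set defaults for all following chunks; the values are illustrative only
opts_chunk$set(
  fig.width = 7,       # numeric, device width in inches
  fig.height = 5,      # numeric, device height in inches
  out.width = "100%"   # character, width of the figure in the output
)

# Read the options back to confirm what is currently set
print(opts_chunk$get(c("fig.width", "fig.height", "out.width")))
```

The same options can also be given per chunk in the chunk header instead of globally.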

A further problem occurs in the plot for "Sample and replicate similarity with PCA": rendering to .html with knitr yields the following error message:

  Quitting from lines 324-346 [unnamed-chunk-17] (counts_summary.Rmd)
  Error in `plot_theme()`:
  ! The `aspect` theme element is not defined in the element hierarchy.
  Backtrace:
    1. rmarkdown::render("counts_summary.Rmd", params = params, envir = new.env())
    2. knitr::knit(knit_input, knit_output, envir = envir, quiet = quiet)
    3. knitr:::process_file(text, output)
    8. knitr:::process_group.block(group)
    9. knitr:::call_block(x)
       ...
   23. knitr:::knit_print.default(x, ...)
   24. evaluate (local) normal_print(x)
   26. ggplot2:::print.ggplot(x)
   28. ggplot2:::ggplot_gtable.ggplot_built(data)
   29. ggplot2:::plot_theme(plot)
  Execution halted

Fix: delete "aspect = 1" from the code.
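
If a square panel was the intent of that setting (an assumption on my side, I have not checked with the original author of the notebook), the ggplot2 theme element for it is called aspect.ratio. A minimal sketch with made-up PCA-like coordinates:

```r
library(ggplot2)

# Made-up coordinates standing in for PCA scores; not pipeline output
df <- data.frame(PC1 = c(-1, 0, 1), PC2 = c(0.5, -0.5, 0))

# aspect.ratio = 1 gives a panel that is as tall as it is wide
p <- ggplot(df, aes(PC1, PC2)) +
  geom_point() +
  theme(aspect.ratio = 1)
```

Simply deleting the element, as in the fix above, lets the panel fill the available figure area instead.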

m-jahn commented 7 months ago

OK got it. Breaking changes in ggplot2. Actually this was a problem in other pipelines of mine too. I need to pay more attention to deprecation warnings :-)

ute-hoffmann commented 7 months ago

Sorry, I overlooked that the code still has to be merged into the master branch; I closed the issue too early.

m-jahn commented 7 months ago

I will take care of that, it's OK.

MPUSP / nf-core-crispriscreen