nf-core / oncoanalyser

A comprehensive cancer DNA/RNA analysis and reporting pipeline
https://nf-co.re/oncoanalyser
MIT License

`Rscript` error when running `PURPLE` #100

Open bounlu opened 1 week ago

bounlu commented 1 week ago

PURPLE fails with the following error:

ERROR ~ Error executing process > 'NFCORE_ONCOANALYSER:TARGETED:PURPLE_CALLING:PURPLE (220024755)'

Caused by:
  Process `NFCORE_ONCOANALYSER:TARGETED:PURPLE_CALLING:PURPLE (220024755)` terminated with an error exit status (1)

Command executed:

  purple \
      -Xmx12240656794 \
       \
      -tumor 220024755 \
       \
      -amber amber \
      -cobalt cobalt \
      -somatic_sv_vcf 220024755.gripss.filtered.vcf.gz \
       \
      -sv_recovery_vcf 220024755.gripss.vcf.gz \
      -somatic_vcf 220024755.sage.somatic.pave.vcf.gz \
       \
      -ref_genome GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
      -ref_genome_version 38 \
      -driver_gene_panel DriverGenePanel.tso500.38.tsv \
      -ensembl_data_dir ensembl_data \
      -somatic_hotspots KnownHotspots.somatic.38.vcf.gz \
       \
      -target_regions_bed target_regions_definition.tso500.38.bed.gz \
      -target_regions_ratios target_regions_ratios.tso500.38.tsv \
      -target_regions_msi_indels target_regions_msi_indels.tso500.38.tsv \
       \
      -gc_profile GC_profile.1000bp.38.cnp \
      -circos $(which circos) \
      -threads 2 \
      -output_dir purple/

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_ONCOANALYSER:TARGETED:PURPLE_CALLING:PURPLE":
      purple: $(purple -version | sed 's/^.* //')
  END_VERSIONS

Command exit status:
  1

Command output:
  07:55:33.685 [INFO ] Purple version 4.0.2
  07:55:33.732 [INFO ] output directory: purple/
  07:55:33.736 [INFO ] reference(NONE) tumor(220024755) running on target-regions only
  07:55:34.069 [INFO ] using ref genome: V38
  07:55:36.470 [INFO ] loaded 7282 target regions bases(total=1314771 coding=1314771) from file(target_regions_definition.tso500.38.bed.gz)
  07:55:36.480 [INFO ] loaded 130 MSI INDELs from file(target_regions_msi_indels.tso500.38.tsv)
  07:55:36.488 [INFO ] target regions: tml(1.0) tmb(0.05) msiIndels(220.0) msiAF(2-3 base=0.15 4 base=0.08) codingBaseFactor(150000)
  07:55:36.490 [INFO ] reading GC Profiles from GC_profile.1000bp.38.cnp
  07:55:38.647 [INFO ] reading Amber QC from amber/220024755.amber.qc
  07:55:38.659 [INFO ] reading Amber BAFs from amber/220024755.amber.baf.tsv.gz
  07:55:38.665 [INFO ] reading Amber PCFs from amber/220024755.amber.baf.pcf
  07:55:38.861 [INFO ] Amber average tumor depth(2659) ambiguous BAF(0.508)
  07:55:38.862 [INFO ] reading Cobalt tumor segments from cobalt/220024755.cobalt.ratio.pcf
  07:55:38.874 [INFO ] reading Cobalt ratios from cobalt/220024755.cobalt.ratio.tsv.gz
  07:55:42.376 [INFO ] loaded 15 somatic SVs from 220024755.gripss.filtered.vcf.gz
  07:55:42.747 [INFO ] loaded 11711 somatic variants from 220024755.sage.somatic.pave.vcf.gz
  07:55:42.747 [INFO ] sample gender is male
  07:55:42.747 [INFO ] applying segmentation
  07:55:42.748 [INFO ] merging reference and tumor ratio break points
  07:55:44.345 [INFO ] purple output directory: purple/
  07:55:44.435 [INFO ] fitting purity
  07:55:46.484 [INFO ] maxDiploidProportion(0.273) diploidCandidates(93) purityRange(0.830 - 1.000) hasTumor(true)
  07:55:46.499 [INFO ] calculating copy number
  07:55:46.575 [INFO ] loading recovery candidates from 220024755.gripss.vcf.gz
  07:55:46.937 [INFO ] reapplying segmentation with 4 recovered structural variants
  07:55:46.939 [INFO ] merging reference and tumor ratio break points
  07:55:48.530 [INFO ] recalculating copy number
  07:55:48.643 [INFO ] modelling somatic peaks
  07:55:48.873 [INFO ] enriching somatic variants
  07:55:49.218 [INFO ] load(8.0 tml=165.3243) msiIndels(0 perMb=0.0000) burden(6.0 perMb=8.2662)
  07:55:49.660 [INFO ] generating QC Stats
  07:55:49.699 [INFO ] generating charts
  07:55:49.720 [INFO ] Generating 220024755.circos.png via command: /usr/local/bin/circos -nosvg -conf purple/circos/220024755.circos.conf -outputdir purple/plot -outputfile 220024755.circos.png
  07:55:49.720 [INFO ] Generating 220024755.input.png via command: /usr/local/bin/circos -nosvg -conf purple/circos/220024755.input.conf -outputdir purple/plot -outputfile 220024755.input.png
  07:56:01.716 [FATAL] Error executing R script.
  07:56:02.819 [WARN ] error generating charts
  07:56:02.819 [ERROR] charting error: java.lang.Exception: charting failed

Command error:
  Backtrace:
       x
    1. \-global clonality_plot(somaticBuckets, clonalityModel)
    2.   \-cowplot::plot_grid(...)
    3.     \-cowplot::align_plots(...)
    4.       \-base::lapply(...)
    5.         \-cowplot (local) FUN(X[[i]], ...)
    6.           +-cowplot::as_gtable(x)
    7.           \-cowplot:::as_gtable.default(x)
    8.             +-cowplot::as_grob(plot)
    9.             \-cowplot:::as_grob.ggplot(plot)
   10.               \-ggplot2::ggplotGrob(plot)
   11.                 +-ggplot2::ggplot_gtable(ggplot_build(x))
   12.                 | \-ggplot2:::attach_plot_env(data$plot$plot_env)
   13.                 |   \-base::options(ggplot2_plot_env = env)
   14.                 +-ggplot2::ggplot_build(x)
   15.                 \-ggplot2:::ggplot_build.ggplot(x)
   16.                   \-layout$train_position(data, scale_x(), scale_y())
   17.                     \-ggplot2 (local) train_position(..., self = self)
   18.                       \-self$facet$train_scales(...)
   19.                         \-ggplot2 (local) train_scales(...)
   20.                           \-ggplot2:::scale_apply(layer_data, y_vars, "train", SCALE_Y, y_scales)
   21.                             \-base::lapply(...)
   22.                               \-ggplot2 (local) FUN(X[[i]], ...)
   23.                                 \-base::lapply(...)
   24.                                   \-ggplot2 (local) FUN(X[[i]], ...)
   25.                                     \-scales[[i]][[method]](data[[var]][scale_index[[i]]])
   26.                                       \-ggplot2 (local) train(..., self = self)
   27.                                         \-cli::cli_abort(...)
   28.                                           \-rlang::abort(...)
  Warning messages:
  1: Removed 1 row containing non-finite outside the scale range (`stat_count()`). 
  2: Removed 201 rows containing non-finite outside the scale range
  (`stat_align()`). 
  3: Removed 1 row containing missing values or values outside the scale range
  (`geom_bar()`). 
  4: Removed 201 rows containing missing values or values outside the scale range
  (`geom_line()`). 
  5: Removed 591 rows containing missing values or values outside the scale range
  (`geom_line()`). 
  Execution halted
  sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
  07:56:01.716 [FATAL] Error executing R script.
  07:56:02.819 [WARN ] error generating charts
  07:56:02.819 [ERROR] charting error: java.lang.Exception: charting failed
  java.lang.Exception: charting failed
    at com.hartwig.hmftools.purple.plot.Charts.write(Charts.java:61)
    at com.hartwig.hmftools.purple.PurpleApplication.performFit(PurpleApplication.java:399)
    at com.hartwig.hmftools.purple.PurpleApplication.run(PurpleApplication.java:157)
    at com.hartwig.hmftools.purple.PurpleApplication.main(PurpleApplication.java:618)
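
The backtrace above starts partway through the R output, so the ggplot2 error message that precedes it is not shown. A minimal sketch for recovering it, assuming the failed task's work directory still holds the `.command.err` file that Nextflow writes for each task. The helper name `show_r_error` and the path in the usage line are hypothetical:

```shell
# Hypothetical helper: print the lines leading up to the first R backtrace
# in a captured stderr file (for a Nextflow task, the .command.err file).
show_r_error() {
    # -m 1 stops at the first "Backtrace:" line; -B 10 prints up to ten
    # lines before it, which is where R normally prints the error message.
    grep -m 1 -B 10 'Backtrace:' "$1"
}

# Usage (placeholder path, substitute the failed task's work directory):
# show_r_error /path/to/work/dir/.command.err
```

If the file contains no `Backtrace:` line, `grep` prints nothing and returns a non-zero status.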


bounlu commented 6 days ago

When I use the --no_charts flag for purple, it succeeds without generating the plots. However, this time orange fails because the expected purple_plot_dir does not exist:

ERROR ~ Error executing process > 'NFCORE_ONCOANALYSER:TARGETED:ORANGE_REPORTING:ORANGE (220024755)'

Caused by:
  Process `NFCORE_ONCOANALYSER:TARGETED:ORANGE_REPORTING:ORANGE (220024755)` terminated with an error exit status (1)

Command executed:

  echo "5.34 [oncoanalyser]" > pipeline_version.txt

  # When WTS data is present, ORANGE expects the somatic SAGE VCF to have appended WTS data; CS indicates this should
  # occur after PURPLE. Since ORANGE only collects the somatic SAGE VCF from the PURPLE output directory, we must
  # prepare accordingly

  # Isofox inputs are also expected to have the tumor sample ID in the filename

  # NOTES(SW): Use of symlinks was causing reliability issues on HPC with Singularity, switched to full file copy instead

  purple_dir_local=purple
  if [[ -n "" ]]; then

      purple_dir_local=purple__prepared;

      if [[ -d ${purple_dir_local}/ ]]; then
          rm -r ${purple_dir_local}/;
      fi

      cp -rL purple ${purple_dir_local}/
      cp -L  ${purple_dir_local}/220024755.purple.somatic.vcf.gz;

      if [[ -n "" ]]; then
          cp -L  ${purple_dir_local}/220024755.purple.germline.vcf.gz;
      fi;

      mkdir -p isofox_dir__prepared/;
      for fp in /*; do
          cp -L ${fp} isofox_dir__prepared/$(sed 's/null/220024755/' <<< ${fp##*/});
      done;

  fi

  # Set input plot directory and create it doesn't exist. See the LINX visualiser module for further info.
  if [[ ! -e plots/reportable/ ]]; then
      mkdir -p plots/reportable/;
  fi;

  # NOTE(SW): '--add-opens java.base/java.time=ALL-UNNAMED' resolves issue writing JSON, see:
  # https://stackoverflow.com/questions/70412805/what-does-this-error-mean-java-lang-reflect-inaccessibleobjectexception-unable/70878195#70878195

  # NOTE(SW): DOID label: 162 [cancer]; Hartwig cohort group: unknown

  mkdir -p output/

  # NOTE(SW): manually locating ORANGE install directory so that we can applu `--add-opens`, won't fix old bioconda recipe
  orange_bin_fp=$(which orange)
  orange_install_dir=$(readlink ${orange_bin_fp} | xargs dirname)
  orange_jar=$(dirname ${orange_bin_fp})/${orange_install_dir}/orange.jar

  java \
      --add-opens java.base/java.time=ALL-UNNAMED \
      -Xmx6120328397 \
      -jar ${orange_jar} \
           \
          \
          -experiment_date $(date +%y%m%d) \
          -add_disclaimer \
          -pipeline_version_file pipeline_version.txt \
          \
          -tumor_sample_id 220024755 \
          -primary_tumor_doids 162 \
          -tumor_sample_wgs_metrics_file 220024755.wgsmetrics \
          -tumor_sample_flagstat_file 220024755.flagstat \
          -sage_dir somatic \
          -purple_dir ${purple_dir_local} \
          -purple_plot_dir ${purple_dir_local}/plot/ \
          -linx_dir linx_somatic \
          -linx_plot_dir plots/reportable/ \
          -lilac_dir lilac \
           \
           \
           \
           \
          \
           \
           \
           \
           \
           \
          \
           \
           \
          \
          -ref_genome_version 38 \
          -doid_json doid.json \
          -cohort_mapping_tsv cohort_mapping.tsv \
          -cohort_percentiles_tsv cohort_percentiles.tsv \
          -known_fusion_file known_fusion_data.38.csv \
          -driver_gene_panel DriverGenePanel.tso500.38.tsv \
          -ensembl_data_dir ensembl_data \
           \
           \
          -output_dir output/

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_ONCOANALYSER:TARGETED:ORANGE_REPORTING:ORANGE":
      orange: $(orange -version | sed 's/^.* //')
  END_VERSIONS

Command exit status:
  1

Command output:
  09:21:47 - [ERROR] - invalid path for config: purple_plot_dir = purple/plot/
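
The ORANGE command above passes `-purple_plot_dir ${purple_dir_local}/plot/`, and the error indicates that this path is not valid once PURPLE has run without charts. A minimal sketch of a check that reproduces the condition, assuming it is run from the ORANGE task's work directory. The function name `check_plot_dir` is hypothetical, and the check only diagnoses the missing directory; it does not make ORANGE succeed:

```shell
# Hypothetical helper: report whether <purple_dir>/plot exists.
check_plot_dir() {
    plot_dir="$1/plot"
    if [ -d "${plot_dir}" ]; then
        echo "found: ${plot_dir}"
    else
        # Report on stderr and signal failure to the caller.
        echo "missing: ${plot_dir}" >&2
        return 1
    fi
}

# Usage, with the PURPLE directory name taken from the command above:
# check_plot_dir purple
```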