epi2me-labs / wf-clone-validation


Error generating report (workflow-glue report) #33

Closed gabyrech closed 8 months ago

gabyrech commented 8 months ago

Operating System

CentOS 7

Other Linux

No response

Workflow Version

v0.5.2

Workflow Execution

Command line

EPI2ME Version

No response

CLI command run

$ nextflow run epi2me-labs/wf-clone-validation -revision prerelease -profile singularity --fastq 'fastq/run1' --basecaller_cfg 'dna_r9.4.1_e8_hac@v3.3' --sample_sheet 'sample_sheets/run1.txt' --out_dir 'results_run1'

Workflow Execution - CLI Execution Profile

singularity

What happened?

Running the command above on an ONT run containing 9 barcodes:

..fastq$ tree -L 2
.
└── run1
    ├── barcode01
    ├── barcode02
    ├── barcode03
    ├── barcode04
    ├── barcode05
    ├── barcode06
    ├── barcode07
    ├── barcode08
    └── barcode09

Assemblies were generated for all nine barcodes (I do see the 9 *.final.fasta files in the output directory), but I then get the error below during report generation.

Note: Running a similar command on the test data worked fine (i.e., all processes finished):

$ nextflow run epi2me-labs/wf-clone-validation -revision prerelease -profile singularity --fastq 'wf-clone-validation/test_data/test' --out_dir 'wf-clone-validation_output_prerelease_v0.5.2'

Relevant log output

N E X T F L O W  ~  version 22.04.3
Launching `https://github.com/epi2me-labs/wf-clone-validation` [mad_hopper] DSL2 - revision: fa1932bd5d [prerelease]

||||||||||   _____ ____ ___ ____  __  __ _____      _       _
||||||||||  | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||       |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||       | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||  |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||  wf-clone-validation v0.5.2-gfa1932b
--------------------------------------------------------------------------------
Core Nextflow options
  revision       : prerelease
  runName        : mad_hopper
  containerEngine: singularity
  launchDir      : /xxx/2/xxx/rechg/projects/plasmids/wf-clone-validation
  workDir        : /xxx/2/xxx/rechg/projects/plasmids/wf-clone-validation/work
  projectDir     : /home/rechg/.nextflow/assets/epi2me-labs/wf-clone-validation
  userName       : rechg
  profile        : singularity
  configFiles    : /home/rechg/.nextflow/assets/epi2me-labs/wf-clone-validation/nextflow.config

Input Options
  fastq          : /xxx/2/xxx/rechg/projects/plasmids/wf-clone-validation/fastq/run1
  basecaller_cfg : dna_r9.4.1_e8_hac@v3.3

Sample Options
  sample_sheet   : /xxx/2/xxx/rechg/projects/plasmids/wf-clone-validation/sample_sheets/run1.txt

Output Options
  out_dir        : results_run1

!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-clone-validation for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

--------------------------------------------------------------------------------
This is epi2me-labs/wf-clone-validation v0.5.2-gfa1932b.
--------------------------------------------------------------------------------
WARN: Nextflow version 22.04.3 does not match workflow required version: >=23.04.2 -- Execution will continue, but things may break!
Checking fastq input.
executor >  local (2)
[58/b5dc90] process > validate_sample_sheet             [100%] 1 of 1, cached: 1 ✔
[03/101738] process > fastcat (9)                       [100%] 9 of 9, cached: 9 ✔
[0d/f26916] process > pipeline:checkIfEnoughReads (4)   [100%] 9 of 9, cached: 9 ✔
[c8/67cf37] process > pipeline:assembleCore (9)         [100%] 9 of 9, cached: 9 ✔
[b0/2ceff8] process > pipeline:lookup_medaka_model (1)  [100%] 1 of 1, cached: 1 ✔
[66/b4c0d5] process > pipeline:medakaPolishAssembly (4) [100%] 9 of 9, cached: 9 ✔
[b3/28e686] process > pipeline:downsampledStats (6)     [100%] 9 of 9, cached: 9 ✔
[ec/f07c00] process > pipeline:findPrimers (8)          [100%] 9 of 9, cached: 9 ✔
[d4/62ea76] process > pipeline:medakaVersion            [100%] 1 of 1, cached: 1 ✔
[60/57d5bd] process > pipeline:getVersions              [100%] 1 of 1, cached: 1 ✔
[82/1696f8] process > pipeline:getParams                [100%] 1 of 1, cached: 1 ✔
[69/0bfdd4] process > pipeline:inserts                  [100%] 1 of 1, cached: 1 ✔
[bb/9594f0] process > pipeline:assembly_qc (8)          [100%] 9 of 9, cached: 9 ✔
[79/3e2912] process > pipeline:runPlannotate (1)        [100%] 1 of 1 ✔
[21/914996] process > pipeline:assemblyMafs (8)         [100%] 9 of 9, cached: 9 ✔
[7b/73332d] process > pipeline:report (1)               [  0%] 0 of 1
[f0/672298] process > output (8)                        [100%] 9 of 9, cached: 9
Error executing process > 'pipeline:report (1)'

Caused by:
  Process `pipeline:report (1)` terminated with an error exit status (1)

Command executed:

  workflow-glue report      wf-clone-validation-report.html     --downsampled_stats downsampled_stats/*     --revision prerelease     --commit fa1932bd5da6b4c577e9218085a60cf541f62089     --status final_status.csv     --per_barcode_stats per_barcode_stats/*     --host_filter_stats host_filter_stats/*     --params params.json     --versions versions     --plannotate_json plannotate_report.json     --lengths plannotate.json     --inserts_json insert_data.json     --qc_inserts qc_inserts     --assembly_quality assembly_quality/*     --mafs mafs

Command exit status:
  1

Command output:
  (empty)

Command error:
  [12:07:44 - workflow_glue] Starting entrypoint.
  Traceback (most recent call last):
    File "/home/rechg/.nextflow/assets/epi2me-labs/wf-clone-validation/bin/workflow-glue", line 7, in <module>
      cli()
    File "/home/rechg/.nextflow/assets/epi2me-labs/wf-clone-validation/bin/workflow_glue/__init__.py", line 72, in cli
      args.func(args)
    File "/home/rechg/.nextflow/assets/epi2me-labs/wf-clone-validation/bin/workflow_glue/report.py", line 140, in main
      report_utils.read_count_barplot(args.per_barcode_stats, report)
    File "/home/rechg/.nextflow/assets/epi2me-labs/wf-clone-validation/bin/workflow_glue/report_utils/report_utils.py", line 53, in read_count_barplot
      pd.DataFrame(seq_summary['sample_name'].value_counts())
executor >  local (2)
[58/b5dc90] process > validate_sample_sheet             [100%] 1 of 1, cached: 1 ✔
[03/101738] process > fastcat (9)                       [100%] 9 of 9, cached: 9 ✔
[0d/f26916] process > pipeline:checkIfEnoughReads (4)   [100%] 9 of 9, cached: 9 ✔
[c8/67cf37] process > pipeline:assembleCore (9)         [100%] 9 of 9, cached: 9 ✔
[b0/2ceff8] process > pipeline:lookup_medaka_model (1)  [100%] 1 of 1, cached: 1 ✔
[66/b4c0d5] process > pipeline:medakaPolishAssembly (4) [100%] 9 of 9, cached: 9 ✔
[b3/28e686] process > pipeline:downsampledStats (6)     [100%] 9 of 9, cached: 9 ✔
[ec/f07c00] process > pipeline:findPrimers (8)          [100%] 9 of 9, cached: 9 ✔
[d4/62ea76] process > pipeline:medakaVersion            [100%] 1 of 1, cached: 1 ✔
[60/57d5bd] process > pipeline:getVersions              [100%] 1 of 1, cached: 1 ✔
[82/1696f8] process > pipeline:getParams                [100%] 1 of 1, cached: 1 ✔
[69/0bfdd4] process > pipeline:inserts                  [100%] 1 of 1, cached: 1 ✔
[bb/9594f0] process > pipeline:assembly_qc (8)          [100%] 9 of 9, cached: 9 ✔
[79/3e2912] process > pipeline:runPlannotate (1)        [100%] 1 of 1 ✔
[21/914996] process > pipeline:assemblyMafs (8)         [100%] 9 of 9, cached: 9 ✔
[7b/73332d] process > pipeline:report (1)               [100%] 1 of 1, failed: 1 ✘
[f0/672298] process > output (8)                        [ 39%] 9 of 23, cached: 9
Error executing process > 'pipeline:report (1)'

Caused by:
  Process `pipeline:report (1)` terminated with an error exit status (1)

Command executed:

  workflow-glue report      wf-clone-validation-report.html     --downsampled_stats downsampled_stats/*     --revision prerelease     --commit fa1932bd5da6b4c577e9218085a60cf541f62089     --status final_status.csv     --per_barcode_stats per_barcode_stats/*     --host_filter_stats host_filter_stats/*     --params params.json     --versions versions     --plannotate_json plannotate_report.json     --lengths plannotate.json     --inserts_json insert_data.json     --qc_inserts qc_inserts     --assembly_quality assembly_quality/*     --mafs mafs

Command exit status:
  1

Command output:
  (empty)

Command error:
  [12:07:44 - workflow_glue] Starting entrypoint.
  Traceback (most recent call last):
    File "/home/rechg/.nextflow/assets/epi2me-labs/wf-clone-validation/bin/workflow-glue", line 7, in <module>
      cli()
    File "/home/rechg/.nextflow/assets/epi2me-labs/wf-clone-validation/bin/workflow_glue/__init__.py", line 72, in cli
      args.func(args)
    File "/home/rechg/.nextflow/assets/epi2me-labs/wf-clone-validation/bin/workflow_glue/report.py", line 140, in main
      report_utils.read_count_barplot(args.per_barcode_stats, report)
    File "/home/rechg/.nextflow/assets/epi2me-labs/wf-clone-validation/bin/workflow_glue/report_utils/report_utils.py", line 53, in read_count_barplot
      pd.DataFrame(seq_summary['sample_name'].value_counts())
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/util/_decorators.py", line 311, in wrapper
      return func(*args, **kwargs)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/frame.py", line 6393, in sort_index
      return super().sort_index(
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/generic.py", line 4544, in sort_index
      indexer = get_indexer_indexer(
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/sorting.py", line 91, in get_indexer_indexer
      indexer = nargsort(
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/sorting.py", line 391, in nargsort
      return items.argsort(ascending=ascending, kind=kind, na_position=na_position)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/arrays/base.py", line 628, in argsort
      return nargsort(
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/sorting.py", line 403, in nargsort
      indexer = non_nan_idx[non_nans.argsort(kind=kind)]
  TypeError: '<' not supported between instances of 'int' and 'str'

Work dir:
  /xxx/2/xxx/rechg/projects/plasmids/wf-clone-validation/work/7b/73332d1dd439db6f68b1fefa4c7b25

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

Application activity log entry

No response

sarahjeeeze commented 8 months ago

Hi, thanks for reporting this, we will see if we can reproduce.

sarahjeeeze commented 8 months ago

Hi, would you be able to update your Nextflow version to >=23.04.2 and try running the workflow again?

gabyrech commented 8 months ago

Hi, I just ran it with version 23.04.4 and got the same error. I am attaching the .nextflow.log file FYI. Thanks! nextflow.log

sarahjeeeze commented 8 months ago

Hi, thanks for that. I am just trying to recreate the error. Would you mind sharing your sample sheet with me, if you are using one?

gabyrech commented 8 months ago

Sure, here you go: sample_sheets_run1.txt

sarahjeeeze commented 8 months ago

Hi, great, thanks. I was able to recreate your error. It is triggered by an alias that consists only of digits. We will add a fix to the workflow shortly to handle aliases that are all numbers. Sorry for that! In the meantime, the workflow should complete if you rename the barcode03 alias to a name that includes a letter.
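For readers hitting the same traceback, here is a minimal sketch of one plausible way a digits-only alias leads to `TypeError: '<' not supported between instances of 'int' and 'str'`. It assumes the per-barcode stats are read as separate tables and then concatenated; the file contents, the `read_id` column, and the helper `count_reads` are hypothetical and are not taken from the workflow's code. Only the `sample_name` column and the `value_counts()` / `sort_index()` calls appear in the traceback above.

```python
import io
import pandas as pd

# Hypothetical per-barcode stats tables (assumed layout, not the real files).
# The alias "2023" consists only of digits, so pandas infers an integer column.
stats_a = io.StringIO("read_id\tsample_name\nr1\tplasmidA\nr2\tplasmidA\n")
stats_b = io.StringIO("read_id\tsample_name\nr3\t2023\n")


def count_reads(buffers, force_str):
    """Concatenate the tables and count reads per sample, sorted by name."""
    # Optionally force the alias column to be read as text.
    dtype = {"sample_name": str} if force_str else None
    frames = []
    for buf in buffers:
        buf.seek(0)  # rewind so the same buffer can be read more than once
        frames.append(pd.read_csv(buf, sep="\t", dtype=dtype))
    combined = pd.concat(frames, ignore_index=True)
    return combined["sample_name"].value_counts().sort_index()


# With type inference, the combined column mixes str and int values,
# and sorting the resulting index raises TypeError.
try:
    count_reads([stats_a, stats_b], force_str=False)
    mixed_sort_failed = False
except TypeError:
    mixed_sort_failed = True

# Reading the alias column as text avoids the mixed-type comparison.
counts = count_reads([stats_a, stats_b], force_str=True)
print(mixed_sort_failed)
print(counts.to_dict())
```

Forcing the column to `str` at read time is only one possible remedy; the fix that was eventually shipped in the workflow may differ.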

gabyrech commented 8 months ago

Thanks, @sarahjeeeze! Renaming the barcode03 alias to something starting with a letter fixed the issue, and the workflow now runs to completion. Thanks!
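Until a fixed release is available, the workaround above can be checked before launching a run. The sketch below flags aliases made up only of digits. It assumes the sample sheet is comma-separated with `barcode` and `alias` columns; the helper name `numeric_aliases` and the example sheet contents are hypothetical.

```python
import csv
import io


def numeric_aliases(sample_sheet_text):
    """Return (barcode, alias) pairs whose alias consists only of digits."""
    reader = csv.DictReader(io.StringIO(sample_sheet_text))
    return [
        (row["barcode"], row["alias"])
        for row in reader
        if row["alias"].strip().isdigit()
    ]


# Hypothetical sample sheet contents, for illustration only.
example_sheet = (
    "barcode,alias\n"
    "barcode01,cloneA\n"
    "barcode02,cloneB\n"
    "barcode03,12345\n"
)

flagged = numeric_aliases(example_sheet)
for barcode, alias in flagged:
    print(f"{barcode}: alias '{alias}' is all digits; consider renaming it")
```

This check only catches integer-like aliases. Values such as `1.5` or `1e3` could also be parsed as numbers and would need a broader test.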