epi2me-labs / wf-artic

ARTIC SARS-CoV-2 workflow and reporting
https://labs.epi2me.io/

[Bug]: Error executing process > 'pipeline:report (1)' when using sample_sheet #83

Closed davidbante closed 1 year ago

davidbante commented 1 year ago

What happened?

When providing a sample_sheet to wf-artic, pipeline:report terminates with an error. All files are nonetheless written to the output directory as expected, except for the HTML report. After omitting the sample_sheet, the workflow finishes just fine.

Initially I thought it might be related to issue #79, but the error persists.

Command: nextflow run epi2me-labs/wf-artic --fastq 01_minknowguppy6.4.6_fastq/fastq_pass/ --scheme_version Midnight-ONT/V3 --out_dir 02_artic-out --artic_threads 12 --pangolin_threads 12 --update_data -c ~/seqresults/nextflow_config.cfg --sample_sheet ../sc2_samplesheets/20230508_samplesheet.csv

For more: 20230531_nextflow_samplesheet_fail.txt

Operating System

ubuntu 20.04

Workflow Execution

Command line

Workflow Execution - EPI2ME Labs Versions

No response

Workflow Execution - CLI Execution Profile

Docker

Workflow Version

v0.3.28-g428300f

Relevant log output

ERROR ~ Error executing process > 'pipeline:report (1)'

Caused by:
  Process `pipeline:report (1)` terminated with an error exit status (1)

Command executed:

  echo "--pangolin pangolin.csv"
      echo "--nextclade nextclade.json"
      echo '[
      {
          "barcode": "barcode66",
          "type": "test_sample",
          "alias": "55267002"
      },
      {
          "barcode": "barcode63",
          "type": "test_sample",
          "alias": "41417806"
      },
      {
          "barcode": "barcode58",
          "type": "test_sample",
          "alias": "42219738"
      },
      {
          "barcode": "barcode65",
          "type": "test_sample",
          "alias": "70105391"
      },
      {
          "barcode": "barcode85",
          "type": "test_sample",
          "alias": "G43-1_170223"
      },
      {
          "barcode": "barcode71",
          "type": "test_sample",
          "alias": "D14-2_210521"
      },
      {
          "barcode": "barcode91",
          "type": "test_sample",
          "alias": "G57-3_280323"
      },
      {
          "barcode": "barcode59",
          "type": "test_sample",
          "alias": "40922561"
      },
      {
          "barcode": "barcode95",
          "type": "test_sample",
          "alias": "F73-3_091222"
      },
      {
          "barcode": "barcode89",
          "type": "test_sample",
          "alias": "G57-1_100323"
      },
      {
          "barcode": "barcode60",
          "type": "test_sample",
          "alias": "43035961"
      },
      {
          "barcode": "barcode81",
          "type": "test_sample",
          "alias": "G15-2_020123"
      },
      {
          "barcode": "barcode61",
          "type": "test_sample",
          "alias": "55930808"
      },
      {
          "barcode": "barcode68",
          "type": "test_sample",
          "alias": "PER400064671"
      },
      {
          "barcode": "barcode57",
          "type": "test_sample",
          "alias": "40944667"
      },
      {
          "barcode": "barcode83",
          "type": "test_sample",
          "alias": "G37-3_140223"
      },
      {
          "barcode": "barcode64",
          "type": "test_sample",
          "alias": "41775697"
      },
      {
          "barcode": "barcode70",
          "type": "test_sample",
          "alias": "H23-09370T"
      },
      {
          "barcode": "barcode62",
          "type": "test_sample",
          "alias": "PER400116966"
      },
      {
          "barcode": "barcode90",
          "type": "test_sample",
          "alias": "G57-2_100323"
      },
      {
          "barcode": "barcode93",
          "type": "test_sample",
          "alias": "G60-1_060323"
      },
      {
          "barcode": "barcode72",
          "type": "test_sample",
          "alias": "E73-1_200522"
      },
      {
          "barcode": "barcode84",
          "type": "test_sample",
          "alias": "G41-2_210223"
      },
      {
          "barcode": "barcode82",
          "type": "test_sample",
          "alias": "G15-3_020123"
      },
      {
          "barcode": "barcode86",
          "type": "test_sample",
          "alias": "G52-1_100323"
      },
      {
          "barcode": "barcode67",
          "type": "test_sample",
          "alias": "PER400273007"
      },
      {
          "barcode": "barcode94",
          "type": "test_sample",
          "alias": "G64-1_030423"
      },
      {
          "barcode": "barcode87",
          "type": "test_sample",
          "alias": "G52-2_180323"
      },
      {
          "barcode": "barcode92",
          "type": "test_sample",
          "alias": "G60_OM"
      },
      {
          "barcode": "barcode69",
          "type": "test_sample",
          "alias": "PER400106437"
      },
      {
          "barcode": "barcode73",
          "type": "test_sample",
          "alias": "F96-2_131222"
      },
      {
          "barcode": "barcode88",
          "type": "test_sample",
          "alias": "G53-1_060323"
      }
  ]' > metadata.json
      workflow-glue report         consensus_status.txt wf-artic-report.html         --pangolin pangolin.csv           --nextclade nextclade.json         --nextclade_errors consensus.errors.csv         --revision master         --commit 428300f74d76e4af18048799db37beb754ec6475         --min_len 150         --max_len 1200         --report_depth 100         --depths depth_stats/*         --fastcat_stats per-read-stats.tsv         --bcftools_stats vcf_stats/*          --versions versions         --params params.json         --consensus_fasta consensus_fasta         --metadata metadata.json

Command exit status:
  1

Command output:
  --pangolin pangolin.csv
  --nextclade nextclade.json

Command error:
  --pangolin pangolin.csv
  --nextclade nextclade.json
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
    warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)
  [11:38:00 - workflow_glue] Starting entrypoint.
  /home/nanouser/.nextflow/assets/epi2me-labs/wf-artic/bin/workflow_glue/report.py:96: DtypeWarning: Columns (2) have mixed types. Specify dtype option on import or set low_memory=False.
    seq_summary = pd.read_csv(args.fastcat_stats, delimiter="\t")
  Traceback (most recent call last):
    File "/home/nanouser/.nextflow/assets/epi2me-labs/wf-artic/bin/workflow-glue", line 7, in <module>

mattdmem commented 1 year ago

Thanks for this @davidbante,

I think this is due to having a mixture of sample names that are integers and strings, e.g. 40944667 and G37-3_140223. Pandas is reporting that this results in a column with mixed data types.
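A minimal sketch of the suspected cause, assuming the per-read stats table has a `sample_name` column (the toy table and its column names are illustrative, not taken from the workflow). By default pandas infers an integer type for all-digit aliases, so a large file parsed in chunks can end up with integers in some rows and strings in others. Pinning the dtype keeps every alias as a string:

```python
import io

import pandas as pd

# Toy stand-in for a per-read stats TSV; the column names are assumptions.
tsv = (
    "read_id\tsample_name\tread_length\n"
    "r1\t40944667\t410\n"
    "r2\t40944667\t395\n"
)

# Default inference: an all-digit alias column is parsed as integers.
inferred = pd.read_csv(io.StringIO(tsv), delimiter="\t")

# Pinning the dtype keeps every alias as a string, whatever it looks like.
pinned = pd.read_csv(
    io.StringIO(tsv), delimiter="\t", dtype={"sample_name": str}
)

print(inferred["sample_name"].dtype)
print(pinned["sample_name"].tolist())
```

Whether the eventual fix in `report.py` takes this form is not stated in this thread; the sketch only shows the general pandas behaviour.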

We'll fix this, but in the meantime please add a prefix to the integer sample names, e.g. sample_40944667, and see if that solves your problem.
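The workaround can be applied to a sample sheet with a few lines of Python. This is only a sketch: the helper name `prefix_numeric_aliases` is hypothetical, and the column names `barcode`, `alias` and `type` are assumed from the metadata shown in the log above.

```python
import csv
import io


def prefix_numeric_aliases(rows, prefix="sample_"):
    """Return copies of the rows with a text prefix on all-digit aliases."""
    fixed = []
    for row in rows:
        row = dict(row)
        if row["alias"].isdigit():
            row["alias"] = prefix + row["alias"]
        fixed.append(row)
    return fixed


# In-memory example sheet; a real sheet would be opened from disk instead.
sheet = io.StringIO(
    "barcode,alias,type\n"
    "barcode57,40944667,test_sample\n"
    "barcode83,G37-3_140223,test_sample\n"
)
rows = prefix_numeric_aliases(csv.DictReader(sheet))

# Write the corrected sheet back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["barcode", "alias", "type"])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Only aliases made up entirely of digits are changed; names such as G37-3_140223 pass through untouched.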

Thanks

Matt

davidbante commented 1 year ago

Hi Matt,

In the past it also worked with mixed integer and string sample names. Now that all sample names are strings, it works. Thanks for your help!

David

mattdmem commented 1 year ago

Thanks @davidbante,

We'll fix! Thanks again... I'll close for now