epi2me-labs / wf-amplicon

report not generated, error: division by zero #2

Closed warthmann closed 11 months ago

warthmann commented 1 year ago

Operating System

Other Linux (please specify below)

Other Linux

ubuntu 20.04

Workflow Version

v0.2.1

Workflow Execution

EPI2ME Desktop application

EPI2ME Version

v5.0.2

CLI command run

No response

Workflow Execution - CLI Execution Profile

None

What happened?

One target, PCR-amplified from 6 samples, barcoded with BC01-BC06 (from kit PBK004) in a 2nd PCR. Base-called with guppy.

I ran wf-amplicon successfully on the base-called reads that were not demultiplexed and got a nice HTML report and all other output files (BAM, VCF, etc.). In this run I had subsampled to 1000 reads.

I then demultiplexed my reads using guppy_barcoder

$ cut -f 7 barcoding_summary.txt | sort | uniq -c
      1 barcode_front_id
   7684 BC01_FWD
   8370 BC02_FWD
   5059 BC03_FWD
   8996 BC04_FWD
   9262 BC05_FWD
   9216 BC06_FWD
     23 BC07_FWD
     27 BC08_FWD
     32 BC09_FWD
     38 BC10_FWD
     48 BC11_FWD
     71 BC12_FWD
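For anyone who prefers to do the same tally in Python, here is a minimal sketch of the `cut | sort | uniq -c` step. The column name `barcode_front_id` is taken from the header shown above; the helper name `count_barcodes` and the three-row example input are made up for illustration and are not the real data.

```python
import csv
import io
from collections import Counter


def count_barcodes(handle, column="barcode_front_id"):
    """Count reads per value of `column` in a tab-separated summary file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row[column] for row in reader)


# Tiny made-up stand-in for barcoding_summary.txt (not the real file).
example = io.StringIO(
    "read_id\tbarcode_front_id\n"
    "r1\tBC01_FWD\n"
    "r2\tBC01_FWD\n"
    "r3\tBC07_FWD\n"
)
counts = count_barcodes(example)
for barcode, n in sorted(counts.items()):
    print(f"{n:>6} {barcode}")
```

With a real file, `open("barcoding_summary.txt", newline="")` would take the place of the `io.StringIO` object.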

I then ran wf-amplicon again, this time subsampling to 2000 reads per barcode. It stopped with an error and I am attaching the log. From the "Report" (zip of the HTML is attached) I can deduce that all steps completed successfully, and the error is thrown at the report generation step. Any attention and help to solve this is greatly appreciated.

report.zip

Relevant log output

N E X T F L O W  ~  version 23.04.1
Launching `/home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/main.nf` [naughty_lewin] DSL2 - revision: 12f1ac5b3a
WARN: Found unexpected parameters:
* --process_label: wfamplicon
- Ignore this warning: params.schema_ignore_params = "process_label" 
||||||||||   _____ ____ ___ ____  __  __ _____      _       _
||||||||||  | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||       |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||       | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||  |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||  wf-amplicon v0.2.1
--------------------------------------------------------------------------------
Core Nextflow options
  runName                : naughty_lewin
  containerEngine        : docker
  launchDir              : /home/pbgl/epi2melabs/instances/wf-amplicon_8100fd8a-8046-408b-b785-8720a6dbe836
  workDir                : /home/pbgl/epi2melabs/instances/wf-amplicon_8100fd8a-8046-408b-b785-8720a6dbe836/work
  projectDir             : /home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon
  userName               : pbgl
  profile                : standard
  configFiles            : /home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/nextflow.config
Input Options
  fastq                  : /home/pbgl/MinION-basecalls/Barley-Amplicons/fastq_bascalled_06_07_2022_demultiplexed
  reference              : /home/pbgl/sandbox/analysis-barley-amplicon/reference.fasta
Pre-processing Options
  min_read_length        : 1000
  reads_downsampling_size: 2000
Variant calling options
  basecaller_cfg         : dna_r9.4.1_e8.1_hac
Output Options
  out_dir                : /home/pbgl/epi2melabs/instances/wf-amplicon_8100fd8a-8046-408b-b785-8720a6dbe836/output
!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-amplicon for your analysis please cite:
* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x
--------------------------------------------------------------------------------
This is epi2me-labs/wf-amplicon v0.2.1.
--------------------------------------------------------------------------------
Checking fastq input.
[80/1040cd] Submitted process > fastcat (7)
[09/a944b2] Submitted process > pipeline:getVersions
[72/d8e885] Submitted process > fastcat (2)
[f2/760102] Submitted process > fastcat (4)
[01/ea2f6b] Submitted process > fastcat (6)
[99/58bec8] Submitted process > fastcat (5)
[a1/507502] Submitted process > fastcat (1)
[44/53d42a] Submitted process > pipeline:getParams
[13/b326b5] Submitted process > pipeline:variantCallingPipeline:lookupMedakaVariantModel
[74/8edde1] Submitted process > pipeline:variantCallingPipeline:sanitizeRefFile
[5f/6180a9] Submitted process > fastcat (3)
[1f/4cdc20] Submitted process > pipeline:downsampleReads (1)
[8c/be07b6] Submitted process > pipeline:addMedakaToVersionsFile
[5e/2c656b] Submitted process > pipeline:downsampleReads (2)
[41/a353ea] Submitted process > pipeline:downsampleReads (3)
[83/3a86fb] Submitted process > pipeline:downsampleReads (4)
[0d/9aafb9] Submitted process > pipeline:downsampleReads (5)
[08/868aeb] Submitted process > pipeline:downsampleReads (6)
[3c/2e1eac] Submitted process > pipeline:downsampleReads (7)
[15/627748] Submitted process > pipeline:porechop (1)
[c1/7b436a] Submitted process > pipeline:porechop (2)
[83/bda564] Submitted process > pipeline:porechop (3)
[ac/d6a69e] Submitted process > pipeline:porechop (4)
[80/0fd3e3] Submitted process > pipeline:porechop (5)
[be/16fef3] Submitted process > pipeline:porechop (6)
[41/27dcaa] Submitted process > pipeline:porechop (7)
[22/671899] Submitted process > pipeline:variantCallingPipeline:alignReads (1)
[71/7143e3] Submitted process > pipeline:variantCallingPipeline:alignReads (2)
[94/11783b] Submitted process > pipeline:variantCallingPipeline:alignReads (3)
[5e/547818] Submitted process > pipeline:variantCallingPipeline:alignReads (4)
[27/a5e566] Submitted process > pipeline:variantCallingPipeline:alignReads (5)
[69/a675cc] Submitted process > pipeline:variantCallingPipeline:alignReads (6)
[57/33e3e8] Submitted process > pipeline:variantCallingPipeline:alignReads (7)
[3d/004645] Submitted process > pipeline:variantCallingPipeline:bamstats (1)
[fc/85650c] Submitted process > pipeline:variantCallingPipeline:medakaConsensus (1)
[9b/01a2b2] Submitted process > pipeline:variantCallingPipeline:bamstats (2)
[ab/27dc2a] Submitted process > pipeline:variantCallingPipeline:medakaConsensus (2)
[ac/52ca23] Submitted process > pipeline:variantCallingPipeline:bamstats (3)
[37/220da2] Submitted process > pipeline:variantCallingPipeline:medakaConsensus (3)
[f8/e27369] Submitted process > pipeline:variantCallingPipeline:bamstats (4)
[c5/d5b7e6] Submitted process > pipeline:variantCallingPipeline:medakaConsensus (4)
[a8/291eaa] Submitted process > pipeline:variantCallingPipeline:bamstats (5)
[d9/557b23] Submitted process > pipeline:variantCallingPipeline:medakaConsensus (5)
[55/5501c2] Submitted process > pipeline:variantCallingPipeline:bamstats (6)
[6d/2a2e74] Submitted process > pipeline:variantCallingPipeline:medakaConsensus (6)
[37/ef9bb3] Submitted process > pipeline:variantCallingPipeline:medakaConsensus (7)
[d2/32eeb4] Submitted process > pipeline:variantCallingPipeline:bamstats (7)
[17/edfac1] Submitted process > pipeline:variantCallingPipeline:mosdepth (1)
[29/ffd82c] Submitted process > pipeline:variantCallingPipeline:medakaVariant (6)
[f3/5cac86] Submitted process > pipeline:variantCallingPipeline:mosdepth (2)
[aa/6f4fe4] Submitted process > pipeline:variantCallingPipeline:mosdepth (3)
[29/4f1003] Submitted process > pipeline:variantCallingPipeline:mosdepth (4)
[76/f30e6a] Submitted process > pipeline:variantCallingPipeline:mosdepth (5)
[da/c272de] Submitted process > pipeline:variantCallingPipeline:mosdepth (6)
[e4/09263a] Submitted process > pipeline:variantCallingPipeline:mosdepth (7)
[7f/18ad0a] Submitted process > pipeline:variantCallingPipeline:medakaVariant (2)
[01/67d70d] Submitted process > pipeline:variantCallingPipeline:medakaVariant (3)
[58/94ab02] Submitted process > pipeline:variantCallingPipeline:medakaVariant (7)
[1b/97f06a] Submitted process > pipeline:variantCallingPipeline:medakaVariant (4)
[10/bb811d] Submitted process > pipeline:variantCallingPipeline:medakaVariant (1)
[c2/e06af1] Submitted process > pipeline:variantCallingPipeline:medakaVariant (5)
[b5/8d3707] Submitted process > pipeline:variantCallingPipeline:concatMosdepthResultFiles (1)
[ff/0d7475] Submitted process > pipeline:variantCallingPipeline:concatMosdepthResultFiles (6)
[34/ee026c] Submitted process > pipeline:variantCallingPipeline:concatMosdepthResultFiles (3)
[f6/ad6117] Submitted process > pipeline:variantCallingPipeline:concatMosdepthResultFiles (5)
[1e/68ae67] Submitted process > pipeline:variantCallingPipeline:concatMosdepthResultFiles (4)
[4b/f251ce] Submitted process > pipeline:variantCallingPipeline:concatMosdepthResultFiles (2)
[ce/1257bd] Submitted process > pipeline:variantCallingPipeline:concatMosdepthResultFiles (7)
[c8/4379b8] Submitted process > pipeline:collectFilesInDir (1)
[b8/104b7c] Submitted process > pipeline:collectFilesInDir (2)
[76/bfeb49] Submitted process > pipeline:collectFilesInDir (3)
[04/98a560] Submitted process > pipeline:collectFilesInDir (4)
[85/59dade] Submitted process > pipeline:collectFilesInDir (5)
[6b/9e5e3a] Submitted process > pipeline:collectFilesInDir (6)
[96/7f0e03] Submitted process > pipeline:collectFilesInDir (7)
[18/528b59] Submitted process > pipeline:makeReport (1)
ERROR ~ Error executing process > 'pipeline:makeReport (1)'
Caused by:
  Process `pipeline:makeReport (1)` terminated with an error exit status (1)
Command executed:
  workflow-glue report         --report-fname wf-amplicon-report.html         --data data         --reference reference.fasta         --meta-json metadata.json         --versions versions.txt         --params params.json
Command exit status:
  1
Command output:
  (empty)
Command error:
  /home/epi2melabs/conda/lib/python3.8/site-packages/si_prefix/__init__.py:44: DeprecationWarning: invalid escape sequence \s
    u'(?P<si_unit>[%s])?\s*' % SI_PREFIX_UNITS)
  /home/epi2melabs/conda/lib/python3.8/site-packages/si_prefix/__init__.py:249: DeprecationWarning: invalid escape sequence \s
    u'(?P<si_unit>[%s])?\s*$' % SI_PREFIX_UNITS)
  [18:29:11 - workflow_glue] Starting entrypoint.
  [E::idx_find_and_load] Could not retrieve index file for 'data/barcode03/medaka.annotated.vcf.gz'
  [E::idx_find_and_load] Could not retrieve index file for 'data/barcode07/medaka.annotated.vcf.gz'
  [E::idx_find_and_load] Could not retrieve index file for 'data/barcode02/medaka.annotated.vcf.gz'
  [E::idx_find_and_load] Could not retrieve index file for 'data/barcode05/medaka.annotated.vcf.gz'
  [E::idx_find_and_load] Could not retrieve index file for 'data/barcode01/medaka.annotated.vcf.gz'
  [E::idx_find_and_load] Could not retrieve index file for 'data/barcode06/medaka.annotated.vcf.gz'
  [E::idx_find_and_load] Could not retrieve index file for 'data/barcode04/medaka.annotated.vcf.gz'
  /home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/bin/workflow_glue/report_util.py:120: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
    basic_summary["mean_length"] = self.post_trim_per_file_stats.eval(
  /home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/bin/workflow_glue/report_util.py:120: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
    basic_summary["mean_length"] = self.post_trim_per_file_stats.eval(
  /home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/bin/workflow_glue/report_util.py:120: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
    basic_summary["mean_length"] = self.post_trim_per_file_stats.eval(
  /home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/bin/workflow_glue/report_util.py:120: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
    basic_summary["mean_length"] = self.post_trim_per_file_stats.eval(
  /home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/bin/workflow_glue/report_util.py:120: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
    basic_summary["mean_length"] = self.post_trim_per_file_stats.eval(
  /home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/bin/workflow_glue/report_util.py:120: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
    basic_summary["mean_length"] = self.post_trim_per_file_stats.eval(
  Traceback (most recent call last):
    File "/home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/bin/workflow-glue", line 7, in <module>
      cli()
    File "/home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/bin/workflow_glue/__init__.py", line 62, in cli
      args.func(args)
    File "/home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/bin/workflow_glue/report.py", line 161, in main
      basic_summary = d.get_basic_summary_stats()
    File "/home/pbgl/epi2melabs/workflows/epi2me-labs/wf-amplicon/bin/workflow_glue/report_util.py", line 133, in get_basic_summary_stats
      basic_summary["overall_mean_depth"] = summed_depths.sum() / ref_lengths.sum()
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/ops/common.py", line 81, in new_method
      return method(self, other)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/arraylike.py", line 210, in __truediv__
      return self._arith_method(other, operator.truediv)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/series.py", line 6108, in _arith_method
      return base.IndexOpsMixin._arith_method(self, other, op)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/base.py", line 1348, in _arith_method
      result = ops.arithmetic_op(lvalues, rvalues, op)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/ops/array_ops.py", line 232, in arithmetic_op
      res_values = _na_arithmetic_op(left, right, op)  # type: ignore[arg-type]
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/ops/array_ops.py", line 171, in _na_arithmetic_op
      result = func(left, right)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/computation/expressions.py", line 239, in evaluate
      return _evaluate(op, op_str, a, b)  # type: ignore[misc]
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/computation/expressions.py", line 70, in _evaluate_standard
      return op(a, b)
  ZeroDivisionError: division by zero
Work dir:
  /home/pbgl/epi2melabs/instances/wf-amplicon_8100fd8a-8046-408b-b785-8720a6dbe836/work/18/528b59cbd76eb2e90ff30875139744
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
 -- Check '/home/pbgl/epi2melabs/instances/wf-amplicon_8100fd8a-8046-408b-b785-8720a6dbe836/nextflow.log' file for details

Application activity log entry

In the activity log, this workflow instance ("naughty_lewin") is marked as "completed":

{
  "name": "Launch workflow",
  "description": "Begin running a new analysis",
  "updates": [
    {
      "message": "Checking runner"
    },
    {
      "message": "Java is ready to use."
    },
    {
      "message": "Nextflow is ready to use."
    },
    {
      "message": "Docker is online and ready to use."
    },
    {
      "message": "All Setup tests are passing"
    },
    {
      "message": "Creating instance: naughty_lewin"
    },
    {
      "message": "Instance workflow launched"
    },
    {
      "message": "Instance created: naughty_lewin"
    }
  ],
  "id": "b281a925-f87c-4420-b7c1-6d2f646d1732",
  "percentage": 100,
  "status": "COMPLETED",
  "createdAt": "2023-07-08T18:22:57.504Z",
  "updatedAt": "2023-07-08T18:22:59.096Z",
  "metadata": {
    "javaCmd": "/home/pbgl/epi2melabs/bin/java/bin/java",
    "javaArgs": [
      "-h"
    ],
    "javaVersion": "openjdk 18.0.2 2022-07-19\nOpenJDK Runtime Environment (build 18.0.2+9-61)\nOpenJDK 64-Bit Server VM (build 18.0.2+9-61, mixed mode, sharing)\n",
    "javaExitCode": "0",
    "nxfCmd": "/usr/lib/epi2me/resources/nextflow-all",
    "nxfArgs": [
      "-h"
    ],
    "nxfExitCode": "0",
    "dockerCmd": "docker",
    "dockerArgs": [
      "images"
    ],
    "dockerVersion": "Docker version 20.10.21, build 20.10.21-0ubuntu1~20.04.2\n",
    "dockerExitCode": "0"
  }
}

warthmann commented 1 year ago

turns out that the error does not occur when I provide a sample_sheet

julibeg commented 1 year ago

Hi there!

> turns out that the error does not occur when I provide a sample_sheet

Interesting! Would it be possible to share the reference file and sample sheet?

warthmann commented 1 year ago

Sure, they are attached. Note also the inconsistency in guppy, where it reports BC01-BC06 but names the directories barcode01-barcode06; wf-amplicon seems to go by the directory names.

reference.zip BC2sample.zip
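As a rough illustration of bridging the two naming schemes, the sketch below converts guppy-style IDs such as `BC01_FWD` into directory-style names such as `barcode01` and writes them into a small CSV. The `barcode,alias` column layout is an assumption about what the sample sheet looks like, and the helper names and aliases (`sample_A`, `sample_B`) are hypothetical; the attached BC2sample file may well differ.

```python
import csv
import io
import re


def guppy_id_to_dirname(barcode_id):
    """Convert e.g. 'BC01_FWD' or 'BC01' to 'barcode01'."""
    match = re.fullmatch(r"BC(\d+)(?:_\w+)?", barcode_id)
    if match is None:
        raise ValueError(f"unrecognised barcode id: {barcode_id!r}")
    return f"barcode{int(match.group(1)):02d}"


def write_sample_sheet(handle, aliases):
    """Write a CSV with assumed columns 'barcode' and 'alias'."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["barcode", "alias"])
    for barcode_id, alias in aliases.items():
        writer.writerow([guppy_id_to_dirname(barcode_id), alias])


# Hypothetical aliases, for illustration only.
sheet = io.StringIO()
write_sample_sheet(sheet, {"BC01": "sample_A", "BC02_FWD": "sample_B"})
print(sheet.getvalue())
```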

julibeg commented 1 year ago

I tried various things but unfortunately was unable to reproduce the error.

The line that fails is

basic_summary["overall_mean_depth"] = summed_depths.sum() / ref_lengths.sum()

I.e. ref_lengths.sum() needs to be zero for the above to occur, and I was not able to find any scenario in which this happens. Could you please double-check that you can reproduce the error using the reference file you shared?
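To make the failure mode concrete, here is a minimal sketch of the same ratio with a guard for a zero denominator. It uses plain Python lists rather than the pandas objects in `report_util.py`, and the function `overall_mean_depth` is a hypothetical stand-in, not the workflow's actual code or the fix that was applied.

```python
def overall_mean_depth(summed_depths, ref_lengths):
    """Return total depth / total reference length, or None if the length is 0."""
    total_length = sum(ref_lengths)
    if total_length == 0:
        # Nothing to divide by, e.g. no reference lengths were collected.
        return None
    return sum(summed_depths) / total_length


print(overall_mean_depth([1500, 500], [1000, 1000]))  # 1.0
print(overall_mean_depth([], []))  # None
```

Returning `None` here is just one option; the point is only that the division is reached when the summed reference lengths are zero.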

If it happens again, could you please do the following:

Thanks!

julibeg commented 11 months ago

I'm closing this for now; please feel free to re-open if the problem occurs again!