epi2me-labs / wf-bacterial-genomes

Small variant calling for haploid samples
https://labs.epi2me.io/

[Bug]: Problem with Metaquast-part of report #8

Closed lknegendorf closed 1 year ago

lknegendorf commented 1 year ago

What happened?

Hi Team,

When I run the workflow, it stops with an error because the report cannot be compiled. This happens with both reference-based and de novo assembly. The cause might be missing reference files for metaquast (see log output). I am using a proxy server for my internet connection. Is there a workaround for this? I could not find the folder mentioned in the error message, so I was unable to provide the reference file manually.

Thank you very much in advance!

Operating System

ubuntu 20.04

Workflow Execution

EPI2ME Labs desktop application

Workflow Execution - EPI2ME Labs Versions

4.1.4

Workflow Execution - CLI Execution Profile

Docker

Workflow Version

v0.2.12

Relevant log output

Workflow execution completed unsuccessfully!

The exit status of the task that caused the workflow execution to fail was: 1.

The full error message was:

Error executing process > 'calling_pipeline:makeReport'

Caused by:
  Process `calling_pipeline:makeReport` terminated with an error exit status (1)

Command executed:

  workflow-glue report     --prokka      --versions versions     --params params.json     --output wf-bacterial-genomes-report.html     --sample_ids 2639127-28052-01

Command exit status:
  1

Command output:
  (empty)

Command error:
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2804: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2804: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('repoze')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2804: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('ruamel')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)
  /home/epi2melabs/conda/lib/python3.8/site-packages/pkg_resources/__init__.py:2804: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('ruamel')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)
  /home/epi2melabs/conda/lib/python3.8/site-packages/BCBio/GFF/GFFParser.py:66: DeprecationWarning: invalid escape sequence \w
    gff3_kw_pat = re.compile("\w+=")
  [09:08:04 - workflow_glue] Starting entrypoint.
  Traceback (most recent call last):
    File "/home/nanopore/epi2melabs/workflows/epi2me-labs/wf-bacterial-genomes/bin/workflow-glue", line 7, in <module>
      cli()
    File "/home/nanopore/epi2melabs/workflows/epi2me-labs/wf-bacterial-genomes/bin/workflow_glue/__init__.py", line 62, in cli
      args.func(args)
    File "/home/nanopore/epi2melabs/workflows/epi2me-labs/wf-bacterial-genomes/bin/workflow_glue/report.py", line 275, in main
      species_stats = run_species_stats(
    File "/home/nanopore/epi2melabs/workflows/epi2me-labs/wf-bacterial-genomes/bin/workflow_glue/report.py", line 146, in run_species_stats
      species_data = pd.read_csv(species_data_path, sep='\t', comment="#")
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/util/_decorators.py", line 211, in wrapper
      return func(*args, **kwargs)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
      return func(*args, **kwargs)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 950, in read_csv
      return _read(filepath_or_buffer, kwds)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 605, in _read
      parser = TextFileReader(filepath_or_buffer, **kwds)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1442, in __init__
      self._engine = self._make_engine(f, self.engine)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1735, in _make_engine
      self.handles = get_handle(
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/common.py", line 856, in get_handle
      handle = open(
  FileNotFoundError: [Errno 2] No such file or directory: 'quast_stats/quast_downloaded_references/blast.res_2639127-28052-01-medaka'

Work dir:
  /home/nanopore/epi2melabs/instances/wf-bacterial-genomes_278008ab-61d9-4c96-8600-5e4e4c7d1453/work/d7/31ebbb6a71721aa9651db0b7fb89e2

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

And also Metaquast.log:

/home/epi2melabs/conda/bin/metaquast.py -o quast_output -t 1 2639127-28052-01.medaka.fasta.gz

Version: 5.2.0

System information:
  OS: Linux-5.15.0-71-generic-x86_64-with-glibc2.10 (linux_64)
  Python version: 3.8.15
  CPUs number: 48

Started: 2023-05-03 08:59:05

Logging to /home/nanopore/epi2melabs/instances/wf-bacterial-genomes_278008ab-61d9-4c96-8600-5e4e4c7d1453/work/06/ae06638763ae742c301cf76dd9e858/quast_output/metaquast.log

Contigs:
  Pre-processing...
  2639127-28052-01.medaka.fasta.gz ==> 2639127-28052-01.medaka

No references are provided, starting to search for reference genomes in SILVA 16S rRNA database and to download them from NCBI...

2023-05-03 08:59:08

Downloading SILVA 16S ribosomal RNA gene database (version 138.1)...

ERROR! Failed downloading SILVA 16S rRNA gene database (http://www.arb-silva.de/fileadmin/silva_databases/release_138.1/Exports/SILVA_138.1_SSURef_NR99_tax_silva.fasta.gz)! The search for reference genomes cannot be performed. Try to download it manually, put under /home/epi2melabs/conda/lib/python3.8/site-packages/quast_libs/silva/ and restart your command.
Reference genomes are not found.

NOTICE: No references are provided, starting regular QUAST with MetaGeneMark gene finder

sarahjeeeze commented 1 year ago

Hi, this error should not occur in the latest version - v0.2.12.

mattdmem commented 1 year ago

QUAST was removed from the workflow.
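For anyone reading the traceback above: the crash happens because the report step opens the metaquast `blast.res_*` file unconditionally, and that file is never written when the SILVA download fails. Below is a minimal sketch of a guard for that case. It is not the workflow's actual code. The function name `read_species_table` is hypothetical, and it uses the standard library `csv` module instead of `pd.read_csv` so that it runs without extra dependencies. It skips whole-line `#` comments only, which roughly mirrors the `comment="#"` argument seen in the traceback.

```python
import csv
from pathlib import Path
from typing import List, Optional


def read_species_table(path: str) -> Optional[List[List[str]]]:
    """Return the rows of a tab-separated file, or None if the file is missing.

    Hypothetical helper: a missing file is treated as "no species data"
    instead of raising FileNotFoundError.
    """
    table = Path(path)
    if not table.is_file():
        # Assumed cause: metaquast produced no BLAST results, for example
        # because the SILVA database could not be downloaded.
        return None
    with table.open(newline="") as handle:
        # Drop whole-line comments before handing the lines to the CSV parser.
        data_lines = (line for line in handle if not line.startswith("#"))
        reader = csv.reader(data_lines, delimiter="\t")
        # Skip blank lines, which csv.reader yields as empty lists.
        return [row for row in reader if row]
```

A caller could then leave the species section out of the report when the function returns `None`, rather than failing the whole `makeReport` process.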