epi2me-labs / wf-clone-validation


Plannotate error executing process #49

Closed ChristopherRichie closed 1 month ago

ChristopherRichie commented 5 months ago

Operating System

Windows 10

Other Linux

centos Rocky Linux 8.7 (Green Obsidian)

Workflow Version

v1.2.0

Workflow Execution

EPI2ME Desktop application

EPI2ME Version

No response

CLI command run

filetype="pod5"
barcode=$param1
exp_number=$param2
approx_size=$param3
reference=$param4
bc="BC"${barcode/barcode/""}
output=$bc"_basecalling"
MY_SAMPLE=$bc
source="./pod5_pass/"$barcode
#source="./fast5_pass/"$barcode 
clone_validation_output=$bc"_clone_validation"

nextflow run epi2me-labs/wf-clone-validation -r v1.2.0 \
    -c //data/chrisr/nextflow.config.Qi \
    -profile biowulflocal \
    --basecaller_cfg $basecaller_config \
    --fastq "./"$output"/"$bc'_sorted_wf-basecalling.fq.gz' \
    --out_dir $clone_validation_output \
    --db_directory //data/chrisr/wf-clone-validation-db \
    --threads 14

Workflow Execution - CLI Execution Profile

None

What happened?

||||||||||   _____ ____ ___ ____  __  __ _____      _       _
||||||||||  | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||       |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||       | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||  |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||  wf-clone-validation v1.2.0-g2c04b9d

Core Nextflow options
  revision        : v1.2.0
  runName         : intergalactic_hodgkin
  containerEngine : singularity
  launchDir       : /gpfs/gsfs12/users/chrisr/CAM0237/no_sample/20240607_1721_MC-110461_APQ435_7b46463a
  workDir         : /gpfs/gsfs12/users/chrisr/CAM0237/no_sample/20240607_1721_MC-110461_APQ435_7b46463a/work
  projectDir      : /home/chrisr/.nextflow/23.04.1/assets/epi2me-labs/wf-clone-validation
  userName        : chrisr
  profile         : biowulflocal
  configFiles     : /home/chrisr/.nextflow/23.04.1/assets/epi2me-labs/wf-clone-validation/nextflow.config, /data/chrisr/nextflow.config.Qi

Input Options
  fastq           : ./BC06_basecalling/BC06_sorted_wf-basecalling.fq.gz

Output Options
  out_dir         : BC06_clone_validation

Advanced Options
  db_directory    : //data/chrisr/wf-clone-validation-db

Miscellaneous Options
  threads         : 14

!! Only displaying parameters that differ from the pipeline defaults !!

If you use epi2me-labs/wf-clone-validation for your analysis please cite:


This is epi2me-labs/wf-clone-validation v1.2.0-g2c04b9d.

WARN: Nextflow version 23.04.1 does not match workflow required version: >=23.04.2 -- Execution will continue, but things may break!
Searching input for [.fastq, .fastq.gz, .fq, .fq.gz] files.
executor >  local (16)
[cd/21dd09] process > fastcat (1)                       [100%] 1 of 1 ✔
[a7/3e7a8b] process > pipeline:checkIfEnoughReads (1)   [100%] 1 of 1 ✔
[df/b587e1] process > pipeline:assembleCore (1)         [100%] 1 of 1 ✔
[4f/dd33e7] process > pipeline:lookup_medaka_model (1)  [100%] 1 of 1 ✔
[ad/1f278b] process > pipeline:medakaPolishAssembly (1) [100%] 1 of 1 ✔
[63/763b83] process > pipeline:downsampledStats (1)     [100%] 1 of 1 ✔
[f0/166092] process > pipeline:findPrimers (1)          [100%] 1 of 1 ✔
[64/e4ce4e] process > pipeline:medakaVersion            [100%] 1 of 1 ✔
[62/ebfedb] process > pipeline:flyeVersion              [100%] 1 of 1 ✔
[82/ec3703] process > pipeline:getVersions              [100%] 1 of 1 ✔
[cc/792f5b] process > pipeline:getParams                [100%] 1 of 1 ✔
[9e/ebb9c0] process > pipeline:inserts                  [100%] 1 of 1 ✔
[60/675690] process > pipeline:assembly_qc (1)          [100%] 1 of 1 ✔
[5c/ce8e45] process > pipeline:runPlannotate (1)        [  0%] 0 of 1
[b6/b87e3b] process > pipeline:assemblyMafs (1)         [100%] 1 of 1 ✔
[-        ] process > pipeline:report                   -
[92/700a3d] process > output (1)                        [100%] 1 of 1
ERROR ~ Error executing process > 'pipeline:runPlannotate (1)'

Caused by:
  Process `pipeline:runPlannotate (1)` terminated with an error exit status (1)

Command executed:

  if [ -e "assemblies/OPTIONAL_FILE" ]; then
      assemblies=""
  else
      assemblies="--sequences assemblies/"
  fi
  workflow-glue run_plannotate $assemblies --database wf-clone-validation-db

Command exit status:
  1

Command output:
  (empty)

Command error:
  /home/epi2melabs/conda/lib/python3.8/site-packages/plannotate/annotate.py:218: DeprecationWarning: invalid escape sequence \|
    problem_name = "pdb\|(.*)\|"
  [13:38:23 - workflow_glue] Starting entrypoint.
  2024-06-10 13:38:23.632
    Warning: to view this Streamlit app on a browser, run it with the following
    command:

Relevant log output

Jun-10 13:38:03.540 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 12; name: pipeline:assemblyMafs (1); status: COMPLETED; exit: 0; error: -; workDir: /gpfs/gsfs12/users/chrisr/CAM0237/no_sample/20240607_1721_MC-110461_APQ435_7b46463a/work/b6/b87e3b0c76cd77acc3bc8f6191ac42]
Jun-10 13:38:25.265 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 16; name: pipeline:inserts; status: COMPLETED; exit: 0; error: -; workDir: /gpfs/gsfs12/users/chrisr/CAM0237/no_sample/20240607_1721_MC-110461_APQ435_7b46463a/work/9e/ebb9c0b6c748c74c711917435f8481]
Jun-10 13:38:25.274 [Task monitor] DEBUG nextflow.processor.TaskProcessor - Process pipeline:inserts > Skipping output binding because one or more optional files are missing: fileoutparam<0>
Jun-10 13:38:25.589 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 15; name: pipeline:runPlannotate (1); status: COMPLETED; exit: 1; error: -; workDir: /gpfs/gsfs12/users/chrisr/CAM0237/no_sample/20240607_1721_MC-110461_APQ435_7b46463a/work/5c/ce8e45faefa5ad8388082bc6cc606d]
Jun-10 13:38:25.592 [Task monitor] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=pipeline:runPlannotate (1); work-dir=/gpfs/gsfs12/users/chrisr/CAM0237/no_sample/20240607_1721_MC-110461_APQ435_7b46463a/work/5c/ce8e45faefa5ad8388082bc6cc606d
  error [nextflow.exception.ProcessFailedException]: Process `pipeline:runPlannotate (1)` terminated with an error exit status (1)
Jun-10 13:38:25.602 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'pipeline:runPlannotate (1)'

Caused by:
  Process `pipeline:runPlannotate (1)` terminated with an error exit status (1)

Command executed:

  if [ -e "assemblies/OPTIONAL_FILE" ]; then
      assemblies=""
  else
      assemblies="--sequences assemblies/"
  fi
  workflow-glue run_plannotate $assemblies --database wf-clone-validation-db

Command exit status:
  1

Command output:
  (empty)

Command error:
  /home/epi2melabs/conda/lib/python3.8/site-packages/plannotate/annotate.py:218: DeprecationWarning: invalid escape sequence \|
    problem_name = "pdb\|(.*)\|"
  [13:38:23 - workflow_glue] Starting entrypoint.
  2024-06-10 13:38:23.632 
    Warning: to view this Streamlit app on a browser, run it with the following
    command:

      streamlit run /home/chrisr/.nextflow/23.04.1/assets/epi2me-labs/wf-clone-validation/bin/workflow-glue [ARGUMENTS]
  Traceback (most recent call last):
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/streamlit/runtime/legacy_caching/caching.py", line 678, in get_or_create_cached_value
      return_value = _read_from_cache(
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/streamlit/runtime/legacy_caching/caching.py", line 435, in _read_from_cache
      raise e
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/streamlit/runtime/legacy_caching/caching.py", line 420, in _read_from_cache
      return _read_from_mem_cache(
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/streamlit/runtime/legacy_caching/caching.py", line 337, in _read_from_mem_cache
      raise CacheKeyNotFoundError("Key not found in mem cache")
  streamlit.runtime.legacy_caching.caching.CacheKeyNotFoundError: Key not found in mem cache

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "/home/chrisr/.nextflow/23.04.1/assets/epi2me-labs/wf-clone-validation/bin/workflow-glue", line 7, in <module>
      cli()
    File "/home/chrisr/.nextflow/23.04.1/assets/epi2me-labs/wf-clone-validation/bin/workflow_glue/__init__.py", line 72, in cli
      args.func(args)
    File "/home/chrisr/.nextflow/23.04.1/assets/epi2me-labs/wf-clone-validation/bin/workflow_glue/run_plannotate.py", line 230, in main
      tup_dic, report, plannotate_dic = attempt_annotation(
    File "/home/chrisr/.nextflow/23.04.1/assets/epi2me-labs/wf-clone-validation/bin/workflow_glue/run_plannotate.py", line 200, in attempt_annotation
      tup_dic, clean_df = per_assembly(sample_file, name)
    File "/home/chrisr/.nextflow/23.04.1/assets/epi2me-labs/wf-clone-validation/bin/workflow_glue/run_plannotate.py", line 99, in per_assembly
      plot, annotations, clean_df = run_plannotate(sample_file)
    File "/home/chrisr/.nextflow/23.04.1/assets/epi2me-labs/wf-clone-validation/bin/workflow_glue/run_plannotate.py", line 22, in run_plannotate
      df = annotate(
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/plannotate/annotate.py", line 344, in annotate
      blastDf = get_raw_hits(query, linear, yaml_file)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/streamlit/runtime/legacy_caching/caching.py", line 717, in wrapped_func
      return get_or_create_cached_value()
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/streamlit/runtime/legacy_caching/caching.py", line 694, in get_or_create_cached_value
      return_value = non_optional_func(*args, **kwargs)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/plannotate/annotate.py", line 282, in get_raw_hits
      hits = BLAST(seq = query, db = database)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/plannotate/annotate.py", line 41, in BLAST
      inDf = parse_infernal(tmp.name)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/plannotate/infernal.py", line 12, in parse_infernal
      col_widths = [len(ele)+1 for ele in lines[1].split()]
  IndexError: list index out of range

Work dir:
  /gpfs/gsfs12/users/chrisr/CAM0237/no_sample/20240607_1721_MC-110461_APQ435_7b46463a/work/5c/ce8e45faefa5ad8388082bc6cc606d

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
Jun-10 13:38:25.609 [Task monitor] DEBUG nextflow.Session - Session aborted -- Cause: Process `pipeline:runPlannotate (1)` terminated with an error exit status (1)
Jun-10 13:38:25.611 [Task monitor] DEBUG nextflow.Session - The following nodes are still active:
[process] output
  status=ACTIVE
  port 0: (queue) closed; channel: fname
  port 1: (cntrl) -     ; channel: $

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

clone_val_test  clone_val_test.tar.gz
[chrisr@cn0843 test_data]$ nextflow run epi2me-labs/wf-clone-validation \
> --fastq clone_val_test/fastq --primers clone_val_test/primers.tsv \
> --host_reference clone_val_test/host_reference.fa.gz --regions_bedfile clone_val_test/reference.bed \
> --insert_reference clone_val_test/insert_reference.fasta --sample_sheet clone_val_test/sample_sheet.csv \
> -profile standard
N E X T F L O W  ~  version 23.04.1
Project `epi2me-labs/wf-clone-validation` is currently stickied on revision: v1.2.0 -- you need to explicitly specify a revision with the option `-r` in order to use it
[chrisr@cn0843 test_data]$ nextflow run epi2me-labs/wf-clone-validation -r v1.2.0 --fastq clone_val_test/fastq --primers clone_val_test/primers.tsv --host_reference clone_val_test/host_reference.fa.gz --regions_bedfile clone_val_test/reference.bed --insert_reference clone_val_test/insert_reference.fasta --sample_sheet clone_val_test/sample_sheet.csv -profile standard
N E X T F L O W  ~  version 23.04.1
NOTE: Your local project version looks outdated - a different revision is available in the remote repository [d3b8d21b6e]
Launching `https://github.com/epi2me-labs/wf-clone-validation` [drunk_descartes] DSL2 - revision: 2c04b9d884 [v1.2.0]

||||||||||   _____ ____ ___ ____  __  __ _____      _       _
||||||||||  | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||       |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||       | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||  |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||  wf-clone-validation v1.2.0-g2c04b9d
--------------------------------------------------------------------------------
Core Nextflow options
  revision        : v1.2.0
  runName         : drunk_descartes
  containerEngine : docker
  launchDir       : /gpfs/gsfs12/users/chrisr/CAM0237/no_sample/20240607_1721_MC-110461_APQ435_7b46463a/test_data
  workDir         : /gpfs/gsfs12/users/chrisr/CAM0237/no_sample/20240607_1721_MC-110461_APQ435_7b46463a/test_data/work
  projectDir      : /home/chrisr/.nextflow/23.04.1/assets/epi2me-labs/wf-clone-validation
  userName        : chrisr
  profile         : standard
  configFiles     : /home/chrisr/.nextflow/23.04.1/assets/epi2me-labs/wf-clone-validation/nextflow.config

Input Options
  fastq           : clone_val_test/fastq
  primers         : clone_val_test/primers.tsv

Reference Genome Options
  insert_reference: clone_val_test/insert_reference.fasta
  host_reference  : clone_val_test/host_reference.fa.gz
  regions_bedfile : clone_val_test/reference.bed

Sample Options
  sample_sheet    : clone_val_test/sample_sheet.csv

!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-clone-validation for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

--------------------------------------------------------------------------------
This is epi2me-labs/wf-clone-validation v1.2.0-g2c04b9d.
--------------------------------------------------------------------------------
WARN: Nextflow version 23.04.1 does not match workflow required version: >=23.04.2 -- Execution will continue, but things may break!
WARN: Overriding the approx size parameter with per sample approx sizes provided by the sample_sheet.
Searching input for [.fastq, .fastq.gz, .fq, .fq.gz] files.
executor >  local (4)
[c0/f838a5] process > validate_sample_sheet            [100%] 1 of 1, failed: 1 ✘
[-        ] process > fastcat                          -
[-        ] process > pipeline:checkIfEnoughReads      -
[-        ] process > pipeline:filterHostReads         -
[-        ] process > pipeline:assembleCore            -
[69/eb33ec] process > pipeline:lookup_medaka_model (1) [100%] 1 of 1, failed: 1 ✘
[-        ] process > pipeline:medakaPolishAssembly    -
[-        ] process > pipeline:downsampledStats        -
[-        ] process > pipeline:findPrimers             -
[79/d1a8a9] process > pipeline:medakaVersion           [100%] 1 of 1, failed: 1 ✘
[-        ] process > pipeline:flyeVersion             -
[-        ] process > pipeline:getVersions             -
[be/4ca9fd] process > pipeline:getParams               [100%] 1 of 1, failed: 1 ✘
[-        ] process > pipeline:inserts                 -
[-        ] process > pipeline:assembly_qc             -
[-        ] process > pipeline:insert_qc               -
[-        ] process > pipeline:runPlannotate           -
[-        ] process > pipeline:assemblyMafs            -
[-        ] process > pipeline:report                  -
[-        ] process > output                           -
ERROR ~ Error executing process > 'pipeline:medakaVersion'

Caused by:
  Process `pipeline:medakaVersion` terminated with an error exit status (127)

Command executed:

  medaka --version | sed 's/ /,/' >> "medaka_version.txt"

Command exit status:
  127

Command output:
  (empty)

Command error:
  .command.run: line 290: docker: command not found

Work dir:
  /gpfs/gsfs12/users/chrisr/CAM0237/no_sample/20240607_1721_MC-110461_APQ435_7b46463a/test_data/work/79/d1a8a935d50933717a50a4acfb4944

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details

[chrisr@cn0843 test_data]$

sarahjeeeze commented 5 months ago

Hi, please run the workflow again without the parameter --db_directory //data/chrisr/wf-clone-validation-db. If you do want to supply the database, you first need to download it, but the same database is now included in the workflow by default.
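
For context on the traceback in the report: its last frame evaluates `lines[1]` on the parsed Infernal output, which raises `IndexError` whenever that output has fewer than two lines. The sketch below reproduces that failure mode. `column_widths` and `column_widths_checked` are hypothetical helpers, not part of pLannotate, and it is only an assumption (not confirmed in this thread) that the search produced no output because the supplied database directory did not match what the workflow expected.

```python
def column_widths(lines):
    """Mirror the expression shown in the traceback's last frame (hypothetical helper)."""
    return [len(ele) + 1 for ele in lines[1].split()]


def column_widths_checked(lines):
    """Same computation, but fail with a descriptive message on short input."""
    if len(lines) < 2:
        raise ValueError(
            f"expected at least 2 lines of tabular output, got {len(lines)}"
        )
    return column_widths(lines)


# A two-line table: a header row followed by a dashed ruler row.
table = ["#target name", "------- ----"]
print(column_widths(table))

# Empty output reproduces the IndexError seen in the report.
try:
    column_widths([])
except IndexError as exc:
    print("IndexError:", exc)
```

A guard like the one in `column_widths_checked` would not fix the underlying problem, but it would turn the bare `IndexError` into a message that points at the empty search output.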

ChristopherRichie commented 4 months ago

My script worked after I removed that vestigial parameter. Thanks.

I am getting "good" results with version 1.3.1. Can you direct me to where the "default database" is located?

The "INPUTS" table indicates that "db_directory" is a tar.gz, but I am unable to locate it in the GitHub folders. I would like to see how the default db is organized.

Thanks

sarahjeeeze commented 4 months ago

Internally, it is part of the package files for pLannotate. A copy of it is here: https://github.com/mmcguffi/pLannotate/releases/download/v1.2.0/BLAST_dbs.tar.gz
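
To see how that archive is organized without unpacking it, its members can be listed directly. This is a minimal sketch, assuming BLAST_dbs.tar.gz has already been downloaded into the current directory. `list_archive_members` is a hypothetical helper, and the thread does not say what the archive contains, so no expected output is shown.

```python
import os
import tarfile


def list_archive_members(path):
    """Return the member names of a (possibly compressed) tar archive, in archive order."""
    # "r:*" lets tarfile detect gzip/bz2/xz compression automatically.
    with tarfile.open(path, "r:*") as archive:
        return archive.getnames()


# Assumed local file name; adjust to wherever the archive was saved.
archive_path = "BLAST_dbs.tar.gz"
if os.path.exists(archive_path):
    for name in list_archive_members(archive_path):
        print(name)
else:
    print(f"{archive_path} not found; download it first")
```

Listing the members shows the directory layout that a custom db_directory would presumably need to mirror, though that requirement is an inference rather than something stated in this thread.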

sarahjeeeze commented 1 month ago

Closing due to lack of response.