CDCgov / phoenix

🔥🐦🔥PHoeNIx: A short-read pipeline for healthcare-associated and antimicrobial resistant pathogens
Apache License 2.0

Running on Terra fails with a pigz not found error #125

Closed cvaske-clear closed 12 months ago

cvaske-clear commented 12 months ago

Describe the bug

When running the PHoeNIx v2 WDL workflow on Terra, execution does not complete, seemingly because pigz is missing.

Impact

This prevents the workflow from running.

To Reproduce

  1. Upload this kraken2 database into the workspace files
  2. Copy the URI of the uploaded kraken2 database into a workspace variable
  3. Import this Dockstore workflow into Terra: https://dockstore.org/workflows/github.com/CDCgov/phoenix/phoenix:v2.0.1
  4. Launch a new workflow from FASTQ files, with the workflow set to CDC_PHOENIX

Expected behavior

Successful completion of the workflow.

Screenshots

The relevant portion of the Nextflow log appears to be here:


Caused by:
  Process `CDC_PHOENIX:PHOENIX_EXQC:ASSET_CHECK (1)` terminated with an error exit status (127)

Command executed:

  if [[ REFSEQ_20230504_Bacteria_complete.msh.gz = *.gz ]]
  then
      pigz -vdf REFSEQ_20230504_Bacteria_complete.msh.gz
  else
      :
  fi

  if [[ mlst_db.tar.gz = *.tar.gz ]]
  then
      tar --use-compress-program="pigz -vdf" -xf mlst_db.tar.gz
  else
      :
  fi

  if [[ k2_standard_08gb_20230605.tar.gz = *.tar.gz ]]
  then
      folder_name=$(basename k2_standard_08gb_20230605.tar.gz .tar.gz)
      tar --use-compress-program="pigz -vdf" -xf k2_standard_08gb_20230605.tar.gz
      mkdir ${folder_name}_folder
      mv *.kmer_distrib ${folder_name}_folder
      mv *.k2d ${folder_name}_folder
      mv seqid2taxid.map ${folder_name}_folder
      mv inspect.txt ${folder_name}_folder
      mv ktaxonomy.tsv ${folder_name}_folder
  else
      folder_name=$(basename k2_standard_08gb_20230605.tar.gz .tar.gz)
      mv ${folder_name} ${folder_name}_folder
  fi

  cat <<-END_VERSIONS > versions.yml
  "CDC_PHOENIX:PHOENIX_EXQC:ASSET_CHECK":
      phoenix_base_container: base_v2.0.0
  END_VERSIONS

Command exit status:
  127

Command output:
  (empty)

Command error:
  .command.sh: line 4: pigz: command not found
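
For context, exit status 127 is what a POSIX shell reports when a command cannot be found on the PATH, which matches the `pigz: command not found` message above. A minimal sketch that reproduces the same status, assuming a made-up command name (`definitely-not-installed-tool-xyz` is hypothetical and is not part of the pipeline):

```shell
#!/usr/bin/env bash
# Hypothetical command name, chosen so that it should not exist on any PATH.
missing_cmd="definitely-not-installed-tool-xyz"

# Run it in a subshell and discard the "command not found" message.
( "$missing_cmd" --version ) 2>/dev/null
status=$?

# The shell sets the status to 127 when the command lookup fails.
echo "exit status: $status"
```

The same lookup failure is what Nextflow surfaces as "terminated with an error exit status (127)" for the ASSET_CHECK process.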

Logs

Phoenix.log attached: phoenix.log.txt

Additional context

It appears that this version of the pipeline calls the Docker image quay.io/jvhagey/phoenix:2.0.0, which has no pigz executable. The image quay.io/jvhagey/phoenix:2.0.2, however, does have pigz. Some WDL workflows allow the Docker image to be selected through an input string, but I don't believe PHoeNIx does. If it did, I could probably test the fix a bit more easily!
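
As an illustration of one way a script could tolerate a container without pigz, here is a rough sketch that falls back to gzip when pigz is not on the PATH. This is not how PHoeNIx handles it (the fix there was an updated container); `pick_decompressor` and `decompress_gz` are hypothetical helper names introduced only for this example.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical and not part of the pipeline.

# Print "pigz" when it is on PATH, otherwise fall back to "gzip".
pick_decompressor() {
    if command -v pigz >/dev/null 2>&1; then
        echo "pigz"
    else
        echo "gzip"
    fi
}

# Decompress a .gz file in place with whichever tool is available.
# Both pigz and gzip accept -d (decompress) and -f (force overwrite).
decompress_gz() {
    "$(pick_decompressor)" -df "$1"
}
```

For example, `decompress_gz REFSEQ_20230504_Bacteria_complete.msh.gz` would use pigz in the 2.0.2 image and gzip in an image that lacks pigz, assuming gzip itself is installed there.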

jvhagey commented 12 months ago

Hi @cvaske-clear, is there a reason you need to use v2.0.1? v2.0.2 includes the fixed container, and I suggest using the latest version.

cvaske-clear commented 12 months ago

Honestly, I didn't see 2.0.2 when selecting the version. I will use that!