broadinstitute / viral-pipelines

viral-ngs: complete pipelines

scaffolding regression fixes plus docker updates #547

Closed dpark01 closed 1 month ago

dpark01 commented 1 month ago

This PR:

  1. fixes a bug #543 introduced recently that causes different/unintended failure modes in assemble_denovo on example exercises. The scaffold task now falls back to the old brute-force reference selection if ANI-based reference selection fails to find any matches at all. This works fine (same as before) in most historically normal use cases, but will not behave well if given a very large array of reference genomes to choose from. (This change does not affect the behavior of scaffold_and_refine_multitaxa, which does not pass multiple references to the scaffold task anyway.)
  2. Updates viral-core and viral-assemble docker images to latest (2.3.2 and 2.3.2.0)
  3. Updates cromwell, womtool, and dxCompiler versions for build
  4. minor cleanups to docbuild
  5. drops any direct use of util.file.zstd_open in favor of zstandard.open
  6. increase default RAM for reports.alignment_metrics
  7. fixes bed file sorting for samtools ampliconstats in the case of multi-segment targets; also allows samtools ampliconstats to fail silently when it is being too picky about the bed file
  8. change the mafft workflow to use multi_align_mafft_ref instead of multi_align_mafft
  9. make the terra_tsv_to_table workflow resilient to non-existent TSVs in its input array, to simplify running it on a Terra table full of tsvs that may or may not exist
  10. Some initial WiP code for multi-species genbank prep, not far along yet.
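
For readers unfamiliar with the fallback described in item 1, here is a minimal Python sketch of the control flow only. It is not the actual viral-assemble code: `select_reference`, `ani_select`, and `brute_force_select` are hypothetical names, and the two selectors are passed in as plain callables so the example stays self-contained.

```python
from typing import Callable, Optional, Sequence, Tuple


def select_reference(
    contigs: str,
    references: Sequence[str],
    ani_select: Callable[[str, Sequence[str]], Optional[str]],
    brute_force_select: Callable[[str, Sequence[str]], str],
) -> Tuple[str, str]:
    """Return (chosen_reference, method_used).

    Hypothetical illustration: try ANI-based selection first and fall
    back to brute force only when ANI finds no matches at all.
    """
    if not references:
        raise ValueError("at least one reference is required")

    # Assumption: the ANI selector signals "no matches" by returning None.
    best = ani_select(contigs, references)
    if best is not None:
        return best, "ani"

    # Fallback: evaluate every reference. The cost grows with the number
    # of references, hence the caveat about very large reference arrays.
    return brute_force_select(contigs, references), "brute_force"


if __name__ == "__main__":
    # Toy stand-ins: ANI finds nothing, brute force picks the first reference.
    chosen, method = select_reference(
        "contigs.fasta",
        ["ref_a.fasta", "ref_b.fasta"],
        ani_select=lambda contigs, refs: None,
        brute_force_select=lambda contigs, refs: refs[0],
    )
    print(chosen, method)
```

Passing the selectors in as arguments is only a convenience for the sketch; it makes the fallback branch easy to exercise without real assemblies or reference genomes.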

tomkinsc commented 1 month ago

The merge of this into master triggered a deployment to DNAnexus, which is currently failing to compile via dxWDL: https://github.com/broadinstitute/viral-pipelines/actions/runs/10237505135/job/28320935779

The issue seems to be with tasks_megablast.wdl. The checks in this PR didn't catch it because deployment to dx is currently triggered only after a merge into master. I'm looking into it and may change the GitHub Actions behavior so that we actually test deployment to dx in PRs.