theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

[Augur_PHB] Update ncov repo commit and remove reference input to augur clades task #330

Closed sage-wright closed 7 months ago

sage-wright commented 8 months ago

This PR closes #234!

🗑️ This dev branch should be deleted after merging to main.

🧠 Aim, Context and Functionality

🛠️ Impacted Workflows/Tasks & Changes Being Made

This will affect the behavior of the workflow(s) even if users don’t change any workflow inputs relative to the last version: Yes

Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing: No

📋 Workflow/Task Step Changes

🔄 Data Processing

Docker/software or software versions changed:

Databases or database versions changed:

In the set_sc2_defaults task, nextstrain_ncov_repo_commit was updated from "23d1243127e8838a61b7e5c1a72bc419bf8c5a0d" to "cec4fa0ecd8612e4363d40662060a5a9c712d67e". The two commits are about 11 months apart.

Data processing/commands changed:

In task_augur_clades.wdl, the augur_clades task previously passed --references "{reference_fasta}" to the augur clades command, but augur clades does not implement that option, which led to a warning. The reference input has been removed from the task.
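As a rough illustration of the change above, the sketch below assembles an augur clades invocation that no longer carries a reference argument. It is a Python model, not the contents of task_augur_clades.wdl: the function name and the file names are hypothetical, and the flag set (--tree, --mutations, --clades, --output-node-data) is assumed from the augur CLI rather than copied from the task.

```python
from typing import List


def build_augur_clades_cmd(
    tree: str,
    nt_muts_json: str,
    aa_muts_json: str,
    clades_tsv: str,
    output_json: str,
) -> List[str]:
    """Assemble an `augur clades` command line with no reference argument.

    All parameter names are hypothetical; the real task may name or order
    its inputs differently.
    """
    return [
        "augur", "clades",
        "--tree", tree,
        # --mutations accepts several node-data files (nucleotide and amino acid).
        "--mutations", nt_muts_json, aa_muts_json,
        "--clades", clades_tsv,
        "--output-node-data", output_json,
    ]


# Hypothetical file names, for illustration only.
cmd = build_augur_clades_cmd(
    "refined_tree.nwk", "nt_muts.json", "aa_muts.json", "clades.tsv", "clades.json"
)
print(" ".join(cmd))
```

Building the command as a list keeps each argument separate, so it is easy to confirm that no --references entry remains.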

File processing changed:

Compute resources changed:

➡️ Inputs

The default value of the optional input nextstrain_ncov_repo_commit was updated to "cec4fa0ecd8612e4363d40662060a5a9c712d67e". This affects all users who leave the input blank.
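The effect of leaving the input blank can be modeled with a short sketch. This is a Python approximation of the fallback behavior, not the WDL in set_sc2_defaults: the function name is hypothetical, and only the commit hashes come from this PR.

```python
from typing import Optional

# New default commit, as stated in this PR.
DEFAULT_NCOV_REPO_COMMIT = "cec4fa0ecd8612e4363d40662060a5a9c712d67e"


def resolve_ncov_repo_commit(user_value: Optional[str] = None) -> str:
    """Return the user-supplied commit, or the default when the input is blank.

    Hypothetical helper: it treats None and whitespace-only strings as blank.
    """
    if user_value is None or not user_value.strip():
        return DEFAULT_NCOV_REPO_COMMIT
    return user_value.strip()


# Blank input picks up the new default.
print(resolve_ncov_repo_commit())
# An explicit value (here the previous default) is kept as supplied.
print(resolve_ncov_repo_commit("23d1243127e8838a61b7e5c1a72bc419bf8c5a0d"))
```

Under this model, users who want the previous behavior would need to supply the old commit explicitly.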

⬅️ Outputs

The clades_tsv file (an intermediate file that is not sent to the Terra table) will now contain clades, and the auspice_input_json file can now be colored by clade.

🧪 Testing

Test Dataset

We tested on 7 SARS-CoV-2 Omicron samples from different Omicron clades.

Command-line Testing with MiniWDL or Cromwell (optional)

Command-line testing did not occur.

Terra Testing

This job demonstrates the correct output behavior from augur clades.

Suggested Scenarios for Reviewer to Test

Test more SC2 samples. Flu and mpox should also be tested.

Theiagen Version Release Testing (optional)

In the release, this PR will require:

🔬 Final Developer Checklist

🎯 Reviewer Checklist

🗂️ Associated Documentation (to be completed by Theiagen developer)

kapsakcj commented 8 months ago

Reminder to myself: notify one laboratory (PR) that might be using this when PR is merged & dev branch is deleted

jrotieno commented 7 months ago

@sage-wright getting an error:

Error updating config
Failed to process workflow definition 'augur' (reason 1 of 1): Failed to process 'call clades_task.augur_clades' (reason 1 of 1): The call supplied a value 'reference_fasta' that doesn't exist in the task (or sub-workflow)

This seems to be caused by the reference_fasta input being commented out in the clades_task. Should that change be reverted?

sage-wright commented 7 months ago

No, it shouldn't be reverted, since the input was never used. Instead, we should remove the reference_fasta argument from the call in the workflow.
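For illustration, the check behind the error reported above can be sketched in a few lines. This is a simplified Python model, not the actual Cromwell/Terra validation code; apart from reference_fasta, which comes from the error message, all input names are hypothetical.

```python
from typing import Iterable, List


def unknown_call_inputs(declared: Iterable[str], supplied: Iterable[str]) -> List[str]:
    """Return the inputs a call supplies that the task does not declare."""
    declared_set = set(declared)
    return sorted(name for name in supplied if name not in declared_set)


# Hypothetical task inputs after reference_fasta was removed from the task.
task_inputs = ["refined_tree", "nt_muts_json", "aa_muts_json", "clades_tsv"]

# The workflow call still passes reference_fasta, which triggers the error.
call_before_fix = task_inputs + ["reference_fasta"]
# Dropping the argument from the call resolves the mismatch.
call_after_fix = list(task_inputs)

print(unknown_call_inputs(task_inputs, call_before_fix))  # ['reference_fasta']
print(unknown_call_inputs(task_inputs, call_after_fix))   # []
```

The mismatch can be resolved on either side, but since the input was unused, removing it from the call keeps the task and the workflow consistent.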

sage-wright commented 7 months ago

@jrotieno, I fixed it!