theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

update to latest nextclade_dataset release for SC2 `2024-04-15--15-08-22Z` #414

Status: Closed (kapsakcj closed this 2 months ago)

kapsakcj commented 2 months ago

Will update this PR message later.

This PR closes #.

🗑️ This dev branch should be deleted after merging to main.

:brain: Aim, Context and Functionality

Update to the latest nextclade_dataset tag, which was released for SARS-CoV-2 today: https://github.com/nextstrain/nextclade_data/releases/tag/2024-04-15--15-08-22Z

:hammer_and_wrench: Impacted Workflows/Tasks & Changes Being Made

This will impact all workflows that run nextclade on SARS-CoV-2 samples (that is, all TheiaCoV workflows).

This will affect the behavior of the workflow(s) even if users don’t change any workflow inputs relative to the last version: Yes

Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing: No

:clipboard: Workflow/Task Step Changes

🔄 Data Processing

Docker/software or software versions changed: N/A

Databases or database versions changed: `String sc2_nextclade_ds_tag = "2024-02-16--04-00-32Z"` updated to ➡️ `String sc2_nextclade_ds_tag = "2024-04-15--15-08-22Z"`
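A quick sanity check on the tag bump can be sketched in Python. This is only an illustration, not part of the workflow: it assumes the tag is a timestamp laid out as `YYYY-MM-DD--HH-MM-SSZ` and that the trailing `Z` means UTC. The helper names `parse_dataset_tag` and `is_newer` are hypothetical.

```python
from datetime import datetime, timezone

# Timestamp layout of the tags quoted in this PR, e.g. "2024-04-15--15-08-22Z".
# Assumption: the trailing "Z" denotes UTC.
TAG_FORMAT = "%Y-%m-%d--%H-%M-%SZ"


def parse_dataset_tag(tag: str) -> datetime:
    """Parse a nextclade dataset tag into a timezone-aware UTC datetime."""
    return datetime.strptime(tag, TAG_FORMAT).replace(tzinfo=timezone.utc)


def is_newer(new_tag: str, old_tag: str) -> bool:
    """Return True if new_tag is strictly later than old_tag."""
    return parse_dataset_tag(new_tag) > parse_dataset_tag(old_tag)


old_tag = "2024-02-16--04-00-32Z"  # value before this PR
new_tag = "2024-04-15--15-08-22Z"  # value after this PR

print(is_newer(new_tag, old_tag))  # prints True
```

A malformed tag raises `ValueError` from `strptime`, which would also catch a typo in the new value before it reaches the workflow.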

Data processing/commands changed: N/A

File processing changed: N/A

Compute resources changed: N/A

➡️ Inputs

N/A

⬅️ Outputs

This will change nextclade results, specifically the nextclade_clade and nextclade_pango_lineage outputs, because the updated dataset includes newer pango lineages and nextclade clades.
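For reviewers who want to see which calls moved, a minimal Python sketch follows. It assumes the results from runs with the old and new tags have been exported as tab-separated tables with a `sample` column plus the two output columns named above. That table layout, the helper names, and every value in the example data are placeholders, not real results.

```python
import csv
import io

# Output columns named in this PR that the dataset update can change.
COLUMNS = ("nextclade_clade", "nextclade_pango_lineage")


def load_results(tsv_text: str) -> dict[str, dict[str, str]]:
    """Index rows of a tab-separated results table by the (assumed) 'sample' column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["sample"]: row for row in reader}


def changed_calls(before, after):
    """List (sample, column, old, new) for every call that differs between runs."""
    changes = []
    for sample in sorted(before.keys() & after.keys()):
        for column in COLUMNS:
            old, new = before[sample][column], after[sample][column]
            if old != new:
                changes.append((sample, column, old, new))
    return changes


# Placeholder data only; these are not real clade or lineage assignments.
before_tsv = (
    "sample\tnextclade_clade\tnextclade_pango_lineage\n"
    "s1\tclade_A\tlineage_X\n"
    "s2\tclade_B\tlineage_Y\n"
)
after_tsv = (
    "sample\tnextclade_clade\tnextclade_pango_lineage\n"
    "s1\tclade_A\tlineage_X1\n"
    "s2\tclade_B\tlineage_Y\n"
)

for change in changed_calls(load_results(before_tsv), load_results(after_tsv)):
    print(change)
```

Only samples present in both tables are compared, so a sample that failed in one run is skipped silently rather than reported as a change.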

:test_tube: Testing

Test Dataset

Early Omicron SARS-CoV-2 samples pulled from GISAID.

Commandline Testing with MiniWDL or Cromwell (optional)

Did not test locally.

Terra Testing

SARS-CoV-2 samples from GISAID were run through theiacov_fasta: https://app.terra.bio/#workspaces/theiagen-validations/curtis-sandbox-theiagen-validations/job_history/404343fc-2f76-4502-96ae-84f62ee96a9e

Suggested Scenarios for Reviewer to Test

Run SARS-CoV-2 samples through the nextclade v3 task. Any workflow (FASTA, ILMN_PE, ILMN_SE, etc.) that uses the organism_parameters workflow to set the nextclade dataset tag for SARS-CoV-2 will do.

Theiagen Version Release Testing (optional)

:microscope: Final Developer Checklist

🎯 Reviewer Checklist

🗂️ Associated Documentation (to be completed by Theiagen developer)

kapsakcj commented 2 months ago

Testing via theiacov_fasta ran as expected with this new nextclade_dataset tag.

✅ Successful workflow here: https://app.terra.bio/#workspaces/theiagen-validations/curtis-sandbox-theiagen-validations/job_history/404343fc-2f76-4502-96ae-84f62ee96a9e