jrotieno closed this pull request 6 months ago.
In TheiaCoV the `nextclade_dataset_name` refers to the Nextclade organism dataset, which is a controlled vocabulary.
I'm afraid that if we name this input `dataset_name` it's going to cause confusion if one is used to running TheiaCoV, which seems likely.
I don't have a better name for the input `organism`, so I'm going to wait for the next sync with the team to discuss this a bit further. :) Sorry @jrotieno! Right now I don't have a better alternative.
Code changes look solid.
Good point, the simple solution is to call it `nextclade_dataset_name` for uniformity. Let me do that.
Testing 10 Flu HA samples in https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Mendes_Sandbox/job_history/945bfc27-c46a-4d6d-858a-429947399cc5
This PR closes #<302>.
This dev branch should be deleted after merging to main.
:brain: Aim, Context and Functionality
This change avoids confusion between Theiagen's `organism` input for the various workflows and the input that was named `organism` in the `Samples_To_Ref_Tree_PHB` workflow but is more accurately the `nextclade_dataset_name`. For example, `flu` is accepted as the `organism` input for the `TheiaCoV` workflows, but the value expected for the `organism` input in `Samples_To_Ref_Tree_PHB` is `flu_yam_ha`, `flu_h1n1pdm_ha`, etc., depending on the subtype and segment.
:hammer_and_wrench: Impacted Workflows/Tasks & Changes Being Made
This will affect the behavior of the workflow(s) even if users don't change any workflow inputs relative to the last version: No
Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing: No
:clipboard: Workflow/Task Step Changes
Data Processing
Docker/software or software versions changed: No
Databases or database versions changed: No
Data processing/commands changed: No
File processing changed: No
Compute resources changed: No
➡️ Inputs
Input name changed from `organism` to `nextclade_dataset_name`.
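For anyone with saved inputs files, the rename above could be applied with a small helper like the sketch below. This is only an illustration, not part of the PR: the function name `rename_input` and the workflow prefix `samples_to_ref_tree` in the usage example are hypothetical, and the sketch assumes inputs are stored as a flat JSON-style mapping with dotted keys.

```python
def rename_input(inputs, old="organism", new="nextclade_dataset_name"):
    """Return a copy of `inputs` where keys ending in `.old` (or equal to `old`) use `new`.

    Assumption: keys are dotted paths such as "<workflow>.<input>".
    Note that every key whose last component is `old` is renamed, so check the
    result if other workflows in the same file still use `organism`.
    """
    renamed = {}
    for key, value in inputs.items():
        prefix, sep, leaf = key.rpartition(".")
        if leaf == old:
            key = f"{prefix}{sep}{new}"
        renamed[key] = value
    return renamed


# Hypothetical saved inputs; the "samples_to_ref_tree" prefix is made up.
old_inputs = {
    "samples_to_ref_tree.organism": "flu_yam_ha",
    "samples_to_ref_tree.other_input": "unchanged",
}
new_inputs = rename_input(old_inputs)
```

The value itself does not change; only the input name does.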
⬅️ Outputs
No Changes
:test_tube: Testing
Test Dataset
A multifasta dataset comprising Nextclade pathogens/datasets, i.e. seasonal Influenza A H1N1 and H3N2, Influenza B Yamagata and Victoria, RSV-A and RSV-B, MPXV and SARS-CoV-2.
Commandline Testing with MiniWDL or Cromwell (optional)
Not undertaken/not necessary.
Terra Testing
All samples successful as expected. https://app.terra.bio/#workspaces/cdph-terrabio-taborda-manual/Global_tree_testing/job_history/7c66ad72-086f-4f72-b825-3cc6e71527d1
Suggested Scenarios for Reviewer to Test
Test with an inaccurate `nextclade_dataset_name`.
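To make the scenario concrete: a value such as `flu` is valid for `organism` in TheiaCoV but is not a Nextclade dataset name. The sketch below shows the kind of check a reviewer might have in mind. It is an assumption-laden illustration, not how the workflow validates input: the allow-list holds only the two names mentioned in this PR (the real vocabulary is defined by Nextclade and is larger), and `check_dataset_name` is a hypothetical helper.

```python
# Illustrative subset only: the two dataset names mentioned in this PR.
EXAMPLE_DATASET_NAMES = frozenset({"flu_yam_ha", "flu_h1n1pdm_ha"})


def check_dataset_name(name, known=EXAMPLE_DATASET_NAMES):
    """Return `name` if it is in `known`, otherwise raise ValueError."""
    if name not in known:
        raise ValueError(
            f"{name!r} is not a known Nextclade dataset name; "
            f"expected one of {sorted(known)}"
        )
    return name


check_dataset_name("flu_yam_ha")  # accepted
# check_dataset_name("flu") would raise ValueError: "flu" is an organism, not a dataset name
```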
Theiagen Version Release Testing (optional)
:microscope: Final Developer Checklist
🎯 Reviewer Checklist
Associated Documentation (to be completed by Theiagen developer)