Open bsweger opened 4 days ago
I believe the output we want here is Nextstrain metadata_version information, which looks like this:
{
  "schema_version": "v1",
  "nextclade_version": "nextclade 3.8.2",
  "nextclade_dataset_name": "SARS-CoV-2",
  "nextclade_dataset_version": "2024-07-17--12-57-03Z",
  "nextclade_tsv_sha256sum": "5dd39f2291c890d5adbcc18c513016bfd0dab8d7bffc581b7e4c0bdf4306bb89",
  "metadata_tsv_sha256sum": "55f6768787193f309fb08f47270143fca7f126bc6fc38db910ad7d75ffeeba01"
}
The nextclade_dataset_version is what we need to request the corresponding reference tree.
When creating the clade list, it's likely sufficient to capture the current metadata_version.json.
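A minimal sketch of what capturing that file alongside the clade list might look like. The function names (fetch_metadata_version, parse_metadata_version, build_clade_record), the output layout, and the choice to pass the URL in as an argument are all assumptions for illustration, not the existing code in src/get_clades_to_model.py:

```python
import json
from pathlib import Path
from urllib.request import urlopen

# Keys needed to look up the reference tree later (taken from the example above).
REQUIRED_KEYS = ("nextclade_dataset_name", "nextclade_dataset_version")


def fetch_metadata_version(url: str, timeout: float = 30.0) -> str:
    """Download metadata_version.json as text. The URL is supplied by the caller."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metadata_version(raw: str) -> dict:
    """Parse the JSON text and fail loudly if the reproducibility keys are absent."""
    meta = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in meta]
    if missing:
        raise ValueError(f"metadata_version.json is missing keys: {missing}")
    return meta


def build_clade_record(clades: list[str], meta: dict) -> dict:
    """Bundle the clade list with the metadata that was current when it was built."""
    return {"clades": sorted(clades), "meta": meta}


def save_clade_record(record: dict, out_path: Path) -> None:
    """Write the record as JSON (hypothetical output format)."""
    out_path.write_text(json.dumps(record, indent=2))
```

Keeping the whole metadata_version.json payload, rather than only nextclade_dataset_version, also preserves the sha256 sums, which could help confirm later which input files were in use.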
Background
When generating and saving a list of clades to model, we should also get and save information that would be required to get the reference tree in use at the time (for reproducibility).
Definition of done
src/get_clades_to_model.py requests the information required for reproducibility (see related issue)