Background
When working with clade assignments in genome metadata files provided by Nextstrain (for example), we also need to save the information that would later allow someone to retrieve the reference tree that was in effect at the same point in time.
I.e., what reference tree was used by the ncov pipeline that created the metadata.tsv file used to determine which clades to forecast in a modeling round?
For example, the upcoming variant nowcast hub will use Nextstrain's latest sars-cov-2 metadata to generate a list of clades to model for each round. When scoring a round, we'll need the reference tree that was used to create the round's list of clades to model.
Definition of done
virus-clade-utils has a new function that returns the latest Nextstrain metadata_version.json
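A minimal sketch of what such a function could look like, not the virus-clade-utils implementation. The function name `get_latest_metadata_version`, the `fetch` parameter, and the URL constant are all hypothetical; in particular, the location of metadata_version.json is an assumption that should be checked against Nextstrain's published file listing. The sketch returns the parsed JSON as-is and makes no claims about which keys the file contains.

```python
import json
from typing import Callable
from urllib.request import urlopen

# ASSUMPTION: location of the latest metadata_version.json for the open
# ncov dataset. Verify this URL before relying on it.
METADATA_VERSION_URL = (
    "https://data.nextstrain.org/files/ncov/open/metadata_version.json"
)


def _http_get(url: str) -> bytes:
    """Download the raw bytes at `url` using only the standard library."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def get_latest_metadata_version(
    url: str = METADATA_VERSION_URL,
    fetch: Callable[[str], bytes] = _http_get,
) -> dict:
    """Return the parsed contents of the latest metadata_version.json.

    `fetch` is injectable so the function can be exercised without
    network access (hypothetical design choice for this sketch).
    """
    payload = json.loads(fetch(url))
    if not isinstance(payload, dict):
        raise ValueError("expected metadata_version.json to hold a JSON object")
    return payload
```

Saving the returned dictionary alongside each round's list of clades would give a scorer the information needed to look up the matching reference tree later, assuming the file records the relevant dataset version.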