Closed jameshadfield closed 2 months ago
@lmoncla @trvrb the full-genome build is working well and is up at https://nextstrain.org/avian-flu/h5n1-cattle-outbreak/genome
Note that fauna only has ~29 strains of this clade, so you have to use a different data source for this build. It's also a little tricky to run on AWS because it needs input files from the main Snakemake build, but it runs in about 5 min on a laptop, so I suggest running everything locally for now.
One thing we have to be cognisant of is keeping the list of included strains up to date as new sequences arrive. That obviously won't scale, but we can iterate on this. Perhaps a Hamming-distance approach to including new sequences would work here, but I haven't thought about it much.
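
To make the Hamming-distance idea a bit more concrete, here is a rough sketch of one way it could work. It is not what the build does. It assumes sequences are already aligned to the same length, treats `N` and `-` as uninformative, and pulls in any new strain within some threshold of a strain already on the include list. The function names, the threshold, and the strain names are all made up for illustration.

```python
from typing import Dict, List


def hamming_distance(a: str, b: str, ignore: str = "N-") -> int:
    """Count mismatches between two aligned sequences, skipping ambiguous sites."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in ignore and y not in ignore
    )


def select_new_strains(
    included: Dict[str, str],
    candidates: Dict[str, str],
    max_distance: int,
) -> List[str]:
    """Return candidate strain names within max_distance of any included strain."""
    selected = []
    for name, seq in candidates.items():
        if name in included:
            continue  # already on the include list
        if any(
            hamming_distance(seq, ref) <= max_distance
            for ref in included.values()
        ):
            selected.append(name)
    return selected


# Toy data: hypothetical strain names and very short "genomes".
included = {"A/cattle/1": "ACGTACGT"}
candidates = {
    "A/cattle/2": "ACGTACGA",  # 1 mismatch
    "A/duck/9": "TTTTACGA",    # 4 mismatches
    "A/cattle/3": "ACNTACGT",  # 0 mismatches once N is ignored
}
new_strains = select_new_strains(included, candidates, max_distance=2)
print(new_strains)
```

This is an all-pairs comparison, so it would only be reasonable while the include list stays small. A sensible threshold for full genomes would need to be worked out against real data.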
This is working well for me. I like the general strategy here as well. I'm going to go ahead and merge this. Thank you @jameshadfield!
~Main TODO is to join the genbank files & translate~
~Tree available at https://nextstrain.org/staging/avian-flu/h5n1/selected-strains-genome~