nextstrain / zika

Nextstrain build for Zika virus
https://nextstrain.org/zika

Use a consistent "country" name across USVI samples (both submitted to GenBank and not-yet) #40

Closed (j23414 closed 4 months ago)

j23414 commented 4 months ago

Current Behavior

From slack:

the 'country' of those 26 is "Usvi" whereas the 5 samples originating from GenBank (e.g. accession "MW165884") have country of "Virgin Islands"

Can also see it here on the live site:

Expected behavior

The country name should be consistent across all USVI samples, both those already submitted to GenBank and those not yet submitted.

How to reproduce

Possible solution

From slack:

I think there are two options, either of which is better than the current approach which uses a mixture:

  1. Change genbank to "USVI" during curate. From memory this was what Alli always used and so I think this is the intended country name. (It should be all-caps, but this isn't a dealbreaker.)
  2. Change the spiked-in sequences to "Virgin Islands" to match GenBank. We'll need to add lat/longs for this as it's not in Augur's defaults and thus doesn't show in Auspice.
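Either option amounts to mapping both spellings onto one canonical value before the records are merged. A minimal sketch of that idea, assuming the metadata is available as a list of dicts with a `country` field; the function name, the alias table, and the sample records are hypothetical and not taken from the Zika workflow (only the two spellings and the accession come from this issue):

```python
# Spellings reported in this issue, lower-cased for comparison.
# The alias table and function below are illustrative assumptions,
# not part of the actual curate step.
USVI_ALIASES = {"usvi", "virgin islands"}


def normalize_country(records, canonical="USVI"):
    """Return copies of `records` with any USVI alias replaced by `canonical`."""
    normalized = []
    for record in records:
        country = record.get("country", "")
        if country.strip().lower() in USVI_ALIASES:
            # Copy the record so the caller's input is left unchanged.
            record = {**record, "country": canonical}
        normalized.append(record)
    return normalized


# Hypothetical records: one spiked-in sample, one GenBank sample, one unrelated.
records = [
    {"accession": "not_yet_submitted_1", "country": "Usvi"},
    {"accession": "MW165884", "country": "Virgin Islands"},
    {"accession": "unrelated_sample", "country": "Brazil"},
]

option_1 = normalize_country(records, canonical="USVI")
option_2 = normalize_country(records, canonical="Virgin Islands")
```

Passing `canonical="USVI"` corresponds to option 1 and `canonical="Virgin Islands"` to option 2. With option 2, the lat/longs mentioned above would still have to be added separately.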