nextstrain / mpox

Nextstrain build for mpox virus
https://nextstrain.org/mpox
MIT License

ingest: include `url` field #76

Open joverlee521 opened 2 years ago

joverlee521 commented 2 years ago

Context

See https://github.com/nextstrain/monkeypox/pull/72#issuecomment-1162442758

Possible solution

GenBank URLs can be added as https://www.ncbi.nlm.nih.gov/nuccore/<genbank_accession>. URLs for arbitrary non-GenBank sequences will have to be added through manual annotations.
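
As a rough illustration of that idea, here is a minimal Python sketch. It assumes each ingest record is a dict with hypothetical `strain` and `genbank_accession` keys, and that manual annotations arrive as a hypothetical `manual_urls` mapping from strain name to URL. None of these names come from the actual ingest pipeline.

```python
from typing import Dict, Optional

# URL pattern quoted in the issue text above.
GENBANK_URL_PREFIX = "https://www.ncbi.nlm.nih.gov/nuccore/"


def with_url(
    record: Dict[str, str],
    manual_urls: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of `record` with a `url` field filled in where possible.

    `strain`, `genbank_accession`, and `manual_urls` are hypothetical names
    used only for this sketch.
    """
    manual_urls = manual_urls or {}
    result = dict(record)
    accession = record.get("genbank_accession", "")
    strain = record.get("strain", "")

    if strain in manual_urls:
        # A manual annotation takes precedence (needed for non-GenBank sequences).
        result["url"] = manual_urls[strain]
    elif accession:
        # Derive the URL directly from the GenBank accession.
        result["url"] = GENBANK_URL_PREFIX + accession
    else:
        # Nothing to link to; leave the field empty.
        result["url"] = ""
    return result
```

Letting the manual annotation win over the derived URL is a design choice for this sketch; the real pipeline may order these differently.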

jameshadfield commented 2 years ago

The logic here is a bit confusing (I'm documenting this now).

If "accession": "AY741551" is switched to "genbank_accession": {"value": "AY741551"} then auspice will automagically add the GenBank URL 😉 This is the best approach.

Otherwise we can switch to the following:

node_attrs: {
  accession: "AY741551",
  url: "https://www.ncbi.nlm.nih.gov/nuccore/AY741551"
}

Note that if we choose the latter, then neither "genbank_accession" nor "gisaid_epi_isl" can appear as a node_attr. However, different nodes can use different approaches, which will be helpful if we have non-GenBank strains.
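
To make the two per-node options concrete, here is a small Python sketch. The helper names `make_node_attrs` and `follows_stated_rule` are hypothetical, and the check only encodes the rule as described in this comment; it is not taken from Auspice's source.

```python
from typing import Any, Dict, Optional

# Keys that, per the comment above, must not appear alongside the
# accession/url combination on the same node.
EXCLUDED_WITH_URL = ("genbank_accession", "gisaid_epi_isl")


def make_node_attrs(accession: str, url: Optional[str] = None) -> Dict[str, Any]:
    """Build node_attrs for one node (hypothetical helper).

    Without a URL, use the `genbank_accession` form and let Auspice
    derive the GenBank link. With a URL, use the accession/url pair.
    """
    if url is None:
        return {"genbank_accession": {"value": accession}}
    return {"accession": accession, "url": url}


def follows_stated_rule(node_attrs: Dict[str, Any]) -> bool:
    """Check that a node using `url` carries neither excluded key."""
    if "url" in node_attrs:
        return not any(key in node_attrs for key in EXCLUDED_WITH_URL)
    return True


# Different nodes can take different approaches.
genbank_node = make_node_attrs("AY741551")
other_node = make_node_attrs("X1", "https://example.org/X1")  # placeholder values
```

Because the choice is made per node, a tree can mix GenBank strains using the first form with non-GenBank strains using the second.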

joverlee521 commented 2 years ago

Yup, I was going through this logic in Auspice and figured the accession/url combo would be best for monkeypox to support non-GenBank sequences.