nextstrain / WNV

the repository used to build West Nile Virus for nextstrain
https://nextstrain.org/WNV/NA

Investigate a reasonable root for a global build #26

Open j23414 opened 3 days ago

j23414 commented 3 days ago

Context

Proximately related to https://github.com/nextstrain/WNV/issues/20

Example of a global tree using midpoint rooting (needs to be fixed): https://next.nextstrain.org/staging/WNV/global

DOH-LMT2303 commented 3 hours ago

Rooting of US builds

The WNV Nextstrain build for "Twenty years of WNV in the Americas" is rooted on the 1998 Israel sequence (AF481864). This is most likely because the virus first detected in the Americas (New York outbreak, 1999) was most closely related to the Israel 1998 isolate. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6822705/

Thus, the Israel sequence could serve as a root for a WA or other US build. However, it might not be the most appropriate root for a global build. WNV is believed to have originated in Africa, with its first discovery in Uganda in 1937 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10772404/).

In GenBank, the earliest available sequence is dated 1931 from Illinois, which is likely a date error: WNV was not discovered until 1937 and was not detected in the US until 1999. The next earliest available sequence is from 1953, from Israel.
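A minimal sketch of that date sanity check, in case it is useful when screening candidate root sequences. It only encodes the two years mentioned in this thread (1937 first discovery, 1999 first US detection). The `Record` type, the function names, and the `EXAMPLE_*` accessions are hypothetical placeholders, not real GenBank IDs or part of the WNV workflow.

```python
from dataclasses import dataclass
from typing import Optional

# Years taken from the discussion in this thread.
FIRST_DISCOVERY_YEAR = 1937                  # WNV first discovered (Uganda)
FIRST_DETECTION_BY_COUNTRY = {"USA": 1999}   # first US detection (New York outbreak)


@dataclass
class Record:
    accession: str  # placeholder IDs are used in the example below
    country: str
    year: int


def date_problem(record: Record) -> Optional[str]:
    """Return a reason if the collection year looks implausible, else None."""
    if record.year < FIRST_DISCOVERY_YEAR:
        return f"{record.year} predates the first discovery of WNV ({FIRST_DISCOVERY_YEAR})"
    first_local = FIRST_DETECTION_BY_COUNTRY.get(record.country)
    if first_local is not None and record.year < first_local:
        return f"{record.year} predates the first detection in {record.country} ({first_local})"
    return None


def earliest_plausible(records: list[Record]) -> Optional[Record]:
    """Return the oldest record whose date passes the checks, or None."""
    plausible = [r for r in records if date_problem(r) is None]
    return min(plausible, key=lambda r: r.year) if plausible else None


if __name__ == "__main__":
    records = [
        Record("EXAMPLE_1931", "USA", 1931),     # hypothetical ID for the Illinois record
        Record("EXAMPLE_1953", "Israel", 1953),  # hypothetical ID for the 1953 Israel record
        Record("AF481864", "Israel", 1998),      # root of the Americas build
    ]
    for r in records:
        print(r.accession, date_problem(r) or "ok")
    print("earliest plausible:", earliest_plausible(records))
```

A flagged record is only a candidate for manual review; the check cannot tell whether the date or the location is the wrong field.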

Rooting of a global tree

The most informative paper I found on the historical genomics of WNV is "Spatial and temporal dynamics of West Nile virus between Africa and Europe". According to it, the first WNV L1 (cluster 1) strain recovered in Africa is from Egypt in 1951 (GenBank AF260968). The authors also note that "clusters 2, 3, 4, 6, and 7 are rooted by the two ancient sequences from Nigeria and Senegal". Looking at Figure 1 of the paper, I think those are GQ851607.1 (Nigeria, 1965) and GQ851606.1 (Senegal, 1979). The paper also states: "It is also shown that all strains within cluster 2 are rooted by the 1989 Senegalese strain (Genbank OP846971)". https://www.nature.com/articles/s41467-023-42185-7