nextstrain / dengue

Nextstrain build for dengue virus
https://nextstrain.org/dengue

Setting an outgroup for the "Dengue virus DENVx genotypes" dataset #67

Closed j23414 closed 1 month ago

j23414 commented 1 month ago

Description of proposed changes

This pull request fine-tunes the Dengue virus Denv (`dengue/denv`) datasets for accurate genotype-level assignment. Like a previous PR ([https://github.com/nextstrain/dengue/pull/58](https://github.com/nextstrain/dengue/pull/58)) for the Dengue virus All (`dengue/all`) dataset, it addresses cross-serotype samples being falsely assigned to genotypes within a specific serotype. For example, when all samples were queried against a DENV4 dataset, some samples from other serotypes received false-positive "DENV4II" genotype calls.

[Image: Screenshot 2024-06-03 at 1:33:41 PM]
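One way to spot this kind of false positive is to compare each sample's known serotype with the dataset it was queried against. The sketch below is illustrative only and is not part of this PR: it assumes a Nextclade-style TSV with `seqName` and `clade` columns, a hypothetical `known_serotype` mapping supplied from your own metadata, and that an empty `clade` value means no genotype call was made. The sample names are made up.

```python
import csv
import io


def find_cross_serotype_calls(nextclade_tsv, known_serotype, dataset_serotype):
    """Return (seqName, clade) pairs for samples of another serotype
    that still received a genotype call from this dataset.

    nextclade_tsv    : TSV text with at least `seqName` and `clade` columns
    known_serotype   : dict mapping seqName -> serotype (hypothetical metadata)
    dataset_serotype : serotype of the dataset queried, e.g. "DENV4"
    """
    reader = csv.DictReader(io.StringIO(nextclade_tsv), delimiter="\t")
    flagged = []
    for row in reader:
        name = row["seqName"]
        clade = (row.get("clade") or "").strip()
        if not clade:
            # Assumption: an empty clade means no genotype was assigned.
            continue
        serotype = known_serotype.get(name)
        if serotype is None:
            # Unknown serotype: cannot judge, so skip rather than guess.
            continue
        if serotype != dataset_serotype:
            flagged.append((name, clade))
    return flagged


# Made-up example: three samples queried against a DENV4 dataset.
example_tsv = (
    "seqName\tclade\n"
    "sampleA\tDENV4II\n"
    "sampleB\tDENV4II\n"
    "sampleC\t\n"
)
example_serotypes = {"sampleA": "DENV4", "sampleB": "DENV1", "sampleC": "DENV2"}

false_positives = find_cross_serotype_calls(example_tsv, example_serotypes, "DENV4")
print(false_positives)
```

Here only `sampleB` is flagged: it is a DENV1 sample that nonetheless received a DENV4 genotype label, whereas `sampleC` (DENV2) correctly received no call.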

To resolve this issue, the following actions were taken, inspired by a suggestion from @rneher in a Slack channel:

Results

The resulting trees with minimized cross-serotype false-positive genotype-level assignments are documented in https://github.com/nextstrain/dengue/issues/69

Related issue(s)

Checklist

The dataset has been pushed to PR https://github.com/nextstrain/nextclade_data/pull/203 and is available for testing at the links in the PR comment https://github.com/nextstrain/nextclade_data/pull/203#issuecomment-2130070919

j23414 commented 1 month ago

Merging this so I can continue working on feedback from Slack in a new issue and PR.