midas-network / midas-data

An ontology for MIDAS data types

HIV/AIDS #53

Closed LucieContamin closed 10 months ago

LucieContamin commented 12 months ago

Following the issue https://github.com/midas-network/midas-data/issues/49 ,

Influenza was added to the ontology with:

Would it be possible to update the pathogen to have:

We don't always know the details of the virus, and this would match the Tycho ontology (https://www.tycho.pitt.edu/dataset/US.62479008/).

Please let me know if there are any issues or if you need more information, thanks!

hoganwr commented 11 months ago

Same issue as for influenza viruses: HIV 1 and HIV 2 have Lentivirus as their common parent (not an unqualified HIV).

hoganwr commented 11 months ago
[Screenshot, 2023-08-02 at 4:08:08 PM]

Note the "Lineage"

LucieContamin commented 11 months ago

Thanks for the information,

My issue here is that we want to be able to represent "HIV" data, and the HIV data we want to add to the catalog does not contain strain-level information (HIV-1 or HIV-2), only "HIV" data. My understanding is that HIV-1 is the more common type and that HIV-2 is rare outside of Western Africa. It also seems to be possible for an individual to contract both (Healthline, WebMD, Wikipedia).

Some of the data we want to include in our Data Catalog covers the Western Africa region, for example:
• https://worldhealthorg.shinyapps.io/hsi-dashboard-wpr/
• https://www.hiv.gov/hiv-basics/overview/data-and-trends/statistics/
• https://aho.afro.who.int/ind/af?ind=104&cc=af&ci=1&dim=73&dom=HIV/AIDS%20incidence

So, I think the current version of the ontology works for the disease but might be an issue for the pathogen, as we don't know whether "HIV" in the source represents HIV-1, HIV-2 or both, and the ontology only contains HIV-1.

As I understand it, the taxonomy of Lentivirus is:

I think the problem here is slightly different from the issue with flu: for flu we can assume we have either one or both of A and B, but for HIV, depending on the location, we cannot assume the pathogen information. So I think switching to "HIV (12721)" from the unclassified group makes sense, as we don't know how to classify the data.

harryhoch commented 10 months ago

@hoganwr ... discussed with @LucieContamin. The proposal is that we include all three HIV entries (11676, 11709 and 12721), all referring to the same disease (AIDS) when appropriate.
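
For illustration only, here is a minimal Python sketch of how a catalog entry might pick among the three identifiers under this proposal. The thread only spells out "HIV (12721)"; the labels attached to 11676 and 11709 below are assumptions, and `HIV_PATHOGENS`, `DISEASE` and `pathogen_id_for` are hypothetical names, not part of the midas-data ontology.

```python
from typing import Optional

# Identifiers taken from the proposal in this thread.
# ASSUMPTION: 11676 is HIV-1 and 11709 is HIV-2; only "HIV (12721)"
# is named explicitly in the discussion.
HIV_PATHOGENS = {
    11676: "HIV-1",
    11709: "HIV-2",
    12721: "HIV (type not specified)",
}

# All three pathogen entries refer to the same disease.
DISEASE = "AIDS"


def pathogen_id_for(source_type: Optional[str]) -> int:
    """Return the pathogen identifier for a dataset.

    source_type is whatever the data source says about the virus
    (e.g. "HIV-1", "HIV 2", "HIV", or None when nothing is stated).
    Falls back to the unqualified HIV entry when the type is unknown.
    """
    normalized = (source_type or "").upper().replace(" ", "").replace("-", "")
    if normalized == "HIV1":
        return 11676
    if normalized == "HIV2":
        return 11709
    # The source only says "HIV" (or nothing), so do not guess the type.
    return 12721
```

With this sketch, a source that reports only "HIV" would map to 12721, while sources that state the type would map to the more specific entry; in every case the disease stays the same.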