monarch-initiative / phenomics_first_resource

Project Management repository for anything in and around the Phenomics First Resource (PFR)

Aim 2.3.1 Computational comparison #27

Open sagehrke opened 2 years ago

sagehrke commented 2 years ago

Objective

This will use the disease reconciliation and ongoing equivalency candidate reports (Aim 2.1), as well as the disease model and existing annotations for phenotypes, variants, and other attributes (Aim 2.2). The computational model of any newly proposed disease will be compared with those of existing diseases, aiding curators in their decision-making and helping to refine and differentiate the new disease definition. These computational comparisons will be available via APIs (see Aim 4) and accessible through any existing platform, such as ClinGen or Orphanet (see Letters).

cmungall commented 1 year ago

No progress, dependent on #13

matentzn commented 1 year ago

For this paper, there are at least two things I can think of that we could do:

  1. Work with C-Path to study how much the gene annotations for diseases differ across resources like OMIM and Orphanet (a very nice semsim-based analysis).
  2. Write the paper "Mondo: Reconciling conflicting classifications", which studies the harmonisation of disease classes in ontologies.
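
To make option 1 a bit more concrete, here is a minimal sketch of what a per-disease comparison of gene annotations across two resources could look like. It uses plain Jaccard overlap as a stand-in for a proper semantic similarity measure, and it assumes both resources have already been mapped to shared disease identifiers. The function names, the `EX:` identifiers, and the gene symbols are all hypothetical toy values, not real OMIM or Orphanet data.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two sets; defined here as 1.0 when both are empty."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def compare_gene_annotations(
    res_a: dict[str, set[str]], res_b: dict[str, set[str]]
) -> dict[str, dict]:
    """Compare gene sets for every disease annotated in both resources."""
    report = {}
    for disease in sorted(res_a.keys() & res_b.keys()):
        genes_a, genes_b = res_a[disease], res_b[disease]
        report[disease] = {
            "jaccard": jaccard(genes_a, genes_b),
            "only_in_a": sorted(genes_a - genes_b),
            "only_in_b": sorted(genes_b - genes_a),
        }
    return report


# Hypothetical toy data: disease ID -> set of associated gene symbols.
resource_a = {"EX:1": {"GENE1", "GENE2"}, "EX:2": {"GENE3"}}
resource_b = {"EX:1": {"GENE1", "GENE3"}, "EX:2": {"GENE3"}}

report = compare_gene_annotations(resource_a, resource_b)
for disease, row in report.items():
    print(disease, row)
```

A low score with non-empty `only_in_a` / `only_in_b` lists would flag a disease where the two resources disagree on gene associations. A real analysis would presumably swap Jaccard for an ontology-aware measure.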

Note that, the way the #25 goal is phrased, it's unclear whether the instance of the disease model is some kind of KG (everything known about a disease, i.e. classification, genes, phenotypes, treatments, etc.) or a TBoxy model (where you would only see a massive abstraction of the above, like defining genes, pathogens, etc., and parentage, and perhaps metadata like naming if you want to count that as a "part of the model").

If we decide the "disease model" is something more KGy, then it will not be as easy to "compare" across resources, as in the KG, well, everything is harmonised to Mondo.

"These computational comparisons will be available via APIs (see Aim 4) and accessible through any existing platform, such as ClinGen or Orphanet"

I think some of this can be embedded directly in Mondo. For example, we can more rigorously include "excluded subclass of" to make differences explicit. Now, this only works for a small subset of a general disease model, mainly its is-a structure. Maybe that is enough?
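
As a rough illustration of the is-a part only: assuming a source classification has already been mapped onto the same identifiers as the reference hierarchy, one could list the source's subclass edges that the reference does not entail. Those would be the candidates a curator either adopts or records explicitly as excluded. This is only a sketch under that assumption; the function names and the `EX:` identifiers are made up, and it does not read or write any actual Mondo annotations.

```python
from __future__ import annotations

from collections import defaultdict


def ancestors(node: str, parents: dict[str, set[str]]) -> set[str]:
    """All transitive is-a ancestors of `node` (excluding the node itself)."""
    seen: set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unentailed_edges(
    reference_edges: list[tuple[str, str]],
    source_edges: list[tuple[str, str]],
) -> list[tuple[str, str]]:
    """Source (child, parent) edges not entailed by the reference hierarchy."""
    parents: dict[str, set[str]] = defaultdict(set)
    for child, parent in reference_edges:
        parents[child].add(parent)
    return sorted(
        (child, parent)
        for child, parent in source_edges
        if parent not in ancestors(child, parents)
    )


# Hypothetical toy hierarchies, as (child, parent) pairs.
reference = [("EX:3", "EX:2"), ("EX:2", "EX:1")]
source = [("EX:3", "EX:1"), ("EX:3", "EX:4")]

candidates = unentailed_edges(reference, source)
print(candidates)
```

Here the source edge from `EX:3` to `EX:1` is already entailed by the reference via `EX:2`, so only the edge from `EX:3` to `EX:4` is reported as a difference to review.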