Open cmungall opened 2 years ago
The idea is good, but we are still lacking a consistent framework for implementing such checks. SPARQL is very cumbersome, especially if we want to generalise this to:
for all A in O:
    CT_SUBCLASS = SUM(B sub A)
    PARENTS_OF_CHILDREN = {C: SUM(C super (B sub A)) / CT_SUBCLASS}
    if PARENTS_OF_CHILDREN.PREVALENCE > 0.75:
        suggest A subClassOf PARENTS_OF_CHILDREN.CLASS
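To make the pseudocode above concrete, here is a minimal Python sketch. It deliberately does not use OAK or SPARQL: the ontology is assumed to be available as a plain dict mapping each class to its direct superclasses. The function names, the `parents_of` structure, and the toy class labels are all hypothetical, and only asserted direct subclass edges are counted, which may not be what we want in the end.

```python
from collections import Counter


def ancestors(cls, parents_of):
    """Return all (transitive) superclasses of cls."""
    seen = set()
    stack = list(parents_of.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(parents_of.get(parent, ()))
    return seen


def suggest_parents(parents_of, threshold=0.75):
    """Suggest 'A subClassOf C' when more than `threshold` of A's direct
    children also have C as a direct parent.

    parents_of: dict mapping class -> set of direct superclasses (assumed input).
    """
    # Invert the parent map to get the direct children of each class.
    children_of = {}
    for child, parents in parents_of.items():
        for parent in parents:
            children_of.setdefault(parent, set()).add(child)

    suggestions = []
    for a, children in children_of.items():
        ct_subclass = len(children)
        # Count how often each other class C is a parent of a child of A.
        counts = Counter(c for b in children for c in parents_of[b] if c != a)
        for c, n in counts.items():
            prevalence = n / ct_subclass
            # Skip C if it is already above A, or below A (would create a cycle).
            if c in ancestors(a, parents_of) or a in ancestors(c, parents_of):
                continue
            if prevalence > threshold:
                suggestions.append((a, c, prevalence))
    return sorted(suggestions)


# Toy data with made-up labels: 4 of the 5 children of "bone dysplasia"
# are also asserted under "Mendelian disease".
parents_of = {"bone dysplasia": {"bone disease"}}
for i in range(1, 5):
    parents_of[f"dysplasia {i}"] = {"bone dysplasia", "Mendelian disease"}
parents_of["dysplasia 5"] = {"bone dysplasia"}
for i in range(1, 5):
    parents_of[f"other Mendelian {i}"] = {"Mendelian disease"}

suggestions = suggest_parents(parents_of)
for a, c, prevalence in suggestions:
    print(f"suggest: {a} subClassOf {c} (prevalence {prevalence:.2f})")
```

The same loop could presumably sit behind whatever interface we settle on; the point here is only that the check is a few lines once the parent map is in memory.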
Should OAK have a better interface for specifying tests like this, or should we just keep adding stand-alone Python scripts, especially if we want to reuse checks like this?
example: primary bone dysplasia should likely be classified as Mendelian
this has many descendants:
most of these are in OMIM:
and in fact the majority of these are either parents of OMIM entries or something that likely should be a genetic disease:
We should add a QC check: if a grouping is not under inherited yet the majority of its leaf nodes are, then flag it for checking
Here "majority" can be a threshold chosen by the curator, maybe 90%, or it could be more of a statistical test; we can refine this later
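A rough sketch of what this check could look like in Python, assuming the ontology is loaded as a dict of direct superclasses. Nothing here is an existing OAK API: `flag_groupings`, the `min_leaves` cutoff (added to avoid flagging tiny groupings), and the toy labels are all assumptions for illustration, with the 90% threshold exposed as a parameter.

```python
def ancestors(cls, parents_of):
    """Return all (transitive) superclasses of cls."""
    seen = set()
    stack = list(parents_of.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(parents_of.get(parent, ()))
    return seen


def flag_groupings(parents_of, target, threshold=0.9, min_leaves=3):
    """Flag groupings that are not under `target` although at least
    `threshold` of their leaf descendants are.

    parents_of: dict mapping class -> set of direct superclasses (assumed input).
    min_leaves: hypothetical cutoff so very small groupings are ignored.
    """
    non_leaves = {p for parents in parents_of.values() for p in parents}
    classes = set(parents_of) | non_leaves
    leaves = classes - non_leaves
    anc = {c: ancestors(c, parents_of) for c in classes}

    flagged = []
    for a in sorted(non_leaves):
        # Skip the target itself, anything already under it, and its ancestors.
        if a == target or target in anc[a] or a in anc[target]:
            continue
        leaves_under_a = [leaf for leaf in leaves if a in anc[leaf]]
        if len(leaves_under_a) < min_leaves:
            continue
        n_under_target = sum(target in anc[leaf] for leaf in leaves_under_a)
        fraction = n_under_target / len(leaves_under_a)
        if fraction >= threshold:
            flagged.append((a, fraction))
    return flagged


# Toy data with made-up labels: 9 of 10 leaves under "bone dysplasia"
# are also under "inherited disease".
parents_of = {
    "bone dysplasia": {"skeletal disorder"},
    "fracture": {"skeletal disorder"},
    "osteoporosis": {"skeletal disorder"},
    "dysplasia 10": {"bone dysplasia"},
}
for i in range(1, 10):
    parents_of[f"dysplasia {i}"] = {"bone dysplasia", "inherited disease"}

flagged = flag_groupings(parents_of, target="inherited disease")
for grouping, fraction in flagged:
    print(f"check: {grouping} ({fraction:.0%} of leaves under inherited disease)")
```

A fixed threshold is the simplest option; swapping the `fraction >= threshold` comparison for a statistical test would only change that one line.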
A note on general applicability to OBO: this kind of inductive reasoning step is a good counterpart to the deductive reasoning we do