INCATools / ontology-development-kit

Bootstrap an OBO Library ontology
http://incatools.github.io/ontology-development-kit/
BSD 3-Clause "New" or "Revised" License

Add QC check for COB alignment #779

Open matentzn opened 1 year ago

matentzn commented 1 year ago

Add a SPARQL query, cob-alignment, to ODK that checks that all classes in the ontology's BASE namespaces are subsumed under a COB class. This is a very important issue to push forward.

anitacaron commented 1 year ago

The ontology needs to import COB first, right?

matentzn commented 1 year ago

Related to: https://github.com/OBOFoundry/COB/issues/66

I think we should give some thought to where such a check should live. Even a simple Python library with a call like

CobTest.complies('cl.owl') or some such may be in scope for this task.
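To make the idea concrete, here is a minimal sketch of what such an interface could look like. Everything in it is hypothetical: no `CobTest` class exists in ODK or COB, and the `allowed_roots` and `base_prefix` parameters are assumptions added for illustration. It uses only the Python standard library, reads RDF/XML only, and follows only asserted `rdfs:subClassOf` links that point at named classes, so it ignores equivalence axioms and anything a reasoner would infer.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RDFS = "{http://www.w3.org/2000/01/rdf-schema#}"
OWL = "{http://www.w3.org/2002/07/owl#}"


class CobTest:
    """Hypothetical interface for a COB alignment check (illustration only)."""

    def __init__(self, allowed_roots, base_prefix):
        self.allowed_roots = set(allowed_roots)  # IRIs of classes allowed by COB
        self.base_prefix = base_prefix           # e.g. the ontology's base IRI prefix

    def violations(self, path):
        """Return base-namespace classes with no asserted path to an allowed root."""
        parents = {}
        for cls in ET.parse(path).getroot().iter(OWL + "Class"):
            iri = cls.get(RDF + "about")
            if iri is None:
                continue  # anonymous class expression, not a named class
            direct = parents.setdefault(iri, set())
            for sub in cls.findall(RDFS + "subClassOf"):
                target = sub.get(RDF + "resource")
                if target is not None:  # skip restrictions and nested expressions
                    direct.add(target)
        return sorted(
            iri for iri in parents
            if iri.startswith(self.base_prefix)
            and not self._reaches_root(iri, parents)
        )

    def _reaches_root(self, iri, parents):
        """Walk asserted superclass links upwards until an allowed root is found."""
        seen, stack = set(), [iri]
        while stack:
            current = stack.pop()
            if current in self.allowed_roots:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(parents.get(current, ()))
        return False

    def complies(self, path):
        """True when every base-namespace class sits under an allowed root."""
        return not self.violations(path)
```

With this shape the call would read `CobTest(allowed_roots, base_prefix).complies('cl.owl')`, and `violations` would list the offending classes for a QC report. A real implementation would more likely build on an OWL-aware library or ROBOT than on raw XML parsing.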

We need to think about how best to deliver COB: do we require people to import it (like BFO), or do we deploy it as an independent test? Both are possible.

@cmungall has written some integration tests - I think the easiest way to think of it conceptually is:

  1. Merge COB and ONTOLOGY (whether imported or not, download if need be)
  2. Check that all base entities are rdfs:subClassOf some class allowed by COB.
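The two steps above can be sketched in plain Python, independent of any particular OWL tooling. This is only an illustration of the logic: the function names, the representation of subclass axioms as `(child, parent)` IRI tuples, and the `base_prefix` and `allowed_roots` parameters are all assumptions, and loading the axioms from real OWL files is left out.

```python
from collections import defaultdict


def ancestors(cls, parents):
    """Return cls plus every class reachable from it via subClassOf edges."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unaligned_classes(declared_classes, ontology_edges, cob_edges,
                      base_prefix, allowed_roots):
    """Report base-namespace classes that have no allowed COB ancestor.

    declared_classes: IRIs of all classes declared in the ontology
    ontology_edges, cob_edges: iterables of (child_iri, parent_iri) tuples
    base_prefix: IRI prefix identifying the ontology's own (base) classes
    allowed_roots: IRIs of the classes that COB allows as ancestors
    """
    # Step 1: merge COB and the ontology into one superclass lookup.
    parents = defaultdict(set)
    for child, parent in list(ontology_edges) + list(cob_edges):
        parents[child].add(parent)

    # Step 2: keep each base class whose ancestor set misses every allowed root.
    allowed = set(allowed_roots)
    return sorted(
        cls for cls in declared_classes
        if cls.startswith(base_prefix) and not (ancestors(cls, parents) & allowed)
    )
```

An empty result would mean the ontology passes. Because COB's own edges are merged in, a class that is a subclass of any COB term below an allowed root also counts as aligned, whether or not the ontology imports COB itself.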

A trivial, not really beautiful, way of doing this would be something like this:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX omop_types: <https://w3id.org/cpont/omop/types/>

SELECT DISTINCT ?entity ?property ?label WHERE {
  { ?entity a owl:Class . }
  MINUS
  {
    SELECT ?entity WHERE {
      ?entity a owl:Class .
      ?entity rdfs:subClassOf* ?parent .
      VALUES ?parent { obo:UBERON_0001062 obo:NCBITaxon_131567 obo:HP_0012823 obo:MONDO_0021125 obo:MONDO_0000001 obo:NCIT_C16564 obo:MAXO_0000001 obo:HP_0000118 obo:PATO_0000001 obo:NCIT_C17049 obo:NCIT_C20993 obo:NCBITaxon_10239 omop_types:omop_datamodel_concept }
    }
  }
  VALUES ?property { rdfs:label }
  ?entity ?property ?label .
}

(From @ehartley, from a totally unrelated project.) This query is probably a good way to start: for COB, the VALUES list of allowed parents would be replaced with the classes allowed by COB. A Python toolkit would be a good second step.