Closed twhetzel closed 6 months ago
Joe - here is the SPARQL query. I'll leave the code update/branch management, etc. for you.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?term
WHERE {
  {
    {
      ?s1 ?p1 ?term .
      ?term rdfs:subClassOf* <http://id.who.int/icd/entity/455013390> .
    }
    UNION
    {
      ?term ?p2 ?o2 .
      ?term rdfs:subClassOf* <http://id.who.int/icd/entity/455013390> .
    }
  }
  FILTER(isIRI(?term))
  FILTER NOT EXISTS {
    ?term rdfs:subClassOf* <http://id.who.int/icd/entity/979408586> .
  }
}
Here is the hierarchy view of ICD11 Foundation from the WHO site to show why subclasses of "Extension Codes" are being excluded. That branch has terms that are not relevant, but they would otherwise be included because "Extension Codes" also has "ICD Category" as a parent. In addition, some of its terms have exactly the same label as terms elsewhere in the "ICD Category" branch but a different IRI, so the "Extension Codes" terms would show up as exact lexical matches, and the exact lexical matches file receives less curator review.
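For anyone who wants to see the label collisions described above, here is a rough sketch of a check query. It assumes that labels are stored with rdfs:label (the build may use skos:prefLabel or another property instead) and that the two IRIs play the same roles as in the query above: 455013390 as the "ICD Category" root and 979408586 as the "Extension Codes" root. The variable names are only illustrative.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Sketch only: list terms under "Extension Codes" whose label is identical
# to the label of a term elsewhere under "ICD Category".
# Assumption: labels are asserted with rdfs:label.
SELECT DISTINCT ?label ?extensionTerm ?categoryTerm
WHERE {
  # Terms in the branch that is being excluded
  ?extensionTerm rdfs:subClassOf* <http://id.who.int/icd/entity/979408586> ;
                 rdfs:label ?label .
  # Terms in the branch that is being kept, sharing the same label
  ?categoryTerm rdfs:subClassOf* <http://id.who.int/icd/entity/455013390> ;
                rdfs:label ?label .
  FILTER(?extensionTerm != ?categoryTerm)
  # Keep only matches whose counterpart is outside "Extension Codes"
  FILTER NOT EXISTS {
    ?categoryTerm rdfs:subClassOf* <http://id.who.int/icd/entity/979408586> .
  }
}
ORDER BY ?label
```

Because both patterns bind the same ?label variable, only labels that are identical literals (including any language tag) will match, so near-duplicates that differ in case or punctuation would not appear here.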
Thanks for creating the query for me! The screenshot is also helpful. I'm going to add this and begin a new build to main.
Update src/sparql/icd11foundation-relevant-signature.sparql to exclude any class that is a child of the Extension Code branch.