monarch-initiative / mondo-ingest

Coordinating the mondo-ingest with external sources
https://monarch-initiative.github.io/mondo-ingest/
Divide `oboInOwl:hasExactSynonym` SPARQL queries on components #528

Open joeflack4 opened 4 months ago

joeflack4 commented 4 months ago

Overview

We discussed the `robot` SPARQL update `fix-labels-with-brackets.ru` (1, 2), which is currently part of the component goal for every source.

Sub-task list

I think the action items are:

1. Only apply this update query to OMIM and ICD10CM
2. For all other sources, apply a new query

Sub-task details

1. Only apply this update query to OMIM and ICD10CM

Nico wrote:

As far as I can remember, the main use case is not OMIM; it's ICD10CM, where we have diseases in the name with the ICD10CM code in brackets. This source (ICD10CM) should retain it. ... So only remove the “remove brackets content” from the other (non-OMIM, non-ICD10CM) sources.

2. For all other sources, apply a new query

Nico wrote:

Also, we do need some pipeline to migrate `rdfs:label` and `obo:IAO_0000118` values to `oboInOwl:hasExactSynonym`, so that labels in the external resources are synchronised as synonyms during the synonym sync. So only remove the “remove brackets content” from the other (non-OMIM, non-ICD10CM) sources, but make sure the labels and IAO:118 annotations are added anyway (could be added by another workflow; I don't remember, but doubt it).
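
To make the split in sub-tasks 1 and 2 concrete, here is a minimal Python sketch that follows the two headings as written. It is only an illustration: the function name, the lower-case source identifiers, and the idea of picking the query in Python rather than in the Makefile are all assumptions, and `add_label_as_synonym.sparql` is simply the file name proposed later in this thread.

```python
# Hypothetical helper, not part of mondo-ingest: names and source
# identifiers below are assumptions made for illustration only.
BRACKET_FIX_QUERY = "fix-labels-with-brackets.ru"
LABEL_TO_SYNONYM_QUERY = "add_label_as_synonym.sparql"

# Sub-task 1: sources that keep the existing bracket-handling update.
BRACKET_FIX_SOURCES = frozenset({"omim", "icd10cm"})


def update_query_for(source: str) -> str:
    """Return the update query a source's component goal would run."""
    if source.strip().lower() in BRACKET_FIX_SOURCES:
        return BRACKET_FIX_QUERY
    # Sub-task 2: every other source gets the new label-to-synonym query.
    return LABEL_TO_SYNONYM_QUERY


if __name__ == "__main__":
    for name in ("OMIM", "ICD10CM", "ordo"):
        print(name, "->", update_query_for(name))
```

In the actual repo this choice would more likely be a per-source condition in the component goal; the sketch only shows the intended mapping.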

joeflack4 commented 4 months ago

@matentzn @twhetzel I hope I understood correctly what should be done here. I assigned medium urgency to this.

twhetzel commented 4 months ago

Yes, that is my understanding from the Slack conversation.

For (2), here is a SPARQL update (`add_label_as_synonym.sparql`) that takes the values of the `rdfs:label` and `obo:IAO_0000118` properties and adds them to the class as exact synonyms with the synonym type GENERATED.

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
PREFIX obo: <http://purl.obolibrary.org/obo/>

INSERT {
  # Declare the GENERATED synonym type and the property that points to it.
  <http://purl.obolibrary.org/obo/mondo#GENERATED> rdf:type owl:AnnotationProperty .
  oboInOwl:hasSynonymType rdf:type owl:AnnotationProperty .
  <http://purl.obolibrary.org/obo/mondo#GENERATED> rdfs:subPropertyOf oboInOwl:SynonymTypeProperty .

  # Add the label (or IAO_0000118 value) as an exact synonym.
  ?cls oboInOwl:hasExactSynonym ?synonym .

  # Annotate that synonym axiom with the GENERATED synonym type.
  [ rdf:type owl:Axiom ;
    owl:annotatedSource ?cls ;
    owl:annotatedProperty oboInOwl:hasExactSynonym ;
    owl:annotatedTarget ?synonym ;
    oboInOwl:hasSynonymType <http://purl.obolibrary.org/obo/mondo#GENERATED> ] .
}

WHERE {
  # Both properties are treated as sources of synonyms.
  VALUES ?property {
    rdfs:label
    obo:IAO_0000118
  }
  ?cls ?property ?label .
  # STR() drops any language tag or datatype, leaving a plain string.
  BIND(STR(?label) AS ?synonym)
}
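
For anyone who wants to try the update locally, here is a rough Python sketch that builds the corresponding `robot query --update` call. The `--input`, `--update` and `--output` options are ROBOT's own; the helper name and all file paths are made up for illustration, and the command is only printed, not run.

```python
import shlex


def robot_update_command(input_owl: str, query_file: str, output_owl: str) -> list[str]:
    """Build the argv for applying a SPARQL update file with ROBOT."""
    return [
        "robot", "query",
        "--input", input_owl,
        "--update", query_file,
        "--output", output_owl,
    ]


# Hypothetical paths; adjust to wherever the component and query live.
cmd = robot_update_command(
    "components/example.owl",
    "sparql/add_label_as_synonym.sparql",
    "tmp/example-with-synonyms.owl",
)

if __name__ == "__main__":
    # Print a copy-pasteable shell command. To run it from Python instead,
    # pass `cmd` to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Writing to a separate output file makes it easy to diff the component before and after the update and check which synonyms were added.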