monarch-initiative / mondo-ingest

Coordinating the mondo-ingest with external sources
https://monarch-initiative.github.io/mondo-ingest/

Build on `develop` and `main` fails - undeclared synonym type: http://purl.obolibrary.org/obo/mondo#abbreviation #655

Open twhetzel opened 3 days ago

twhetzel commented 3 days ago

The full build on develop fails with the error below.

The synonym type and abbreviation properties in the omim.owl from the 15-Sep-2024 Tagged release are incorrect as well.

OMIM -- Incorrect IRIs

http://www.geneontology.org/formats/oboInOwl#hasSynonymType
http://purl.obolibrary.org/obo/mondo#abbreviation

MONDO - Correct IRIs

http://www.geneontology.org/formats/oboInOwl#SynonymTypeProperty
http://purl.obolibrary.org/obo/mondo#ABBREVIATION

@joeflack4 can you check what is causing this error? It could be (a) the incorrect IRIs in the released omim files compared to what they should be in Mondo, (b) the omim files using the Mondo-specific abbreviation property, e.g. http://purl.obolibrary.org/obo/mondo#ABBREVIATION, instead of the OBO standard property http://purl.obolibrary.org/obo/OMO_0003000, or (c) some additional issue (e.g. the correct IRI needs to be in a config file).

Build Error Log from `develop`

```
...
python3 ../scripts/migrate.py \
--ontology-path components/omim.owl \
--mondo-mappings-path tmp/mondo.sssom.tsv \
--onto-config-path metadata/omim.yml \
--mapping-status-path reports/omim_mapping_status.tsv \
--min-id 850056 \
--max-id 999999 \
--mondo-terms-path reports/mirror_signature-mondo.tsv \
--slurp-dir-path slurp/ \
--outpath slurp/omim.tsv
/usr/local/lib/python3.10/dist-packages/pronto/ontology.py:283: NotImplementedWarning: cannot process plain `owl:AnnotationProperty`
  cls(self).parse_from(_handle) # type: ignore
Traceback (most recent call last):
  File "/work/src/ontology/../scripts/migrate.py", line 284, in <module>
    cli()
  File "/work/src/ontology/../scripts/migrate.py", line 280, in cli
    slurp(**d)
  File "/work/src/ontology/../scripts/migrate.py", line 79, in slurp
    ontology: ProntoImplementation = _load_ontology(ontology_path, use_cache)
  File "/work/src/scripts/utils.py", line 174, in _load_ontology
    ontology = ProntoImplementation(OntologyResource(slug=ontology_path, local=True)) # ~17 sec
  File "<string>", line 22, in __init__
  File "/usr/local/lib/python3.10/dist-packages/oaklib/implementations/pronto/pronto_implementation.py", line 178, in __post_init__
    ontology = Ontology(str(resource.local_path), **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pronto/ontology.py", line 283, in __init__
    cls(self).parse_from(_handle) # type: ignore
  File "/usr/local/lib/python3.10/dist-packages/pronto/parsers/rdfxml.py", line 119, in parse_from
    self._process_axiom(axiom, curies)
  File "/usr/local/lib/python3.10/dist-packages/pronto/parsers/rdfxml.py", line 822, in _process_axiom
    for s in entity.synonyms
  File "/usr/local/lib/python3.10/dist-packages/pronto/entity/__init__.py", line 465, in synonyms
    return frozenset(Synonym(ontology, s) for s in termdata.synonyms)
  File "/usr/local/lib/python3.10/dist-packages/pronto/entity/__init__.py", line 465, in <genexpr>
    return frozenset(Synonym(ontology, s) for s in termdata.synonyms)
  File "/usr/local/lib/python3.10/dist-packages/pronto/synonym.py", line 133, in __init__
    raise ValueError(f"undeclared synonym type: {syndata.type}")
ValueError: undeclared synonym type: http://purl.obolibrary.org/obo/mondo#abbreviation
make[1]: *** [mondo-ingest.Makefile:507: slurp/omim.tsv] Error 1
make[1]: Leaving directory '/work/src/ontology'
make: *** [mondo-ingest.Makefile:346: build-mondo-ingest] Error 2
Command exited with non-zero status 2
### DEBUG STATS ###
Elapsed time: 1:00:47
```
Build Error Log from `main`

```
...
python3 ../scripts/migrate.py \
--ontology-path components/omim.owl \
--mondo-mappings-path tmp/mondo.sssom.tsv \
--onto-config-path metadata/omim.yml \
--mapping-status-path reports/omim_mapping_status.tsv \
--min-id 850056 \
--max-id 999999 \
--mondo-terms-path reports/mirror_signature-mondo.tsv \
--slurp-dir-path slurp/ \
--outpath slurp/omim.tsv
/usr/local/lib/python3.10/dist-packages/pronto/ontology.py:283: NotImplementedWarning: cannot process plain `owl:AnnotationProperty`
  cls(self).parse_from(_handle) # type: ignore
Traceback (most recent call last):
  File "/work/src/ontology/../scripts/migrate.py", line 278, in <module>
    cli()
  File "/work/src/ontology/../scripts/migrate.py", line 274, in cli
    slurp(**d)
  File "/work/src/ontology/../scripts/migrate.py", line 73, in slurp
    ontology: ProntoImplementation = _load_ontology(ontology_path, use_cache)
  File "/work/src/scripts/utils.py", line 174, in _load_ontology
    ontology = ProntoImplementation(OntologyResource(slug=ontology_path, local=True)) # ~17 sec
  File "<string>", line 22, in __init__
  File "/usr/local/lib/python3.10/dist-packages/oaklib/implementations/pronto/pronto_implementation.py", line 178, in __post_init__
    ontology = Ontology(str(resource.local_path), **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pronto/ontology.py", line 283, in __init__
    cls(self).parse_from(_handle) # type: ignore
  File "/usr/local/lib/python3.10/dist-packages/pronto/parsers/rdfxml.py", line 119, in parse_from
    self._process_axiom(axiom, curies)
  File "/usr/local/lib/python3.10/dist-packages/pronto/parsers/rdfxml.py", line 822, in _process_axiom
    for s in entity.synonyms
  File "/usr/local/lib/python3.10/dist-packages/pronto/entity/__init__.py", line 465, in synonyms
    return frozenset(Synonym(ontology, s) for s in termdata.synonyms)
  File "/usr/local/lib/python3.10/dist-packages/pronto/entity/__init__.py", line 465, in <genexpr>
    return frozenset(Synonym(ontology, s) for s in termdata.synonyms)
  File "/usr/local/lib/python3.10/dist-packages/pronto/synonym.py", line 133, in __init__
    raise ValueError(f"undeclared synonym type: {syndata.type}")
ValueError: undeclared synonym type: http://purl.obolibrary.org/obo/mondo#abbreviation
make[1]: *** [mondo-ingest.Makefile:510: slurp/omim.tsv] Error 1
make[1]: Leaving directory '/work/src/ontology'
make: *** [mondo-ingest.Makefile:345: build-mondo-ingest] Error 2
Command exited with non-zero status 2
### DEBUG STATS ###
Elapsed time: 44:46.97
```
joeflack4 commented 3 days ago

I think it's because the OMIM repo is using the lowercase abbreviation accidentally, but I'm looking into this.

joeflack4 commented 2 days ago

So it looks like this problem is happening now because the most recent ODK uses pronto==2.5.7 (my local install has 2.5.5, and I don't get this error until I upgrade to 2.5.7).

So it looks like there are 3 things that need to be fixed:

  1. In omim, there needs to be an annotation prop declaration for mondo#ABBREVIATION: https://github.com/monarch-initiative/omim/pull/144
  2. In mondo-ingest, http://purl.obolibrary.org/obo/mondo#ABBREVIATION needs to be added to properties.txt
  3. In mondo-ingest (or in omim), mondo#ABBREVIATION needs to be declared as a sub-property of oio:SynonymTypeProperty.

Regarding (3), it seems to make sense to do that in omim, but I haven't taken a stab at it yet.
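
For reference, here is a rough sketch of how one might check a component for synonym types that are used but never declared, which is roughly the condition the pronto error reports. It assumes that "declared" means an `owl:AnnotationProperty` with `rdfs:subPropertyOf` pointing at `oboInOwl#SynonymTypeProperty`, as in the list above; pronto's exact rule has not been verified here. The function name `undeclared_synonym_types` and the `SAMPLE` snippet are made up for illustration, and the parsing relies on the plain RDF/XML layout seen in components/omim.owl.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
OIO = "http://www.geneontology.org/formats/oboInOwl#"


def undeclared_synonym_types(rdfxml_text: str) -> set[str]:
    """Return synonym-type IRIs that are used but never declared (hypothetical helper)."""
    root = ET.fromstring(rdfxml_text)

    # Assumption: a synonym type is "declared" when an annotation property
    # is a sub-property of oboInOwl#SynonymTypeProperty.
    declared = set()
    for prop in root.iter(f"{{{OWL}}}AnnotationProperty"):
        for parent in prop.findall(f"{{{RDFS}}}subPropertyOf"):
            if parent.get(f"{{{RDF}}}resource") == OIO + "SynonymTypeProperty":
                declared.add(prop.get(f"{{{RDF}}}about"))

    # Every IRI referenced through oboInOwl:hasSynonymType counts as "used".
    used = {
        el.get(f"{{{RDF}}}resource")
        for el in root.iter(f"{{{OIO}}}hasSynonymType")
    }
    used.discard(None)
    return used - declared


# Illustrative input only: the uppercase IRI is declared, the lowercase one is used.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:oboInOwl="http://www.geneontology.org/formats/oboInOwl#">
  <owl:AnnotationProperty rdf:about="http://purl.obolibrary.org/obo/mondo#ABBREVIATION">
    <rdfs:subPropertyOf rdf:resource="http://www.geneontology.org/formats/oboInOwl#SynonymTypeProperty"/>
  </owl:AnnotationProperty>
  <owl:Axiom>
    <oboInOwl:hasSynonymType rdf:resource="http://purl.obolibrary.org/obo/mondo#abbreviation"/>
  </owl:Axiom>
</rdf:RDF>
"""

missing = undeclared_synonym_types(SAMPLE)
print(missing)
```

Because IRIs are compared case-sensitively, the lowercase `mondo#abbreviation` is reported as missing even though `mondo#ABBREVIATION` is declared.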

Interestingly though, in mondo-ingest, when you look at components/omim.owl, you see this:

    <owl:AnnotationProperty rdf:about="http://purl.obolibrary.org/obo/mondo#GENERATED">
        <rdfs:subPropertyOf rdf:resource="http://www.geneontology.org/formats/oboInOwl#SynonymTypeProperty"/>
    </owl:AnnotationProperty>

But this does not exist in component-download-omim.owl.owl. So it appears to be added by the $(COMPONENTSDIR)/omim.owl goal, but I can't see where that happens.

Anyway I think I should add the sub-property in the omim repo instead.

twhetzel commented 2 days ago

@joeflack4 can you use http://purl.obolibrary.org/obo/OMO_0003000 in the OMIM repo for the abbreviation and then add a step when the component is built in the mondo-ingest repo to change from http://purl.obolibrary.org/obo/OMO_0003000 to http://purl.obolibrary.org/obo/mondo#ABBREVIATION and any other needed mondo-ingest changes? This is my preferred solution.

This should follow what is being done with DOID where it uses OMO_0003012 (acronym) and these values should be treated as abbreviations in the synonym sync.

`#GENERATED` is being added in `$(COMPONENTSDIR)/omim.owl` by `--update ../sparql/fix-labels-with-brackets.ru`

joeflack4 commented 2 days ago

We can do that as well, if you wish.

I just asked Claude, and the synonym type declaration appears to just be a one-liner: `graph.add((custom_property, RDFS.subPropertyOf, synonym_type))`

Though I of course can't say 100% that it'd fix it; I think so though.

But if this is what you prefer, I'll definitely do that.

If it's what other sources are doing, that also makes it a good idea. And it is consistent with the OMIM repo's use of a wide variety of already existing properties in the OBO universe (e.g. lots from RO).

joeflack4 commented 2 days ago

> `#GENERATED` is being added in `$(COMPONENTSDIR)/omim.owl` by `--update ../sparql/fix-labels-with-brackets.ru`

Ah yes, that's right! Thanks.