ontodev / robot

ROBOT is an OBO Tool
http://robot.obolibrary.org
BSD 3-Clause "New" or "Revised" License

Test for subset definition integrity? #565

Open pbuttigieg opened 5 years ago

pbuttigieg commented 5 years ago

Hi all

Over at ENVO, @easr noted that many of our subsets were not declared in the OWL (and thus OBO) headers. I didn't notice this, but it causes issues for @easr's software: https://github.com/EnvironmentOntology/envo/issues/871

Perhaps an automated check can be bundled in to validate that subsets are properly declared?

relates to #255

beckyjackson commented 4 years ago

Hi @pbuttigieg - the verify command allows you to declare your own custom rules to validate your ontology and does not come with any preset queries.

If you'd like to add a robot verify step, the query could be something like:

Verify that all values of oboInOwl:inSubset are IRIs (verify-inSubset-IRIs.rq):

PREFIX oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>

SELECT DISTINCT ?Subset WHERE {
    ?s oboInOwl:inSubset ?Subset .
    FILTER(!isIRI(?Subset))
}

Verify that all IRIs used by oboInOwl:inSubset are defined (verify-inSubset-declared.rq):

PREFIX oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT DISTINCT ?Subset WHERE {
    ?s oboInOwl:inSubset ?Subset .
    FILTER(isIRI(?Subset))
    FILTER NOT EXISTS { ?Subset a owl:AnnotationProperty }
}
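To make concrete what these two queries would flag, here is a minimal Python sketch that applies the same logic to a few hand-written triples. It is only an illustration, not how ROBOT evaluates SPARQL: the `IRI` wrapper, the `example.org` IRIs, and the function names are all made up for this example, and plain strings stand in for literals.

```python
from dataclasses import dataclass

OBOINOWL = "http://www.geneontology.org/formats/oboInOwl#"
OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


@dataclass(frozen=True)
class IRI:
    """Hypothetical wrapper marking a value as an IRI (plain str = literal)."""
    value: str


IN_SUBSET = IRI(OBOINOWL + "inSubset")
RDF_TYPE = IRI(RDF + "type")
ANNOTATION_PROPERTY = IRI(OWL + "AnnotationProperty")

# Hypothetical (subject, predicate, object) triples, not taken from ENVO.
TRIPLES = [
    # A subset that is declared as an annotation property and then used.
    (IRI("http://example.org/subset#declared"), RDF_TYPE, ANNOTATION_PROPERTY),
    (IRI("http://example.org/TERM_1"), IN_SUBSET,
     IRI("http://example.org/subset#declared")),
    # A subset IRI that is used but never declared.
    (IRI("http://example.org/TERM_2"), IN_SUBSET,
     IRI("http://example.org/subset#undeclared")),
    # A subset given as a literal instead of an IRI.
    (IRI("http://example.org/TERM_3"), IN_SUBSET, "some_subset"),
]


def non_iri_subsets(triples):
    """Same idea as verify-inSubset-IRIs.rq: inSubset values that are not IRIs."""
    return {o for (_, p, o) in triples
            if p == IN_SUBSET and not isinstance(o, IRI)}


def undeclared_subsets(triples):
    """Same idea as verify-inSubset-declared.rq: IRI values of inSubset
    that are never typed as owl:AnnotationProperty."""
    declared = {s for (s, p, o) in triples
                if p == RDF_TYPE and o == ANNOTATION_PROPERTY}
    return {o for (_, p, o) in triples
            if p == IN_SUBSET and isinstance(o, IRI) and o not in declared}
```

On the sample triples, the first function picks out the literal and the second picks out the undeclared IRI; the declared subset passes both checks.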

And then run:

robot verify --input envo.owl \
  --queries verify-inSubset-IRIs.rq verify-inSubset-declared.rq \
  --output-dir results/

These could easily be added to the Makefile:

VERIFY_QUERIES := $(wildcard sparql/verify-*.rq)

.PHONY: verify
# Recipe lines must start with a tab, not spaces.
verify: envo.owl $(VERIFY_QUERIES) | build/robot.jar
	$(ROBOT) verify --input $< --output-dir build/results \
	--queries $(VERIFY_QUERIES)
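If it helps to get a quick summary after a run, something like the following could count the rows in each result file. This is a rough sketch under a few assumptions: that verify writes one CSV per failing query into the output directory with a header row of SPARQL variable names (which is my reading of the ROBOT docs, but worth double-checking), and the function name `count_violations` is just made up here.

```python
import csv
from pathlib import Path


def count_violations(results_dir):
    """Return {CSV file name: number of data rows} for each CSV in results_dir.

    Assumes the first row of every file is a header, so it is not counted.
    """
    counts = {}
    for path in sorted(Path(results_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
        # An empty file has no header either, so clamp at zero.
        counts[path.name] = max(len(rows) - 1, 0)
    return counts
```

For the Makefile target above, this would be called as `count_violations("build/results")`.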

That said, since this is a widely-used OBO Foundry standard for subsets, should we consider adding these checks to report? On the other hand, this is more of a legacy from OBO format. @jamesaoverton @cmungall

cmungall commented 4 years ago

> That said, since this is a widely-used OBO Foundry standard for subsets, should we consider adding these checks to report? On the other hand, this is more of a legacy from OBO format. @jamesaoverton @cmungall

Yes, I think this would be really useful in report. I'm not sure it's really legacy; the inSubset property is used by many widely used ontologies.