INCATools / ontology-development-kit

Bootstrap an OBO Library ontology
http://incatools.github.io/ontology-development-kit/
BSD 3-Clause "New" or "Revised" License

Consider replacing some of the SPARQL QC checks with OAK validate #610

Open matentzn opened 2 years ago

matentzn commented 2 years ago

OAK has a new but very experimental validation interface:

https://incatools.github.io/ontology-access-kit/interfaces/validator.html

We should start exploring that for QC rather than adding more SPARQL checks. Good first issue: #587

@anitacaron, would this be an issue you'd be interested in? For starters, the only things we need are:

  1. An example of a working OAK command (CLI) which takes in an ontology and validates it.
  2. An implementation of #587 extending the ontology metadata schema provided by OAK (we will maintain our own here for now).
  3. A make goal in Makefile.jinja2 that runs the validation on $(ONT).owl.
  4. We should build in support for multiple profiles. @StroemPhi, rather than extending existing profiles, I would suggest exploring their composition. Maybe the ODK config should get this option:

oak_validate:
    profiles:
        - filename: profile1.yml
          description: Monarch validation (Monarch QC)
          mirror_from: 'http://...'
        - filename: profile2.yml
          description: OMO validation (OBO QC)
          mirror_from: 'http://...'
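
For illustration, the proposed option could be modelled as two small dataclasses. This is only a sketch under assumptions: the class names `OakValidateProfile` and `OakValidateConfig` and the `from_dict` helper are hypothetical and not existing ODK code; the field names are simply the keys from the YAML above, and the input is a plain dict as it would look after YAML parsing.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class OakValidateProfile:
    """One validation profile entry (hypothetical model, not ODK code)."""
    filename: str
    description: str = ""
    mirror_from: Optional[str] = None  # URL the profile would be mirrored from


@dataclass
class OakValidateConfig:
    """The proposed `oak_validate` option: a list of profiles."""
    profiles: List[OakValidateProfile] = field(default_factory=list)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "OakValidateConfig":
        profiles = []
        for entry in data.get("profiles", []):
            # The make goals are named after the filename, so require it early.
            if "filename" not in entry:
                raise ValueError(f"profile entry without filename: {entry!r}")
            profiles.append(OakValidateProfile(**entry))
        return cls(profiles=profiles)


# Example input mirroring the YAML above; the URLs are elided there as well.
example = {
    "profiles": [
        {
            "filename": "profile1.yml",
            "description": "Monarch validation (Monarch QC)",
            "mirror_from": "http://...",
        },
        {
            "filename": "profile2.yml",
            "description": "OMO validation (OBO QC)",
            "mirror_from": "http://...",
        },
    ]
}
config = OakValidateConfig.from_dict(example)
```

Keeping each profile as its own entry, rather than merging profiles into one schema, matches the composition idea: profiles can be added or dropped without touching the others.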

This configuration would result in make goals like these:

reports/profile1.tsv: $(FILE) tmp/profile1.yml
    oak validate $< --profile tmp/profile1.yml -o $@

reports/profile2.tsv: $(FILE) tmp/profile2.yml
    oak validate $< --profile tmp/profile2.yml -o $@

oak_validate:
    $(MAKE_FAST) reports/profile1.tsv reports/profile2.tsv
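
As a rough sketch of how such goals could be generated from the list of profile filenames: ODK itself would do this in Makefile.jinja2, and plain string formatting is used here only to keep the example dependency-free. The function name `render_oak_validate_rules` is hypothetical, and the `oak validate ... --profile` invocation is copied from the proposal above, not from a confirmed OAK command line.

```python
from typing import List

# Rule template. The command line is the one proposed in this thread;
# it is an assumption, not a verified OAK invocation.
RULE = (
    "reports/{stem}.tsv: $(FILE) tmp/{name}\n"
    "\toak validate $< --profile tmp/{name} -o $@\n"
)


def render_oak_validate_rules(profile_filenames: List[str]) -> str:
    """Render one report rule per profile plus the aggregate oak_validate goal."""
    # "profile1.yml" -> "profile1"; a name without a dot is used unchanged.
    stems = [name.rsplit(".", 1)[0] for name in profile_filenames]
    rules = [RULE.format(stem=s, name=n) for s, n in zip(stems, profile_filenames)]
    reports = " ".join(f"reports/{s}.tsv" for s in stems)
    rules.append(f"oak_validate:\n\t$(MAKE_FAST) {reports}\n")
    # Separate the rules with blank lines, as in the hand-written version.
    return "\n".join(rules)


makefile_fragment = render_oak_validate_rules(["profile1.yml", "profile2.yml"])
print(makefile_fragment)
```

Adding a third profile to the list would add one more report rule and one more prerequisite to the aggregate goal, with no other changes.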

matentzn commented 1 year ago

@anitacaron, just FYI: @StroemPhi will be doing some work preparing the actual profiles. Could you work with him to get the relevant Makefile extensions done?

StroemPhi commented 1 year ago

I just set up a fresh repo with the LinkML cookiecutter template, using the ontology-metadata.yaml from OAK, here: https://github.com/StroemPhi/Ontology-Metadata

StroemPhi commented 1 year ago

Context info: I tried to understand the OAK ontology-metadata LinkML schema draft so that I could reuse or extend it for the envisioned multiple-profiles approach. Unfortunately, I failed to do so, as the schema is too complex for me to grasp: it defines metadata fields at the ontology, term, and axiom levels.

Thus, the LinkML schema linked above would not be suitable for this issue, as it now focuses only on the metadata requirements we need for our TIB terminology service. We will soon focus more on the task of validating the required ontology-level metadata as part of our ingest process. I'm not sure what technical implementation we will end up using (OAK, LinkML, JSON validation, ...). But we are closely following what you (OBO) and Clement and his team are doing with MOD and its DCAT profile in AgroPortal/BioPortal, aiming to be as interoperable as possible.

matentzn commented 1 year ago

Thanks for keeping us posted!

matentzn commented 6 months ago

@anitacaron

Can we move this to the next release, 1.6? I think we should discuss at least some details before providing a draft pipeline for this (although I think what you have is very close). I also think OAK itself needs some modifications for this to be truly useful.

anitacaron commented 6 months ago

Yes, no worries, that's why I changed the PR to draft.