DataONEorg / sem-prov-ontologies

Ontologies focused on scientific observations and scientific workflow provenance.
https://ontologies.dataone.org

Release first version of MOSAiC ontology #88

Closed (amoeba closed 3 years ago)

amoeba commented 3 years ago

@mpsaloha and @laijasmine are nearly ready with the first version of the MOSAiC ontology. This'll go in a branch and we'll want to merge to main and cut a GH release.

Re: Decide on place to host the ontology...

We've been hosting ontologies like ECSO on BioPortal and dereferencing our URIs there. Unfortunately, MOSAiC uses lots of instances, and it looks like BioPortal doesn't yet have great support for them. We might be better off with something like a PyLODE page or hosting our own OLS or triplestore. I think the easiest thing would be to start with a PyLODE page and see how far that gets us.

Where we "host" the ontology could be multiple places. One could be for de-referencing (PyLODE) and another, for example, could be for building web interfaces upon (like we do now with BioPortal).

amoeba commented 3 years ago

@mpsaloha sent me a v1 draft but indicated that we wanted to materialize inferred axioms in the OWL file. He was having trouble getting the Protégé "Export inferred axioms as ontology..." feature to run at all.

I tried it on my end and it ran fine and materialized the kinds of inferred axioms we were looking for. But then we noticed the result is a bit funny: it seems to copy some annotations from the imported ontologies. See this gist of the differences. Note that I had to remove the provone import for robot to run the diff, so you won't see a line item in the diff showing the missing import. 95+% of what's in the diff looks good, so I'm thinking we can just do some minor touch-up here.

On another note, it looks like SSN and SOSA follow a pattern that looks really nice. They offer up the following annotations:

Would it make sense for us to add these? We have some of this info embedded in rdfs:comments but I think more specialized annotations would be better.

amoeba commented 3 years ago

I took Mark's last version, exported an inferred copy, re-added the PROVONE, SSN, and SOSA imports, and made an initial attempt at some of the annotations in the list above (dcterms:title, etc.). I'll touch base with @mpsaloha and the rest of the team tomorrow to try wrapping things up.

amoeba commented 3 years ago

Oh, and I threw up a PyLODE page for the latest copy of the draft at https://60dd1c46a436a937e61a6874--reverent-austin-ec584b.netlify.app if anyone wants to check it out. You can see some quick things that we might tweak, like creators, a better description, and an overview image.

amoeba commented 3 years ago

We're close but not quite there yet on this. @mpsaloha did some more work and sent me an update. I made two commits:

  1. Remove the extra top level statements Protégé is dead set on making when we export as inferred: https://github.com/DataONEorg/sem-prov-ontologies/commit/9a1ed73f6b9f9a10edb705ced4f522a6f7ad9bba
  2. Make some editorial changes, mainly breaking apart unstructured info such as creators/contributors into standalone statements https://github.com/DataONEorg/sem-prov-ontologies/commit/0ce43c8320b2e104a8a58125a3ce70adbcecbda8

A few open questions remain:

  1. https or http URIs?
    • @mbjones mentions some concern that browser vendors may continue to make it harder to use http URLs, so choosing https URLs sort of future-proofs our IRIs.
    • Most ontologies you can find in the wild use http, and it feels cleaner to me in the sense that http/https is messy anyway and conflates concepts.
  2. Should "MOSAiC" be mixed case or not in our URIs? Case matters both in the sense of URIs and in the sense of practical considerations. Few ontologies I can find use any capital letters but for ECSO we went all-caps. Maybe to be consistent with ourselves we should go all caps?
  3. I think it's cleanest when the namespace prefix matches the ontology IRI. Right now our IRI is http://purl.dataone.org/odo/MOSAiC/ but a natural prefix URI for our terms would be http://purl.dataone.org/odo/MOSAiC_. Should we tackle this?
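
To make question 3 concrete, here is a minimal Python sketch that compares the two namespace styles. Only the two base IRIs come from the list above; the local name `ExampleTerm` and both helper functions are hypothetical and exist purely for illustration.

```python
# Base IRIs quoted in the open questions above.
ONTOLOGY_IRI = "http://purl.dataone.org/odo/MOSAiC/"
UNDERSCORE_PREFIX = "http://purl.dataone.org/odo/MOSAiC_"


def term_iri(prefix, local_name):
    """Build a term IRI by appending a local name to a namespace prefix."""
    return prefix + local_name


def prefix_matches_ontology(prefix, ontology_iri):
    """Return True when the term prefix is exactly the ontology IRI."""
    return prefix == ontology_iri


# "ExampleTerm" is a hypothetical local name, not a real MOSAiC term.
slash_style = term_iri(ONTOLOGY_IRI, "ExampleTerm")
underscore_style = term_iri(UNDERSCORE_PREFIX, "ExampleTerm")

# RDF tools compare IRIs as plain strings, so a change in case
# (question 2) produces a different identifier.
mixed_case = "http://purl.dataone.org/odo/MOSAiC/"
all_caps = "http://purl.dataone.org/odo/MOSAIC/"
case_changes_identity = mixed_case != all_caps
```

With the slash style the prefix and the ontology IRI are the same string; with the underscore style they differ, which is the mismatch question 3 asks about.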

mbjones commented 3 years ago

Trying to follow new proposed contributing guidelines, I renamed the branch from MOSAiC to feature-88-mosaic and I would like to delete the original branch, but let's discuss.

amoeba commented 3 years ago

Before I make the above changes, I wanted to check and see if there were usages of MOSAiC in the wild and I see four: https://search.dataone.org/cn/v2/query/solr/?q=sem_annotation:*MOSAiC*&fl=id,sem_annotation. I'm going to coordinate with @laijasmine to make sure we can update those annotations.
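
For anyone who wants to repeat that check, the query URL above can be assembled programmatically. This is a minimal sketch using only the Python standard library; the endpoint and the `q` and `fl` values are copied from the link above, while the function name `build_annotation_query` is hypothetical. The sketch only builds and decodes the URL and makes no network request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint copied from the query link in the comment above.
SOLR_ENDPOINT = "https://search.dataone.org/cn/v2/query/solr/"


def build_annotation_query(term):
    """Build a query URL for records whose sem_annotation contains `term`."""
    params = {
        "q": f"sem_annotation:*{term}*",  # wildcard match on the annotation field
        "fl": "id,sem_annotation",        # fields to return
    }
    return SOLR_ENDPOINT + "?" + urlencode(params)


url = build_annotation_query("MOSAiC")

# Decode the query string again to confirm the parameters survive encoding.
decoded = parse_qs(urlsplit(url).query)
```

`urlencode` percent-encodes the `:`, `*`, and `,` characters, so the resulting URL looks different from the hand-written link but decodes to the same parameters.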

Edit: We also decided today to try locally versioning the ontology into two versions that stay in sync: An uninferred/raw version and the fully-inferred one. The latter would be the official copy and the copy we'd distribute and the former would only be kept in git. Updates to the ontology would go in two places.

This is pretty tricky but we're hoping it'll be more beneficial than it is painful. @mpsaloha thinks the inferred triples are critical to the serialized ontology. So far our experience with using Protégé to export inferred axioms has been that it doesn't work if you've already exported so we need to keep a pre-inferred copy around to (1) edit, (2) re-infer + export.
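
One way to sanity-check that the two copies stay in sync is to confirm that every statement in the raw copy still appears in the inferred copy. The following is a minimal sketch, not the workflow actually used here. It assumes both files have been converted to N-Triples (one triple per line), that the inferred export was configured to include asserted axioms, and that neither file contains blank nodes, whose labels are not stable across exports. The function names and the `example.org` triples are hypothetical.

```python
def load_triples(ntriples_text):
    """Return the set of non-empty, non-comment lines from N-Triples text."""
    triples = set()
    for line in ntriples_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            triples.add(line)
    return triples


def missing_from_inferred(raw_text, inferred_text):
    """Return triples present in the raw copy but absent from the inferred copy."""
    return load_triples(raw_text) - load_triples(inferred_text)


SUBCLASS = "<http://www.w3.org/2000/01/rdf-schema#subClassOf>"

# Hypothetical raw copy: A is a subclass of B, and B is a subclass of C.
RAW = (
    f"<http://example.org/A> {SUBCLASS} <http://example.org/B> .\n"
    f"<http://example.org/B> {SUBCLASS} <http://example.org/C> .\n"
)

# Hypothetical inferred copy: the raw triples plus the materialized A-to-C link.
INFERRED = RAW + f"<http://example.org/A> {SUBCLASS} <http://example.org/C> .\n"

lost = missing_from_inferred(RAW, INFERRED)   # empty set when the copies are in sync
added = missing_from_inferred(INFERRED, RAW)  # the materialized inferences
```

A non-empty `lost` set would suggest that an edit landed in only one of the two copies.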

amoeba commented 3 years ago

@mpsaloha sent me the pre-inferred copy today and I re-inferred the full copy and made all the URI tweaks above. I wrote up a quick readme in the folder https://github.com/DataONEorg/sem-prov-ontologies/tree/feature-88-mosaic/MOSAiC.

I'm going to touch base with @mpsaloha so he can look at the final copy but I think we can move forward from here.

amoeba commented 3 years ago

The MOSAiC ontology is ready to be merged onto main in https://github.com/DataONEorg/sem-prov-ontologies/pull/94. GitHub is telling me a PR onto main needs a review. @mbjones could we remove that restriction? At any given time, we'll likely only have one person on staff (me) who has their head deep enough in this stuff to do a real review.

mbjones commented 3 years ago

Restriction removed, and I submitted an approving review. I think this issue can be closed now.

amoeba commented 3 years ago

This issue still has a few remaining items to be done but I realize that others might prefer we break them out into their own issues. I'll do that now and close this when done.

amoeba commented 3 years ago

The remaining tasks here have been broken out into their own issues. See updated OP.