DataONEorg / sem-prov-ontologies

Ontologies focused on scientific observations and scientific workflow provenance.

Fold Mark's latest changes into a new MOSAiC release #104

amoeba closed this issue 3 years ago

amoeba commented 3 years ago

Mark sent along his latest changes to the raw version of MOSAiC which address some things we've discovered since the 1.0.0 release. I'll incorporate them, re-export an inferred copy, and hit all the switches to get the newer version up.

amoeba commented 3 years ago

@mpsaloha sent over changes via Slack. Here's the ROBOT diff:

18 axioms in left ontology but not in right ontology:
- Annotation(rdfs:comment "Developed with Protege 5.5.0\nInferences using Pellet 2.2.0"^^xsd:string)
- AnnotationAssertion(<> <> <>)
- AnnotationAssertion(<> <> <>)
- AnnotationAssertion(<> <> "2021-05-05T06:11:27Z"^^xsd:dateTime)
- AnnotationAssertion(<> <> "2021-05-05T06:12:04Z"^^xsd:dateTime)
- AnnotationAssertion(<> <> "The \"_MOSAiC Specific Term\" class is a container to organize and isolate the terms that are most frequently used to describe the various components of the MOSAiC Expedition.\n\nSince  other established ontologies are imported into this one, a number of extraneous, non-MOSAiC relevant, and unused terms may be present and clutter the presentation.  Hopefully, simply presenting this one Class for reference will enable users to enjoy the main advantages of using and exploring this Ontology.\n\nDefinitions or descriptions of these term as described on the Pangaea website are provided in the Annotation fields associated with each MOSAIC term.\n\nThe main patterns semantically modeled here are as such:\n\nThere are 9 Campaigns, which correspond most closely to a \"Cruise\", or \"Leg\" although campaigns can involve stationary or aerial platforms.\n\nEach Campaign has a Basis, which is typically a Research Vessel (e.g. the Polarstern) or Aircraft (Polar 5 & 6).\n\nEach Campaign has one or more Chief Scientists, and a Research Location.\n\nThe Basis of a Campaign, and its hosted Events, are indicated by the first two initials in the labels of the Campaigns and  Events-- e.g. PS122/1 is the first (indicated by the '/1') Campaign (or \"Leg\") of the 122'nd voyage of the Polarstern (Basis). PS122/2 would be the second Campaign of the 122'nd voyage of the Polarstern, etc.  Again-- these \"Campaigns\" are irregularly referred to elsewhere as \"Legs\" or \"Cruises\"-- with their own unique Cruise numbers.\n\nCampaigns \"host\" numerous Events, meaning those Events occurred during that Campaign.  Events bear cryptic labels that map to distinct Sensor or Sampling efforts that result in the collection of data. Events extend on the naming scheme for Campaigns, with a numbering system that appears to be temporally sequenced. For example, Event PS122/1_5-10 probably commenced before PS122/1_5-100. 
(Note that this temporal information is not captured in the Ontology as of version 1.001.)\n\nEach Event results in the collection of data by some Sensor or Sampling Device. These \"Methods and Devices\" are organized under the \"Method/Device\" hierarchy in the MOSAiC Ontology, that includes the names of each type of  Sensor/Sampling device as SubClasses. Each Method/Device SubClass can further have one to several more specifically named devices that perform that type of Measurement. For example, the \"Acoustic Doppler Current Profiler Device\" SubClass contains 6 specific types of ADCP sensor instruments, modeled as Instances with their \"Device Long Name\".\n\nThe MOSAiC team associated each Sensor/Sampling device with a \"Short Name\" as well. Most \"Long Name\"  Sensor/Samplers have a only single \"Short Name\" associated with them (sometimes identical to the \"Long Name\"), but this is not always the case. For example, Device with \"Long Name\" = \"particle size magnifer\" has two Device \"Short Names\" associated with it-- \"PSM_UHEL1\" and \"PSM_UHEL2\".\n\nMethod/Devices are associated with Events via their \"Device Long Name\", through the predicate \"has deployment\".  Details about inverse and equivalent properties that are represented in the Ontology, providing other potential ways to discover connections, are too detailed to discuss here. Consult the Ontology.\n\nAssociated with each \"Short Name\" in the Ontology is a URI on the Alfred Wegener Institue website, pointing to further detailed information about that Sensor, e.g. this one for one of the ADCP devices:\n\n\nA Dataset is the outcome/output of an Event. 
Every dataset contains measurements that can be linked to some Event that used some Device (with most detailed description provided by the URI assocated with its \"Short Name\"); and is from some Campaign, that was performed on some Basis with some Chief Scientist(s), and some Research Location.\n\n(For demonstration purposes, this Ontology contains a single reference to a test dataset instance, findable by searching for \"urn\" -- where these assocations can be seen.  We could link such datasets to specific Events, but since the Events are manifested at level of measurement values in the dataset, these associations are made evident through the Arctic Data Center search portal:, where the PROVO \"wasGeneratedBy\" predicate is used.)\n\n\nComments or questions to Mark Schildhauer ("@en)
- AnnotationAssertion(rdfs:label <> "MOSAiC"@en)
- AnnotationAssertion(rdfs:label <> "Multidisciplinary drifting Observatory for the Study of Arctic Climate"@en)
- ClassAssertion(<> <>)
- ClassAssertion(<> <>)
- Declaration(NamedIndividual(<>))
- Declaration(NamedIndividual(<>))
- OntologyID(OntologyIRI(<>) VersionIRI(<>))
- ReflexiveObjectProperty(owl:topObjectProperty)
- SameIndividual(<> <> )
- SameIndividual(<> <> )
- SymmetricObjectProperty(owl:topObjectProperty)
- TransitiveObjectProperty(owl:topObjectProperty)

26 axioms in right ontology but not in left ontology:
+ Annotation(rdfs:comment "Developed with Protege 5.5.0\nInferences using Pellet 2.2.0\n\nlast updated 02SEP2021 mps"^^xsd:string)
+ AnnotationAssertion(<> <> <>)
+ AnnotationAssertion(<> <> <>)
+ AnnotationAssertion(<> <> <>)
+ AnnotationAssertion(<> <> "2021-05-05T06:11:27Z"^^xsd:dateTime)
+ AnnotationAssertion(<> <> "2021-05-05T06:12:04Z"^^xsd:dateTime)
+ AnnotationAssertion(<> <> "2021-09-01T23:17:48Z"^^xsd:dateTime)
+ AnnotationAssertion(<> <> "The \"_MOSAiC Specific Term\" class is a container to organize and isolate the terms that are most frequently used to describe the various components of the MOSAiC Expedition.\n\nSince  other established ontologies are imported into this one, a number of extraneous, non-MOSAiC relevant, and unused terms may be present and clutter the presentation.  Hopefully, simply presenting this one Class for reference will enable users to enjoy the main advantages of using and exploring this Ontology.\n\nDefinitions or descriptions of these term as described on the Pangaea website are provided in the Annotation fields associated with each MOSAIC term.\n\nThe main patterns semantically modeled here are as such:\n\nThere are 9 Campaigns, which correspond most closely to a \"Cruise\", or \"Leg\" although campaigns can involve stationary or aerial platforms.\n\nEach Campaign has a Basis, which is typically a Research Vessel (e.g. the Polarstern) or Aircraft (Polar 5 & 6).\n\nEach Campaign has one or more Chief Scientists, and a Research Location.\n\nThe Basis of a Campaign, and its hosted Events, are indicated by the first two initials in the labels of the Campaigns and  Events-- e.g. PS122/1 is the first (indicated by the '/1') Campaign (or \"Leg\") of the 122'nd voyage of the Polarstern (Basis). PS122/2 would be the second Campaign of the 122'nd voyage of the Polarstern, etc.  Again-- these \"Campaigns\" are irregularly referred to elsewhere as \"Legs\" or \"Cruises\"-- with their own unique Cruise numbers.\n\nCampaigns \"host\" numerous Events, meaning those Events occurred during that Campaign.  Events bear cryptic labels that map to distinct Sensor or Sampling efforts that result in the collection of data. Events extend on the naming scheme for Campaigns, with a numbering system that appears to be temporally sequenced. For example, Event PS122/1_5-10 probably commenced before PS122/1_5-100. 
(Note that this temporal information is not captured in the Ontology as of version 1.001.)\n\nEach Event results in the collection of data by some Sensor or Sampling Device. These \"Methods and Devices\" are organized under the \"Method/Device\" hierarchy in the MOSAiC Ontology, that includes the names of each type of  Sensor/Sampling device as SubClasses. Each Method/Device SubClass can further have one to several more specifically named devices that perform that type of Measurement. For example, the \"Acoustic Doppler Current Profiler Device\" SubClass contains 6 specific types of ADCP sensor instruments, modeled as Instances with their \"Device Long Name\".\n\nThe MOSAiC team associated each Sensor/Sampling device with a \"Short Name\" as well. Most \"Long Name\"  Sensor/Samplers have a only single \"Short Name\" associated with them (sometimes identical to the \"Long Name\"), but this is not always the case. For example, Device with \"Long Name\" = \"particle size magnifer\" has two Device \"Short Names\" associated with it-- \"PSM_UHEL1\" and \"PSM_UHEL2\".\n\nMethod/Devices are associated with Events via their \"Device Long Name\", through the predicate \"has deployment\".  Details about inverse and equivalent properties that are represented in the Ontology, providing other potential ways to discover connections, are too detailed to discuss here. Consult the Ontology.\n\nAssociated with each \"Short Name\" in the Ontology is a URI on the Alfred Wegener Institue website, pointing to further detailed information about that Sensor, e.g. this one for one of the ADCP devices:\n\n\nA Dataset is the outcome/output of an Event. 
Every dataset contains measurements that can be linked to some Event that used some Device (with most detailed description provided by the URI assocated with its \"Short Name\"); and is from some Campaign, that was performed on some Basis with some Chief Scientist(s), and some Research Location.\n\n(For demonstration purposes, this Ontology contains a single reference to a test dataset instance, findable by searching for \"urn\" -- where these assocations can be seen.  We could link such datasets to specific Events, but since the Events are manifested at level of measurement values in the dataset, these associations are made evident through the Arctic Data Center search portal:, where the PROVO \"wasGeneratedBy\" predicate is used.)\n\n\nComments or questions to Mark Schildhauer (schild"@en)
+ AnnotationAssertion(rdfs:label <> "MOSAiC"@en)
+ AnnotationAssertion(rdfs:label <> "Multidisciplinary drifting Observatory for the Study of Arctic Climate"@en)
+ AnnotationAssertion(rdfs:label <> "isBasisFor"@en)
+ ClassAssertion(<> <>)
+ ClassAssertion(<> <>)
+ Declaration(NamedIndividual(<>))
+ Declaration(NamedIndividual(<>))
+ Declaration(ObjectProperty(<>))
+ InverseObjectProperties(<> <>)
+ ObjectPropertyAssertion(<> <> <>)
+ ObjectPropertyAssertion(<> <> <>)
+ ObjectPropertyAssertion(<> <> <>)
+ ObjectPropertyAssertion(<> <> <>)
+ ObjectPropertyAssertion(<> <> <>)
+ ObjectPropertyDomain(<> <>)
+ OntologyID(OntologyIRI(<>) VersionIRI(<>))
+ SameIndividual(<> <> )
+ SameIndividual(<> <> )

amoeba commented 3 years ago

I looked over the above diff and found one issue: The new inverse property isBasisFor was minted from the old HTTP namespace prefix and the IRI collides with an existing IRI. I'll touch base with @mpsaloha about it and get a new IRI.
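
A collision like this could be caught before release by checking newly minted IRIs against those already present in the raw file. The following is only a rough sketch of that check, not the procedure used here: `find_collisions` is a hypothetical helper, and `example.org/ONT_` stands in for the real namespace.

```python
import re

# Matches absolute IRIs as they appear in common OWL serialisations
# (stops at whitespace, angle brackets, and quotes).
IRI_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def find_collisions(existing_text, new_iris):
    """Return the newly minted IRIs that already occur in the existing ontology text."""
    existing = set(IRI_PATTERN.findall(existing_text))
    return [iri for iri in new_iris if iri in existing]


# Hypothetical data: example.org is a placeholder, not the MOSAiC namespace.
released = """
Declaration(Class(<http://example.org/ONT_00000001>))
Declaration(ObjectProperty(<http://example.org/ONT_00000002>))
"""
minted = ["http://example.org/ONT_00000002", "http://example.org/ONT_00000003"]
collisions = find_collisions(released, minted)
print(collisions)
```

Whole IRIs are compared as set members rather than by substring search, so one ID that happens to be a prefix of another is not reported as a collision.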

I also translated the above diff into a more easily digestible form which we can put with the tagged release on GitHub:
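
A digest of this kind could also be derived mechanically from ROBOT's text output. The sketch below is a minimal illustration, not the method actually used for the release notes. It assumes the line format seen in the diff above, where each axiom line starts with `- ` or `+ ` followed by the axiom type. `summarize_diff` is a hypothetical name.

```python
import re
from collections import Counter

# An axiom line starts with "- " or "+ ", then the axiom type, then "(".
AXIOM_LINE = re.compile(r"^([-+]) (\w+)\(")


def summarize_diff(diff_text):
    """Count removed ("-") and added ("+") axioms per axiom type."""
    counts = {"-": Counter(), "+": Counter()}
    for line in diff_text.splitlines():
        match = AXIOM_LINE.match(line)
        if match:  # headers and continuation lines of multi-line literals are skipped
            sign, axiom_type = match.groups()
            counts[sign][axiom_type] += 1
    return counts


sample = """\
- ReflexiveObjectProperty(owl:topObjectProperty)
- SameIndividual(<> <> )
+ Declaration(ObjectProperty(<>))
+ ObjectPropertyAssertion(<> <> <>)
+ ObjectPropertyAssertion(<> <> <>)
"""
summary = summarize_diff(sample)
for sign, label in (("-", "removed"), ("+", "added")):
    for axiom_type, n in sorted(summary[sign].items()):
        print(f"{label}: {n} x {axiom_type}")
```

Counting by axiom type only gives a coarse overview. Because the IRIs are what distinguish most of these axioms, a useful digest would still need the affected terms listed by hand or resolved to labels.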

amoeba commented 3 years ago

Left a message for @mpsaloha on Slack. I think the next free IRI is

; grep -Eo "(\d+)" MOSAIC_raw-next.owl | sort | uniq | tail <----- After this one
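
The same lookup could be sketched in Python. This is an illustration under assumptions, not the command used above: the prefix `https://example.org/ONT_` and the 8-digit zero padding are placeholders, and `next_free_id` is a hypothetical helper. IDs are compared as integers, so the result does not depend on every ID having the same width, which a plain lexical `sort` would.

```python
import re


def next_free_id(owl_text, prefix, width=8):
    """Return the IRI one past the highest numeric ID found for the given prefix."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)")
    used = {int(m.group(1)) for m in pattern.finditer(owl_text)}
    next_number = max(used, default=0) + 1
    return f"{prefix}{next_number:0{width}d}"


# Hypothetical input: example.org is a placeholder namespace.
sample = """
<owl:Class rdf:about="https://example.org/ONT_00000009"/>
<owl:Class rdf:about="https://example.org/ONT_00000010"/>
"""
candidate = next_free_id(sample, "https://example.org/ONT_")
print(candidate)
```

Like the `tail` approach, this takes the ID after the current maximum and ignores any unused gaps lower in the range.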

amoeba commented 3 years ago

We decided on and everything else looks in order. I'll work on exporting an inferred copy and doing various release steps tomorrow.

amoeba commented 3 years ago

Raw and inferred versions of MOSAiC 1.0.1 in as of