GeoscienceAustralia / geosciml.org

Creative Commons Attribution 4.0 International

Wuchiapingian definition #3

Open hmuniz opened 5 years ago

hmuniz commented 5 years ago

https://github.com/GeoscienceAustralia/geosciml.org/blob/0c32100f9cb2a167f0ef623264616bd069062274/resource/static/vocabulary/timescale/isc2018-1.ttl#L13898-L13941

Isn't the statement time:intervalMetBy isc:Capitanian missing?

nicholascar commented 5 years ago

@dr-shorthair @smrgeoinfo this one's for you two.

dr-shorthair commented 5 years ago

A lot of relationships have not been included explicitly.

isc:Wuchiapingian
    time:intervalMetBy isc:Capitanian .

is implied by the inverse relationship

isc:Capitanian
      time:intervalMeets isc:Wuchiapingian .

which is present in the data. However, I have found a different error. Currently

isc:Guadalupian 
    time:intervalContains isc:Capitanian .

which should actually be

isc:Guadalupian 
   time:intervalFinishedBy isc:Capitanian .

Overall there are a lot of relationships that are not included. They could be added by running a whole lot of rules such as

'if A is met by B, and B is ended by C, then A is met by C'

'if D and E have the same end, and the beginning of E is after the beginning of D, then E ends D'

... etc.

These could be implemented as SPARQL CONSTRUCT or INSERT queries, but have not been designed yet. @hmuniz would you be interested in working through some of these?
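
The two rules quoted above can be prototyped outside a triple store before committing to SPARQL. The following is a minimal Python sketch, not the project's implementation. It assumes triples are plain tuples, reads "ended by" as time:intervalFinishedBy, and uses illustrative triples and illustrative `BOUNDS` ages (in Ma) rather than values taken from isc2018-1.ttl. The names `infer_met_by` and `finishes` are hypothetical helpers.

```python
MET_BY = "time:intervalMetBy"
FINISHED_BY = "time:intervalFinishedBy"

# Illustrative triples only (assumption): not copied from isc2018-1.ttl.
TRIPLES = {
    ("isc:Wuchiapingian", MET_BY, "isc:Guadalupian"),
    ("isc:Guadalupian", FINISHED_BY, "isc:Capitanian"),
}

# Illustrative (begin, end) ages in Ma (assumption): larger means older.
BOUNDS = {
    "isc:Guadalupian": (273.0, 259.0),
    "isc:Capitanian": (265.0, 259.0),
}


def infer_met_by(triples):
    """Rule 1: if A is met by B, and B is finished by C, then A is met by C.

    Repeats until no new triple appears and returns only the new triples.
    """
    known = set(triples)
    changed = True
    while changed:
        changed = False
        met_by = [(s, o) for s, p, o in known if p == MET_BY]
        finished_by = [(s, o) for s, p, o in known if p == FINISHED_BY]
        for a, b in met_by:
            for b2, c in finished_by:
                if b == b2 and (a, MET_BY, c) not in known:
                    known.add((a, MET_BY, c))
                    changed = True
    return known - set(triples)


def finishes(e, d, bounds):
    """Rule 2: E finishes D if both end together and E begins after D.

    With ages in Ma, "begins after" means a smaller begin value.
    """
    begin_e, end_e = bounds[e]
    begin_d, end_d = bounds[d]
    return end_e == end_d and begin_e < begin_d


new_triples = infer_met_by(TRIPLES)
capitanian_finishes_guadalupian = finishes("isc:Capitanian", "isc:Guadalupian", BOUNDS)
```

With the sample data, `new_triples` holds the single statement that isc:Wuchiapingian is met by isc:Capitanian, which is the relationship this issue asked about. The same logic would translate to one SPARQL CONSTRUCT per rule.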

arademaker commented 5 years ago

Hi @dr-shorthair, interesting suggestions. I can talk to my students to see how we can collaborate. We have recently written about an implementation of the same ontology in SUMO.

Extending SUMO to Geological Times - Alexandre Tessarollo, Henrique Muniz, Alexandre Rademaker and Adam Pease

The question is whether it makes sense to explicitly add relations that could be inferred from the ontology's declarations. For instance, the TIME ontology (https://www.w3.org/2006/time#) already has the definition:

:intervalMeets
  rdf:type owl:ObjectProperty ;
  rdfs:domain :ProperInterval ;
  rdfs:label "interval meets"@en ;
  rdfs:range :ProperInterval ;
  owl:inverseOf :intervalMetBy ;
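
Because owl:inverseOf is declared, an OWL reasoner can derive the time:intervalMetBy statement on its own; materializing it only matters for consumers that read the raw triples. A minimal Python sketch of that materialization step follows, assuming triples as plain tuples and a hand-written inverse table. The names `INVERSE_OF` and `materialize_inverses` are hypothetical and not part of OWL-Time or this repository.

```python
# Hand-written subset of inverse pairs (assumption): only the pair discussed here.
INVERSE_OF = {
    "time:intervalMeets": "time:intervalMetBy",
    "time:intervalMetBy": "time:intervalMeets",
}


def materialize_inverses(triples, inverse_of=INVERSE_OF):
    """Return the inverse statements that are implied but not yet present."""
    present = set(triples)
    implied = {(o, inverse_of[p], s) for s, p, o in present if p in inverse_of}
    return implied - present


# The statement that dr-shorthair reports as present in the data.
data = {("isc:Capitanian", "time:intervalMeets", "isc:Wuchiapingian")}
missing = materialize_inverses(data)
```

Here `missing` contains the one statement the issue asked about: isc:Wuchiapingian time:intervalMetBy isc:Capitanian.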

arademaker commented 5 years ago

The declarations of the time:numericPosition in the gts:NumericEraBoundary as annotation properties make reasoning over these statements impossible.

We don't have any explicit encoding of the approximation mark ~ in the chart (e.g. see BaseNorianTime).

arademaker commented 5 years ago

Dear @dr-shorthair, I wonder if we could separate the OWL files from the rest of the website files, that is, creating another repository for the ontologies. It would make collaborative maintenance of the ontologies easier.

dr-shorthair commented 5 years ago

> The declarations of the time:numericPosition in the gts:NumericEraBoundary as annotation properties

I don't understand why you are saying this. The time:numericPosition is defined in OWL-Time and is a Datatype Property.

> We don't have any explicit encoding of the approximation mark ~ in the chart (e.g. see BaseNorianTime).

Correct. We did not attempt to encode "~".

> creating another repository for the ontologies.

The ontologies are here: https://github.com/GeoscienceAustralia/geosciml.org/tree/master/resource/static/ontology/timescale

Yes, this is in the same repository as the instances https://github.com/GeoscienceAustralia/geosciml.org/tree/master/resource/static/vocabulary/timescale

But it is straightforward to fork the repository and then make pull requests back to the original. This repository is in the GeoscienceAustralia organization. The structure is rather complex, for historical reasons. However, I am not the organization or repo owner.

arademaker commented 5 years ago

Indeed, it is very strange how Protégé is handling the properties. I am trying to understand why time:numericPosition is interpreted as an annotation property but iso-trs:TM_OrdinalEra.begin as a data property, although both are declared in the same way. See the screenshot (the use of TM_OrdinalEra.begin was only for testing):

[image]

arademaker commented 5 years ago

But the most important thing is that if you change the value of time:numericPosition of the BaseAlbianTime to, let us say, 200, the reasoner will not flag any problem! This makes the model more vulnerable to inconsistencies.

Regarding the repository structure, IMHO it is not so straightforward to contribute to the ontologies in the given structure. First, on the practical side, the repository is 861 MB, but the ontologies alone account for only 41 MB:

$ find . \( -name '*.rdf' -o -name '*.ttl' -o -name '*.owl' \) -exec du -ch {} +
...
388K    ./resource/static/vocabulary/cgi/current/IUGS_CGI_register__for_INSPIRE_lithology_with_content.rdf
 20K    ./resource/static/vocabulary/cgi/current/faultmovementsense.rdf
 41M    total

Second, as you said, the structure is not simple, and we have to work in a complicated directory structure instead of a clearer organization focused on the ontologies and their relations and versions. Moreover, I saw that you created a copy of isc2018.ttl as isc2018-1.ttl instead of changing isc2018.ttl itself. The repository itself should handle versioning, but now we have versions and copies and a proliferation of files in the repository. Anyway, as I said, I believe that keeping the ontologies separate would make collaboration easier. But of course this is only a suggestion from someone interested in contributing. As far as I understand, this repository is currently the source of the website http://geosciml.org, right?

dr-shorthair commented 5 years ago

> But the most important thing is that if you change the value of time:numericPosition of the BaseAlbianTime to, let us say, 200, the reasoner will not flag any problem! This makes the model more vulnerable to inconsistencies.

I don't understand what you are saying here.

> Second, as you said, the structure is not simple, and we have to work in a complicated directory structure instead of a clearer organization focused on the ontologies and their relations and versions.

Indeed. The project has been underway for more than 10 years, and was originally hosted in an SVN repository. So some of the structure and versioning principles were just carried over from that. Also, until now I was doing most of the editing and analysis alone, so it didn't matter much. And it was convenient to leave it all in one repo since (as you surmised) that is used to build the geosciml.org website.

But if you guys are now truly interested in contributing then it probably would be helpful to do a refactor. We should probably maintain the ontology in a separate repo, and then just merge that into the build repo from time to time.

On the matter of the proliferation of versions in separate files - there are two motivations for this:

  1. it was transported over from an earlier SVN repository
  2. the dated versions primarily correspond to the dated versions published by the ICS. We actually think it is important to maintain these separate artefacts to reflect the governance artefacts.

But I agree that the 2018.ttl and 2018-1.ttl probably should be squashed now. I was treading carefully as the GeoSciML people were a bit sensitive about some of the changes I was proposing. However, in the end I wound back the more radical changes, so it would be possible to squash them now, but I haven't got round to it yet. (This is not my day job...)

dr-shorthair commented 5 years ago

Note that the ontology was originally built on top of the ISO 19108 model, as implemented in https://github.com/ISO-TC211/GOM/tree/master/isotc211_GOM_harmonizedOntology/iso19108/2006 (the structure of which is defined in ISO 19150-2). This is the ontology that is described in our 2015 paper https://doi.org/10.1007/s12145-014-0170-6

However, I was subsequently involved in updating OWL-Time to accommodate the needs of non-Gregorian calendars, so it is now possible to use OWL-Time instead of ISO 19108. So first I removed the dependencies on 19108 from https://github.com/GeoscienceAustralia/geosciml.org/blob/master/resource/static/ontology/timescale/thors.ttl and https://github.com/GeoscienceAustralia/geosciml.org/blob/master/resource/static/ontology/timescale/gts.ttl. Then I built W3C and ISO versions that import the neutral ontology and link it to either OWL-Time or ISO 19108 here: https://github.com/GeoscienceAustralia/geosciml.org/tree/master/resource/static/ontology/timescale/gts

It is all a bit of a tangled web, but like I said no-one else was much interested until now.

alexandretessarollo commented 4 years ago

I've been going through the isc2018-1.ttl, and found some inconsistencies:

I've a pull request addressing such points.

alexandretessarollo commented 4 years ago

> > But the most important thing is that if you change the value of time:numericPosition of the BaseAlbianTime to, let us say, 200, the reasoner will not flag any problem! This makes the model more vulnerable to inconsistencies.

> I don't understand what you are saying here.

BaseAlbianTime has (line 782) "time:numericPosition 113.0", which is perfectly consistent with Albian coming right after Aptian. But should one accidentally write, say, "time:numericPosition 1130", a reasoner would not flag any problem, despite Aptian ("time:numericPosition 125.0") coming right before Albian.

smrgeoinfo commented 4 years ago

Offhand, it seems that these kinds of validation criteria could be set up with SWRL rules. It's not clear to me that they can be implemented with OWL constraints. Something like: if (isc:x time:intervalMetBy isc:y) then (isc:y/time:hasBeginning//time:numericPosition) > (isc:x/time:hasBeginning//time:numericPosition), since the positions are ages in Ma and the earlier interval begins at the larger value.

Specific example: if (isc:Wordian time:intervalMetBy isc:Roadian) then (isc:Roadian/time:hasBeginning/isc:BaseRoadian//time:numericPosition) > (isc:Wordian/time:hasBeginning/isc:BaseWordian//time:numericPosition), as suggested above.

I'll start a new issue on adding such validation rules. See #7
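
As a rough illustration of the kind of check discussed above, here is a minimal Python sketch rather than a SWRL rule. It assumes time:numericPosition is an age in Ma, so the interval that comes first begins at the larger value, as the Albian (113.0) and Aptian (125.0) figures quoted earlier suggest. The function `check_met_by` and the flat `begin_ma` lookup are hypothetical simplifications of the time:hasBeginning path.

```python
def check_met_by(met_by_pairs, begin_ma):
    """Return the (x, y) pairs that violate the ordering.

    Each pair states that x is met by y, so y precedes x and must begin
    at a strictly larger age (in Ma) than x does.
    """
    return [(x, y) for x, y in met_by_pairs if not begin_ma[y] > begin_ma[x]]


pairs = [("isc:Albian", "isc:Aptian")]  # Albian is met by Aptian

# Values quoted in the thread for the two base boundaries.
good = {"isc:Albian": 113.0, "isc:Aptian": 125.0}
# The accidental edit described above: 1130 instead of 113.0.
typo = {"isc:Albian": 1130.0, "isc:Aptian": 125.0}

violations_good = check_met_by(pairs, good)
violations_typo = check_met_by(pairs, typo)
```

With the quoted values the check passes, while the mistyped 1130 makes the Albian/Aptian pair show up in `violations_typo`, which is the inconsistency an OWL reasoner alone would not report.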