Open hmuniz opened 5 years ago
@dr-shorthair @smrgeoinfo this one's for you two.
A lot of relationships have not been included explicitly.
isc:Wuchiapingian
time:intervalMetBy isc:Capitanian .
is implied by the inverse relationship
isc:Capitanian
time:intervalMeets isc:Wuchiapingian .
which is present in the data. However, I have found a different error. Currently
isc:Guadalupian
time:intervalContains isc:Capitanian .
which should actually be
isc:Guadalupian
time:intervalEndedBy isc:Capitanian .
Overall there are a lot of relationships that are not included. They could be added by running a whole lot of rules such as
'if A is met by B, and B is ended by C, then A is met by C'
'if D and E have the same end, and the beginning of E is after the beginning of D, then E ends D'
... etc.
These could be implemented as SPARQL CONSTRUCT or INSERT queries, but have not been designed yet. @hmuniz would you be interested in working through some of these?
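As a rough illustration of the first rule ('if A is met by B, and B is ended by C, then A is met by C'), here is a minimal Python sketch that applies it to a set of triples until nothing new appears. It is not the SPARQL that would eventually be written. The function name `infer_met_by` is made up for this example, the property names are spelled as in this thread, and the input triple `isc:Wuchiapingian time:intervalMetBy isc:Guadalupian` is an assumption for the sake of the example, not something checked against the data.

```python
# Triples are (subject, predicate, object) tuples; prefixed names are plain strings.
MET_BY = "time:intervalMetBy"
ENDED_BY = "time:intervalEndedBy"  # spelled as written in this thread


def infer_met_by(triples):
    """Apply 'A metBy B and B endedBy C => A metBy C' until no new triple appears."""
    known = set(triples)
    while True:
        ended_by = {(s, o) for s, p, o in known if p == ENDED_BY}
        new = {
            (a, MET_BY, c)
            for a, p, b in known
            if p == MET_BY
            for b2, c in ended_by
            if b2 == b
        } - known
        if not new:
            return known
        known |= new


data = {
    ("isc:Wuchiapingian", MET_BY, "isc:Guadalupian"),  # assumed input, for illustration
    ("isc:Guadalupian", ENDED_BY, "isc:Capitanian"),   # the corrected statement above
}
inferred = infer_met_by(data) - data
print(sorted(inferred))
```

A SPARQL CONSTRUCT or INSERT version would express the same join over the two properties. The loop matters only once several rules feed each other.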
Hi @dr-shorthair, interesting suggestions. I can talk to my students to see how we can collaborate. We recently wrote about an implementation of the same ontology in SUMO.
Extending SUMO to Geological Times - Alexandre Tessarollo, Henrique Muniz, Alexandre Rademaker and Adam Pease
The question is whether it makes sense to explicitly add relations that can be inferred from the ontology's declarations. For instance, the TIME ontology (https://www.w3.org/2006/time#) already has the definition:
:intervalMeets
rdf:type owl:ObjectProperty ;
rdfs:domain :ProperInterval ;
rdfs:label "interval meets"@en ;
rdfs:range :ProperInterval ;
owl:inverseOf :intervalMetBy ;
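To make concrete what the owl:inverseOf declaration gives you, here is a small Python sketch that materializes the inverse statements a reasoner would derive. It is only an illustration under simple assumptions: triples are plain tuples, and the helper name `add_inverses` and the `INVERSE_OF` table are made up for this example.

```python
# Inverse pairs as declared via owl:inverseOf (only the pair discussed here).
INVERSE_OF = {
    "time:intervalMeets": "time:intervalMetBy",
    "time:intervalMetBy": "time:intervalMeets",
}


def add_inverses(triples):
    """Return the triples plus the inverse of every statement with a known inverse."""
    out = set(triples)
    for s, p, o in triples:
        if p in INVERSE_OF:
            out.add((o, INVERSE_OF[p], s))
    return out


stated = {("isc:Capitanian", "time:intervalMeets", "isc:Wuchiapingian")}
closed = add_inverses(stated)
print(sorted(closed))
```

Whether to store such derived triples in the file or leave them to a reasoner is exactly the question raised above.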
The declarations of the time:numericPosition in the gts:NumericEraBoundary as annotation properties make no reasoning possible over these statements.
We don't have any explicit encoding of the approximation mark ~ in the chart (e.g. see BaseNorianTime).
Dear @dr-shorthair, I wonder if we could separate the OWL files from the rest of the website files, that is, by creating another repository for the ontologies. It would make collaborative maintenance of the ontologies easier.
The declarations of the time:numericPosition in the gts:NumericEraBoundary as annotation properties
I don't understand why you are saying this. time:numericPosition is defined in OWL-Time and is a Datatype Property.
We don't have any explicit encoding of the approximation mark ~ in the chart (e.g. see BaseNorianTime).
Correct. We did not attempt to encode "~".
creating another repository for the ontologies.
The ontologies are here: https://github.com/GeoscienceAustralia/geosciml.org/tree/master/resource/static/ontology/timescale
Yes, this is in the same repository as the instances https://github.com/GeoscienceAustralia/geosciml.org/tree/master/resource/static/vocabulary/timescale
But it is straightforward to fork the repository and then make pull requests back to the original. This repository is in the GeoscienceAustralia organization. The structure is rather complex, for historical reasons. However, I am not the organization or repo owner.
Indeed, it is very strange how Protege is handling the properties. I am trying to understand why time:numericPosition is interpreted as an annotation property but iso-trs:TM_OrdinalEra.begin as a data property, when both are declared in the same way. See the screenshot (the use of TM_OrdinalEra.begin was only for testing):
But the most important thing is that if you change the value of time:numericPosition of the BaseAlbianTime to, let us say, 200, the reasoner will not sign any problem! This makes the model more vulnerable for inconsistencies.
Regarding the repository structure, IMHO it is not so straightforward to contribute to the ontologies in the given structure. First, on the practical side, the repository is 861 MB, but the ontologies alone take up only 41 MB:
$ find . \( -name '*.rdf' -o -name '*.ttl' -o -name '*.owl' \) -exec du -ch {} +
...
388K ./resource/static/vocabulary/cgi/current/IUGS_CGI_register__for_INSPIRE_lithology_with_content.rdf
20K ./resource/static/vocabulary/cgi/current/faultmovementsense.rdf
41M total
Second, as you said, the structure is not simple and we have to work in a complicated directory structure instead of having a more clear organization focus on the ontologies and their relations and versions. Moreover, I saw that you created a copy of isc2018.ttl as isc2018-1.ttl instead of changing isc2018.ttl itself. The repository should take care of versioning, but now we have versions and copies and a proliferation of files in the repository. Anyway, as I said, I believe that keeping the ontologies separate would make collaboration easier. But of course this is only a suggestion from someone interested in contributing. As far as I understand, this repository is currently the source of the website http://geosciml.org, right?
But the most important thing is that if you change the value of time:numericPosition of the BaseAlbianTime to, let us say, 200, the reasoner will not sign any problem! This makes the model more vulnerable for inconsistencies.
I don't understand what you are saying here.
Second, as you said, the structure is not simple and we have to work in a complicated directory structure instead of having a more clear organization focus on the ontologies and their relations and versions.
Indeed. The project has been underway for more than 10 years, and was originally hosted in an SVN repository. So some of the structure and versioning principles were just carried over from that. And also, until now, I was doing most of the editing and analysis alone so it didn't matter much. And it was convenient to leave it all in one repo since (as you surmised) that is used to build the geosciml.org website.
But if you guys are now truly interested in contributing then it probably would be helpful to do a refactor. We should probably maintain the ontology in a separate repo, and then just merge that into the build repo from time to time.
On the matter of the proliferation of versions in separate files - there are two motivations for this:
But I agree that the 2018.ttl and 2018-1.ttl probably should be squashed now. I was treading carefully as the GeoSciML people were a bit sensitive about some of the changes I was proposing. However, in the end I wound back the more radical changes, so it would be possible to squash it now, but I haven't got round to it yet. (This is not my day job...)
Note that the ontology was originally built on top of the ISO 19108 model, as implemented in https://github.com/ISO-TC211/GOM/tree/master/isotc211_GOM_harmonizedOntology/iso19108/2006 the structure of which is defined in ISO 19150-2. This is the ontology that is described in our 2015 paper https://doi.org/10.1007/s12145-014-0170-6
However, I subsequently was involved in updating OWL-Time to accommodate the needs of non-Gregorian calendars, so it is now possible to use OWL-Time instead of ISO 19108. So first I removed the dependencies on 19108 from https://github.com/GeoscienceAustralia/geosciml.org/blob/master/resource/static/ontology/timescale/thors.ttl and https://github.com/GeoscienceAustralia/geosciml.org/blob/master/resource/static/ontology/timescale/gts.ttl Then I built W3C and ISO versions that import the neutral ontology and link it to either OWL-Time or ISO 19108 here https://github.com/GeoscienceAustralia/geosciml.org/tree/master/resource/static/ontology/timescale/gts
It is all a bit of a tangled web, but like I said no-one else was much interested until now.
I've been going through the isc2018-1.ttl, and found some inconsistencies:
I've a pull request addressing such points.
But the most important thing is that if you change the value of time:numericPosition of the BaseAlbianTime to, let us say, 200, the reasoner will not sign any problem! This makes the model more vulnerable for inconsistencies.
I don't understand what you are saying here.
BaseAlbianTime has (line 782) "time:numericPosition 113.0", which is perfectly consistent with Albian being right after Aptian. But should one accidentally write, say, "time:numericPosition 1130", a reasoner would not flag any problem, despite Aptian ("time:numericPosition 125.0") being right before Albian.
Offhand it seems that this kind of validation criterion could be set up with SWRL rules. It's not clear to me that they can be implemented with OWL constraints. Something like: if (isc:x is time:intervalMetBy isc:y) then (isc:x/time:hasBeginning//time:numericPosition) < (isc:y/time:hasBeginning//time:numericPosition), since the positions are ages in Ma and the earlier interval begins at the larger value.
specific example: if (isc:Wordian is time:intervalMetBy isc:Roadian) then (isc:Wordian/time:hasBeginning/isc:BaseWordian//time:numericPosition) < (isc:Roadian/time:hasBeginning/isc:BaseRoadian//time:numericPosition), as suggested above
I'll start a new issue on adding such validation rules. See #7
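For discussion, here is a minimal Python sketch of the kind of check such a rule would perform, using the Aptian/Albian figures quoted above. It assumes numericPosition values are ages in Ma, so an interval that is met by another must begin at a smaller value than the one that precedes it. The function name `met_by_violations` and the dictionary layout are made up for this example. This is not SWRL and not part of the repository.

```python
def met_by_violations(met_by, begin_ma):
    """Return (x, y) pairs where x is met by y but y does not begin earlier.

    met_by   -- iterable of (x, y) pairs meaning 'x time:intervalMetBy y'
    begin_ma -- dict mapping an interval to the age of its beginning in Ma
                (assumption: larger value = older)
    """
    return [(x, y) for x, y in met_by if not begin_ma[y] > begin_ma[x]]


met_by = [("isc:Albian", "isc:Aptian")]  # Albian is right after Aptian

good = {"isc:Aptian": 125.0, "isc:Albian": 113.0}
typo = {"isc:Aptian": 125.0, "isc:Albian": 1130.0}  # the accidental 1130

print(met_by_violations(met_by, good))
print(met_by_violations(met_by, typo))
```

With the correct values nothing is reported. With the mistyped 1130 the Albian/Aptian pair is returned, which is the inconsistency an OWL reasoner alone does not catch.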
https://github.com/GeoscienceAustralia/geosciml.org/blob/0c32100f9cb2a167f0ef623264616bd069062274/resource/static/vocabulary/timescale/isc2018-1.ttl#L13898-L13941
Is it not missing the statement
time:intervalMetBy isc:Capitanian
?