CGI-IUGS / timescale-data

RDF representation of Geologic Timescale
Creative Commons Zero v1.0 Universal

GSSP Table #3

Open alexandretessarollo opened 4 years ago

alexandretessarollo commented 4 years ago

I understand that this work, https://doi.org/10.1016/j.epsl.2019.05.004, is the basis for a new GSSP. Is the table at http://stratigraphy.org/GSSP/index.html generated from RDF or XML, or is it plain HTML?

Either way, how can that table be updated? Maybe it could be brought into this Git initiative, or we could rely on the Wikidata format instead.

Any thoughts?

dr-shorthair commented 4 years ago

I don't know how the artefacts on http://stratigraphy.org/ are generated. I suspect by hand. You will note that some of the citations are not hyperlinked to their sources either.

This GitHub-hosted effort has no formal relationship with the maintainers of that website. We tried a while back but had no luck; perhaps we should make another attempt.

alexandretessarollo commented 4 years ago

This comment in #5:

"The gssp information here https://timescalefoundation.org/gssp/index.php?parentid=all seems to be more updated than here https://stratigraphy.org/gssp/"

brings us back to our original base issue: are we going to use multiple reference sources to update our files, or point to just one (or a few) "authoritative" resources? Either way, what counts as a reference that we should rely on? ICS / IUGS sanctioned material, a peer-reviewed paper, a Wikipedia article? And how do we resolve any conflicts within our set of references?

Maybe we need some guidelines on this aspect. Any thoughts anyone?

dr-shorthair commented 4 years ago

@nicholascar might have some insight (he is the new webmaster at stratigraphy.org). Nick - the issue is that while https://stratigraphy.org/gssps/ is from the official stratigraphy.org website, it appears to be missing some information available from https://timescalefoundation.org/gssp/index.php. And in a few places the description of the boundary-level and correlation-event differs between these two resources. It would be good to have a single point-of-truth and generate all the other artefacts from that. I propose the RDF encoding should serve that purpose ;-)
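
To make the "single point-of-truth, generate the other artefacts" idea concrete, here is a minimal Python sketch that renders one set of records as an HTML table. It assumes the records have already been extracted from the RDF encoding; the field names (`boundary`, `location`, `status`), the record values, and the `render_table` helper are all hypothetical placeholders, not real GSSP data or an existing tool.

```python
from html import escape

# Hypothetical records standing in for rows extracted from the RDF graph.
# Field names and values are placeholders, not real GSSP data.
records = [
    {"boundary": "Base of Stage A", "location": "Section X", "status": "Ratified"},
    {"boundary": "Base of Stage B", "location": "Section Y", "status": "Pending"},
]


def render_table(rows, columns):
    """Render a list of dicts as a minimal HTML table, escaping every cell."""
    head = "".join(f"<th>{escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{escape(str(r.get(c, '')))}</td>" for c in columns)
        + "</tr>"
        for r in rows
    )
    return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"


html_table = render_table(records, ["boundary", "location", "status"])
```

With this arrangement the HTML page is never edited by hand: a correction is made once in the source data and every derived artefact is regenerated from it.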

nicholascar commented 4 years ago

https://stratigraphy.org as a whole is going through a website refresh. Since the Subcommissions are also changing over now, as they do every 4 years, work is starting on updating both the Subcommissions' online contact info and their subsites. We've refreshed the Stratigraphic Guide and the style of the GSSP website, have just updated the Quaternary content too, and have added the Interactive Timescale, which is based on data, not static HTML.

The entire set of stratigraphy.org sites, except for the Interactive Timescale, is static HTML for now, as we move them all to a similar style and update simple content like all the Subcommissions' details, but we do plan on making things much more dynamic soon. Please give us a month or two and then we'll perhaps be able to formulate a better approach to the GSSP content.

In the meantime, please record requests or just thoughts using the various stratigraphy.org website repositories at https://github.com/i-c-stratigraphy. Now that we have the possibility of improving the stratigraphy.org websites, we should perhaps consolidate some of the other sources of stratigraphic information out there in one point-of-truth place (like the 5+ versions of the Guide that have appeared due to the old stratigraphy.org Guide being pretty ugly).

arademaker commented 4 years ago

Thank you @nicholascar for sharing your plans. I definitely support the suggestion from @dr-shorthair to make the website consume the RDF files. In that sense, @alexandretessarollo and I could collaborate with @dr-shorthair on the RDF/OWL files. Do you agree with that?

BTW, does anyone here know anything about the maintenance of https://timescalefoundation.org? Does anyone know Dr. Gabi Ogg (the name in the footnote in the website)? Maybe they would also be interested in some synchronization with the https://stratigraphy.org website?

dr-shorthair commented 4 years ago

"Does anyone know Dr. Gabi Ogg (the name in the footnote in the website)?"

Gabi Ogg is/was the spouse and collaborator of Jim Ogg. Together with Felix Gradstein and Alan Smith, they were the primary maintainers of the timescale for a long time, e.g. https://www.sciencedirect.com/book/9780444594259/the-geologic-time-scale. AFAICT Gabi took a particular interest in the coloured presentation form. I think they have all retired now. The new team at Stratigraphy.org (of which Nick is a member) are now pulling it back together after a period with the Chinese team.

xgmachina commented 4 years ago

When I shared this presentation https://lnkd.in/gtKnxzU I saw a comment from Kim Cohen https://www.uu.nl/staff/KMCohen/Profile. He currently leads the generation of the ISC charts in PDF format, and he is very interested in the semantic web work. I believe he is still using Ogg's TimeScale Creator to generate the ISC charts.

xgmachina commented 4 years ago

Recently I was discussing with @dr-shorthair and colleagues in ESIP how to document version information in the vocabulary schemes that Simon developed. An additional issue I found is the incomplete and/or missing GSSP information among those schemes. Using Simon's ontology patterns, I tried to update my endpoint (with the version control structure, details here: https://lnkd.in/gtKnxzU) to include the GSSP information in each ISC chart vocabulary up to version 2018-08. Below is a small test in R to query the endpoint:

# a program to count the number of ratified GSSPs in each of the ISC chart vocabulary schemes

library(SPARQL)

endpoint = "http://virtuoso.nkn.uidaho.edu:8890/sparql/"

# attach the SPARQL query prefixes. Note: the graph for our study should be updated
sparql_prefix = "
    prefix tssc: <http://deeptimekb.org/tssc#> 
    prefix tsnc: <http://deeptimekb.org/tsnc#> 
    prefix tswe: <http://deeptimekb.org/tswe#> 
    prefix tsbr: <http://deeptimekb.org/tsbr#>
    prefix tsba: <http://deeptimekb.org/tsba#> 
    prefix tsjp: <http://deeptimekb.org/tsjp#> 
    prefix tsau: <http://deeptimekb.org/tsau#>                   
    prefix dc: <http://purl.org/dc/elements/1.1/> 
    prefix dcterms: <http://purl.org/dc/terms/> 
    prefix foaf: <http://xmlns.com/foaf/0.1/> 
    prefix geo: <http://www.opengis.net/ont/geosparql#> 
    prefix gts: <http://resource.geosciml.org/ontology/timescale/gts#> 
    prefix isc: <http://resource.geosciml.org/classifier/ics/ischart/> 
    prefix owl: <http://www.w3.org/2002/07/owl#> 
    prefix rank: <http://resource.geosciml.org/ontology/timescale/rank/> 
    prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
    prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
    prefix samfl: <http://def.seegrid.csiro.au/ontology/om/sam-lite#> 
    prefix sf: <http://www.opengis.net/ont/sf#> 
    prefix skos: <http://www.w3.org/2004/02/skos/core#> 
    prefix sosa: <http://www.w3.org/ns/sosa/> 
    prefix thors: <http://resource.geosciml.org/ontology/timescale/thors#> 
    prefix time: <http://www.w3.org/2006/time#> 
    prefix ts: <http://resource.geosciml.org/vocabulary/timescale/> 
    prefix vann: <http://purl.org/vocab/vann/> 
    prefix void: <http://rdfs.org/ns/void#> 
    prefix xkos: <http://rdf-vocabulary.ddialliance.org/xkos#> 
    prefix xsd: <http://www.w3.org/2001/XMLSchema#>

  "

#get a sorted list of all the ISC chart schemes
q0= paste0(sparql_prefix, '
      SELECT DISTINCT ?sch ?lbl
WHERE
{

   GRAPH <http://deeptimekb.org/iscallnew>
   {   
       ?sch  a skos:ConceptScheme ;
             rdfs:label ?lbl . 
        FILTER(regex(str(?lbl), "International", "i"))
   }

}
ORDER BY DESC (?sch)
')

res0 = SPARQL(endpoint, q0)$results

nSch = length(res0$sch)

# for each scheme, count the distinct ratified GSSPs that have a WKT location
for (k in seq_len(nSch))
{
  print(res0$sch[k])

  # res0$sch[k] is returned wrapped in angle brackets, so it is pasted in as an IRI
  q1 = paste0(sparql_prefix, '
       SELECT (COUNT(DISTINCT ?baseSp) AS ?gsspNum)
WHERE
{
   GRAPH <http://deeptimekb.org/iscallnew>
   {   
       ?bdry  a gts:GeochronologicBoundary ;
               dc:description
               [
                 gts:stratotype ?baseSp ;
                 skos:inScheme ',  res0$sch[k], 
      '  
               ] .      

       ?baseSp samfl:shape ?spLocation ;  
               gts:ratifiedGSSP ?tf . 
        FILTER(regex(str(?tf), "true", "i"))

       ?spLocation geo:asWKT ?spCoordinates .
   }
}
')
  res1 = SPARQL(endpoint, q1)$results

  print(res1$gsspNum)
}
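
For readers without the R SPARQL package, the same count query can be sketched in Python using only the standard library. The endpoint URL, graph name, and query pattern are taken from the R example; the function names `build_count_query` and `run_select` are hypothetical, and the sketch assumes the endpoint follows the standard SPARQL protocol and can return JSON results. It also uses the standard `(COUNT(...) AS ?var)` projection form.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://virtuoso.nkn.uidaho.edu:8890/sparql/"  # from the R example
GRAPH = "http://deeptimekb.org/iscallnew"  # from the R example

# Only the prefixes that the count query actually uses.
PREFIXES = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX gts: <http://resource.geosciml.org/ontology/timescale/gts#>
PREFIX samfl: <http://def.seegrid.csiro.au/ontology/om/sam-lite#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
"""


def build_count_query(scheme_iri):
    """Return the GSSP count query for one scheme IRI (given without angle brackets)."""
    return PREFIXES + f"""
SELECT (COUNT(DISTINCT ?baseSp) AS ?gsspNum)
WHERE {{
  GRAPH <{GRAPH}> {{
    ?bdry a gts:GeochronologicBoundary ;
          dc:description [ gts:stratotype ?baseSp ; skos:inScheme <{scheme_iri}> ] .
    ?baseSp samfl:shape ?spLocation ;
            gts:ratifiedGSSP ?tf .
    FILTER(regex(str(?tf), "true", "i"))
    ?spLocation geo:asWKT ?spCoordinates .
  }}
}}
"""


def run_select(query, endpoint=ENDPOINT):
    """Send a SELECT query as an HTTP GET and return the JSON result bindings."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["results"]["bindings"]
```

A call would look like `run_select(build_count_query("http://resource.geosciml.org/vocabulary/timescale/isc2018-08"))`. This sketch only mirrors the R loop; it was not used to produce any numbers reported in this thread, and it does not check whether the endpoint is still reachable.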

The result is the list below. I have not yet fully verified the numbers against the historical ISC charts, but it shows that we can document the GSSP version information in the knowledge graph.

[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2018-08>"
[1] 72
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2018-07>"
[1] 72
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2017-02>"
[1] 68
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2016-10>"
[1] 68
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2016-04>"
[1] 67
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2015-01>"
[1] 66
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2014-10>"
[1] 65
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2014-02>"
[1] 65
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2013-01>"
[1] 65
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2012-08>"
[1] 64
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2010-09>"
[1] 62
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2009-08>"
[1] 61
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2008-08>"
[1] 60
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2006-04>"
[1] 49
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2005-12>"
[1] 49
[1] "<http://resource.geosciml.org/vocabulary/timescale/isc2004-04>"
[1] 46

xgmachina commented 4 years ago

The comprehensive graph with the ISC chart vocabulary schemes 2004-04 to 2018-08 is accessible here: https://github.com/xgmachina/DeepTimeKB/blob/master/RDF_Code/geotimeversion.ttl . The query in the R example above is applied to this graph in an endpoint.
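
Since the counts listed above have not been fully verified, one quick sanity check is to look at how the number changes from one chart version to the next. The Python sketch below copies the version tags and counts from the printed R output (scheme IRIs shortened to their `YYYY-MM` suffix); the `increments` helper is a hypothetical name introduced for illustration.

```python
# (version tag, ratified GSSP count) pairs copied from the R output in this thread.
counts = [
    ("2018-08", 72), ("2018-07", 72), ("2017-02", 68), ("2016-10", 68),
    ("2016-04", 67), ("2015-01", 66), ("2014-10", 65), ("2014-02", 65),
    ("2013-01", 65), ("2012-08", 64), ("2010-09", 62), ("2009-08", 61),
    ("2008-08", 60), ("2006-04", 49), ("2005-12", 49), ("2004-04", 46),
]


def increments(pairs):
    """Return (version, change since the previous version) in chronological order."""
    # Tags are zero-padded YYYY-MM strings, so a plain sort is chronological.
    ordered = sorted(pairs)
    return [
        (later, n_later - n_earlier)
        for (_, n_earlier), (later, n_later) in zip(ordered, ordered[1:])
    ]


changes = increments(counts)
largest = max(changes, key=lambda change: change[1])

for version, delta in changes:
    print(f"isc{version}: {delta:+d}")
```

In this data the counts never decrease, and the largest single step is between isc2006-04 and isc2008-08, which makes that interval an obvious first place to compare against the historical charts.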