ESIPFed / sweet

Official repository for Semantic Web for Earth and Environmental Terminology (SWEET) Ontologies

Determine number of instances in SWEET #209

Open wdduncan opened 4 years ago

wdduncan commented 4 years ago

I think it would be useful to know how much instance-level data each owl:Class in SWEET has. This could help prioritize which classes to focus on and inform what kinds of analysis would be most fruitful.

The SPARQL is straightforward:

prefix owl: <http://www.w3.org/2002/07/owl#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

select ?class ?class_label (count(distinct ?i) as ?instance_count)
where
{
  ?class a owl:Class .
  optional { ?class rdfs:label ?class_label } . # do all SWEET classes have labels?
  ?i a ?class .

  # sometimes the triple store creates blank nodes you don't want,
  # but I don't know if you gave your instances IRIs or simply used blank nodes for them
  filter (!isBlank(?i))
}
group by ?class ?class_label

If you want to break out the counts by named graph, you'll need to modify the SPARQL accordingly. E.g.:

graph ?g {
  ?class a owl:Class . optional { ?class rdfs:label ?class_label } . 
  ?i a ?class .
}
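For the per-graph breakdown, the fragment above still needs `?g` in the SELECT and GROUP BY clauses. The following is a minimal sketch of one way to assemble both variants of the query as a string; `instance_count_query` and `PREFIXES` are hypothetical names introduced here, not anything from the SWEET tooling, and the query assumes the instances are typed directly with the class.

```python
# Prefix declarations the thread's queries use but never spell out.
PREFIXES = (
    "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
)


def instance_count_query(by_graph=False):
    """Build a SPARQL query that counts instances per owl:Class.

    With by_graph=True the pattern is wrapped in GRAPH ?g { ... } and ?g
    joins the projected and grouped variables (hypothetical helper).
    """
    pattern = (
        "  ?class a owl:Class .\n"
        "  OPTIONAL { ?class rdfs:label ?class_label }\n"
        "  ?i a ?class .\n"
        "  FILTER (!isBlank(?i))\n"
    )
    if by_graph:
        pattern = "  GRAPH ?g {\n" + pattern + "  }\n"
    keys = ("?g " if by_graph else "") + "?class ?class_label"
    return (
        PREFIXES
        + "SELECT " + keys + " (COUNT(DISTINCT ?i) AS ?instance_count)\n"
        + "WHERE {\n" + pattern + "}\n"
        + "GROUP BY " + keys + "\n"
        + "ORDER BY DESC(?instance_count)\n"
    )


if __name__ == "__main__":
    print(instance_count_query(by_graph=True))
```

Grouping by `?g` as well as `?class` means a class whose instances are spread over several graphs shows up once per graph, which is the point of the breakdown.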

lewismc commented 4 years ago

@wdduncan can you post a sample result? I would have thought you would be interested in owl:NamedIndividuals instead of owl:Classes here?

wdduncan commented 4 years ago

@lewismc What is the URL of your SPARQL endpoint?

lewismc commented 4 years ago

@wdduncan http://cor.esipfed.org/ont/sparql
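For anyone who wants to run queries against that endpoint from a script rather than a browser, here is a rough sketch using only the Python standard library. It assumes the endpoint accepts a form-encoded POST and can return SPARQL JSON results, as the SPARQL 1.1 Protocol describes; that has not been checked against this server. `build_request`, `rows` and `run` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import urlencode

ENDPOINT = "http://cor.esipfed.org/ont/sparql"  # URL given in this thread


def build_request(query, endpoint=ENDPOINT):
    """Prepare (but do not send) a form-encoded POST carrying the query."""
    data = urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=data,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def rows(results_json):
    """Flatten a SPARQL JSON results document into {variable: value} dicts."""
    doc = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


def run(query, endpoint=ENDPOINT):
    """Send the query; needs network access and a live endpoint."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=30) as response:
        return rows(response.read().decode("utf-8"))
```

Unbound optional variables (such as a missing `?class_label`) are simply absent from a row's dict, so use `row.get("class_label")` when reading them.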

wdduncan commented 4 years ago

Here's an example result. The link is truncated before the WHERE clause; the rest of its query, run against http://cor.esipfed.org/sparql, decodes to:

[...] WHERE {
  ?class a owl:Class.
  optional {
    ?class rdfs:label ?class_label
  } .
  optional {
    ?child rdfs:subClassOf ?class
  } .
  ?i a ?class;
     a owl:NamedIndividual .
  filter(!bound(?child)) # only return the leaf classes
  # filter(!isblank(?i))
}
group by ?class ?class_label
order by desc(?instance_count)
# LIMIT 10

Looks like you have a lot of dams :)

lewismc commented 4 years ago

Here's an example result

Nice query, when we update YASGUI I'll make this query one of the examples.

Looks like you have a lot of dams :)

Yes that is a pretty rich dataset and can be visualized nicely through YASGUI's Geo feature.

As we discussed offline, the owl:NamedIndividuals are scattered throughout the ontology suite. The query above has highlighted the extent... which is great.

I think that there may be enough interest to pursue an investigation of the 98 owl:NamedIndividuals associated with http://sweetontology.net/stateTime/Age. However, we would need to define what it is that we are interested in delivering. Is the goal here to see which SWEET owl:NamedIndividuals complement/harmonize with another resource? Like ENVO?

wdduncan commented 4 years ago

EnvO doesn't have terms for expressing geologic age ... perhaps this could be an area of collaboration that would benefit EnvO.
My main thought was to enrich SWEET by using EnvO semantics to facilitate searching for instances (owl:NamedIndividuals). E.g., find me all samples (instances / owl:NamedIndividuals) that have part some methane and located in some aquatic ecosystem.

Make sense?
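To make the kind of search concrete, here is a toy sketch over a handful of in-memory triples. Every name in it (the `ex:` classes, properties and individuals) is hypothetical and not taken from SWEET or EnvO, and it only matches asserted triples plus subclass closure rather than doing real OWL reasoning.

```python
# Toy triples (subject, predicate, object); every name here is hypothetical.
TRIPLES = {
    ("ex:Lake", "rdfs:subClassOf", "ex:AquaticEcosystem"),
    ("ex:lake1", "rdf:type", "ex:Lake"),
    ("ex:field1", "rdf:type", "ex:Grassland"),
    ("ex:gas1", "rdf:type", "ex:Methane"),
    ("ex:gas2", "rdf:type", "ex:Methane"),
    ("ex:sampleA", "ex:hasPart", "ex:gas1"),
    ("ex:sampleA", "ex:locatedIn", "ex:lake1"),
    ("ex:sampleB", "ex:hasPart", "ex:gas2"),
    ("ex:sampleB", "ex:locatedIn", "ex:field1"),
}


def subclasses(cls):
    """Return cls plus every class below it via rdfs:subClassOf."""
    found, todo = {cls}, [cls]
    while todo:
        parent = todo.pop()
        for s, p, o in TRIPLES:
            if p == "rdfs:subClassOf" and o == parent and s not in found:
                found.add(s)
                todo.append(s)
    return found


def instances_of(cls):
    """Individuals typed with cls or any of its subclasses."""
    classes = subclasses(cls)
    return {s for s, p, o in TRIPLES if p == "rdf:type" and o in classes}


def related_to_some(prop, cls):
    """Subjects with at least one `prop` link to an instance of cls."""
    targets = instances_of(cls)
    return {s for s, p, o in TRIPLES if p == prop and o in targets}


def methane_in_aquatic():
    """Samples that have part some methane and are located in some aquatic ecosystem."""
    return (related_to_some("ex:hasPart", "ex:Methane")
            & related_to_some("ex:locatedIn", "ex:AquaticEcosystem"))


print(sorted(methane_in_aquatic()))  # ['ex:sampleA']
```

Here `ex:sampleB` is excluded because its location is a grassland; the subclass step is what lets a sample located in a lake answer a query about aquatic ecosystems.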

rrovetto commented 4 years ago

EnvO doesn't have terms for expressing geologic age ... perhaps this could be an area of collaboration that would benefit EnvO. My main thought was to enrich SWEET by using EnvO semantics to facilitate searching for instances (owl:NamedIndividuals). E.g., find me all samples (instances / owl:NamedIndividuals) that have part some methane and located in some aquatic ecosystem.

Make sense?

We would not want to commit to or impose the semantics, and associated ontological commitments or semantic resources, of another resource (EnvO or otherwise) on SWEET.

wdduncan commented 4 years ago

As far as I can tell SWEET doesn't have strong semantics. In any case, you can classify individuals multiple ways. The goal is to enrich semantic search. You can do this with SKOS matches if you dislike the OWL.

lewismc commented 4 years ago

@rrovetto

We would not want to commit to or impose the semantics, and associated ontological commitments or semantic resources, of another resource (EnvO or otherwise) on SWEET.

Why not? If axioms are lacking in SWEET then why wouldn't we wish to use an external source to enrich it? The same goes for ENVO. This is exactly what the semantic harmonization community has been working on all this time.

As far as I can tell SWEET doesn't have strong semantics.

Correct.

The goal is to enrich semantic search.

Correct.

wdduncan commented 4 years ago

On the EnvO Slack channel (anyone a part of this?) the suggestion was made to map (e.g., skos:closeMatch) geologic ages to the Wikidata IRIs. I'm not familiar enough with Wikidata to know if this is viable, but thought I'd pass along the suggestion.
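As a rough illustration of what such a mapping could look like, the sketch below writes one skos:closeMatch statement per pair as N-Triples. The two IRIs in `MAPPINGS` are placeholders on example.org, not real SWEET or Wikidata identifiers, and `close_match_triple` is a hypothetical helper; only the skos:closeMatch property IRI is the real one.

```python
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"

# Placeholder pairs: (geologic age IRI, Wikidata-style IRI). Both are made up.
MAPPINGS = [
    ("http://example.org/sweet/SomeAge", "http://example.org/wikidata/SomeItem"),
]


def close_match_triple(source_iri, target_iri):
    """Format one skos:closeMatch statement as an N-Triples line."""
    for iri in (source_iri, target_iri):
        # Reject characters that would break the <...> IRI syntax.
        if any(ch in iri for ch in '<> "'):
            raise ValueError("not a usable IRI: " + repr(iri))
    return "<%s> <%s> <%s> ." % (source_iri, SKOS_CLOSE_MATCH, target_iri)


if __name__ == "__main__":
    for source, target in MAPPINGS:
        print(close_match_triple(source, target))
```

Keeping the mappings in a separate file of triples like this means they can be loaded next to SWEET without editing the ontology files themselves.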

dr-shorthair commented 4 years ago

EnvO doesn't have terms for expressing geologic age

On this topic, please be aware that we have been maintaining RDF datasets for the Geologic Timescale, covering the various versions issued by the International Commission on Stratigraphy, here: https://github.com/CGI-IUGS/timescale-data

This representation uses an ontology documented here: https://github.com/CGI-IUGS/timescale-ont