ESIPFed / sweet

Official repository for Semantic Web for Earth and Environmental Terminology (SWEET) Ontologies

Use wikidata to provide skos:definition for each owl:Class #200

Closed lewismc closed 4 years ago

lewismc commented 4 years ago

Hi folks, this is an updated attempt which extracts Wikidata schema:description values and maps them to rdfs:comment annotations, using OWLAPI instead of Jena to write the data.

I am looking for feedback here. I know the way that I've structured the annotations is not the way we want to do it, but evaluating the results gives us an idea of how useful the code I wrote is.

The initial results confirm that there are now 2077 occurrences of rdfs:comment ... this is not bad (assuming that they make sense)

cat * | grep -c "rdfs:comment"
cat: output: Is a directory
2077

ISSUES

  1. OWLAPI is screwing up the base prefix IRI for each file. I'm going to look into how I can prevent that.
  2. OWLAPI seems to hate using prefixes... this means that the prefix work we did a while back is not really being used in SWEET anymore. I don't really like this and will try to address this as well.
  3. The owl:versionIRI <Optional[http://sweetontology.net/human]/3.6.0> ; is an issue. This should be owl:versionIRI <http://sweetontology.net/human/3.6.0> ;. I'll work on fixing that.
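
The Optional[...] wrapper in issue 3 is the string form of a java.util.Optional, which suggests that an Optional-valued ontology IRI is being concatenated into the version IRI without being unwrapped. That is a guess from the output, not something confirmed in this thread, and the proper fix belongs in the Java code that builds the version IRI. As a stopgap, the serialized files could be repaired after the fact. A minimal Python sketch, where the function name unwrap_optional_iris is made up for illustration:

```python
import re

# Matches an IRI serialized with a stray Optional[...] wrapper, e.g.
# <Optional[http://sweetontology.net/human]/3.6.0>
_WRAPPED_IRI = re.compile(r"<Optional\[([^\]\s>]+)\]([^>\s]*)>")


def unwrap_optional_iris(turtle_text: str) -> str:
    """Rewrite <Optional[base]suffix> as <basesuffix>; leave all other text untouched."""
    return _WRAPPED_IRI.sub(r"<\1\2>", turtle_text)


broken = "owl:versionIRI <Optional[http://sweetontology.net/human]/3.6.0> ;"
print(unwrap_optional_iris(broken))
# prints: owl:versionIRI <http://sweetontology.net/human/3.6.0> ;
```

This only patches the symptom in the output files; it does nothing about issues 1 and 2.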

QUESTION How do we review the new rdfs:comment annotations to ensure that they make logical sense? Some options:

  1. Split each one into a separate PR... meaning 2077 PRs!!!
  2. Split each file into a separate PR... meaning more than 225 PRs!!!
  3. Have folks sit for hours going through this huge PR and flagging any issues they see. Maybe we could split this work up...

What do you guys think?

Thanks

dr-shorthair commented 4 years ago

I agree that - in the short term at least - Wikidata probably provides a good, succinct, source of definitions. But reviewing every text is unrealistic on any useful timescale, and might trigger discussions that may belong better over in ENVO.

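So I'd suggest looking at a way to adopt the Wikidata definitions transparently, i.e. either

  1. copy the descriptions over, but into an annotation structure that allows the provenance to be recorded, or
  2. just add a suitable SKOS mapping link (skos:exactMatch, skos:closeMatch, etc.) to the Wikidata entry.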

Then other definitions could be added alongside, which would reflect the rough/contested semantics that we are aiming for in SWEET?

I recognise that merely linking does not achieve the goal of getting a local text definition included, but maybe @carueda could do some magic in COR to fetch schema:description values from the link for display purposes?
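
For reference, fetching a description through a link could look roughly like the sketch below. It uses the wbgetentities action of the Wikidata API and is not a description of how COR works. The function names and the User-Agent string are made up for illustration, and the sample payload is a hand-written stand-in for a real response.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://www.wikidata.org/w/api.php"


def description_url(qid: str, lang: str = "en") -> str:
    """Build a wbgetentities request URL that asks only for descriptions."""
    params = {
        "action": "wbgetentities",
        "ids": qid,
        "props": "descriptions",
        "languages": lang,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def extract_description(payload: dict, qid: str, lang: str = "en") -> Optional[str]:
    """Pull the description text out of a wbgetentities response, or None if absent."""
    entity = payload.get("entities", {}).get(qid, {})
    entry = entity.get("descriptions", {}).get(lang)
    return entry["value"] if entry else None


def fetch_description(qid: str, lang: str = "en") -> Optional[str]:
    """Network call; the User-Agent value is a placeholder, not a real tool name."""
    request = Request(description_url(qid, lang),
                      headers={"User-Agent": "sweet-definitions-sketch/0.1"})
    with urlopen(request, timeout=10) as response:
        return extract_description(json.load(response), qid, lang)


# Hand-written sample in the shape of a wbgetentities response (not real data).
sample = {"entities": {"Q42": {"descriptions": {"en": {"language": "en",
                                                       "value": "sample description"}}}}}
print(extract_description(sample, "Q42"))
```

Keeping the parsing separate from the network call means the extraction logic can be checked offline against saved responses.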

lewismc commented 4 years ago

Thanks @dr-shorthair

I agree that - in the short term at least - Wikidata probably provides a good, succinct, source of definitions.

+1

But reviewing every text is unrealistic on any useful timescale, and might trigger discussions that may belong better over in ENVO.

+1

  1. copy the descriptions over but into an annotation structure that allows the provenance to be recorded
  2. just add a suitable SKOS mapping link skos:exactMatch or skos:closeMatch etc to the Wikidata entry

We have examples of the following

###  http://sweetontology.net/realmCryo/AlpineTundra
soreac:AlpineTundra rdf:type owl:Class ;
                  rdfs:subClassOf soreac:Tundra ;
                  rdfs:label "alpine tundra"@en ;
                  skos:closeMatch <http://purl.obolibrary.org/obo/ENVO_01001371> ;
                  skos:definition  [
                        rdfs:comment  "A tundra ecosystem which exists at high altitudes and where vegetation is stunted due to low temperatures and high winds."@en ;
                        dcterms:source <https://orcid.org/0000-0003-4808-4736> ;
                        dcterms:created "2019-12-10T06:11:13-08:00"^^xsd:dateTimeStamp ;
                        dcterms:creator <https://orcid.org/0000-0003-4091-6059> ;
                        prov:wasDerivedFrom <http://purl.obolibrary.org/obo/ENVO_01001371> ;
                      ] .
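
To get a feel for what generating this structure from a Wikidata description might involve, here is a rough Python sketch that renders the blank-node annotation as Turtle text. It is not the OWLAPI implementation. The function names are invented, the IRIs in the usage lines are placeholders, and dcterms:source is left out for brevity.

```python
from datetime import datetime, timezone
from typing import Optional


def turtle_literal(text: str) -> str:
    """Escape a string for use inside a double-quoted Turtle literal."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n")
                   .replace("\r", "\\r"))
    return f'"{escaped}"'


def definition_block(description: str, source_iri: str, creator_iri: str,
                     created: Optional[datetime] = None, lang: str = "en") -> str:
    """Render a skos:definition blank node that carries provenance for the text."""
    created = created or datetime.now(timezone.utc)
    if created.tzinfo is None:
        # xsd:dateTimeStamp requires an explicit time zone.
        raise ValueError("created must be timezone-aware")
    stamp = created.isoformat(timespec="seconds")
    return "\n".join([
        "skos:definition [",
        f"    rdfs:comment {turtle_literal(description)}@{lang} ;",
        f'    dcterms:created "{stamp}"^^xsd:dateTimeStamp ;',
        f"    dcterms:creator <{creator_iri}> ;",
        f"    prov:wasDerivedFrom <{source_iri}> ;",
        "] .",
    ])


# Placeholder IRIs for illustration only.
print(definition_block("sample description",
                       "http://www.wikidata.org/entity/Q0",
                       "https://orcid.org/0000-0000-0000-0000"))
```

Using a timezone-aware datetime keeps the dcterms:created value a valid xsd:dateTimeStamp, with a single offset such as +00:00.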

wdyt?

dr-shorthair commented 4 years ago

Yes - that is pretty much the direction I was looking for.

lewismc commented 4 years ago

Excellent. I'll go ahead and implement.