ESIPFed / sweet

Official repository for Semantic Web for Earth and Environmental Terminology (SWEET) Ontologies

added rdfs:comment(s) from v2.3 #246

Closed brandonnodnarb closed 3 years ago

brandonnodnarb commented 3 years ago

added rdfs:comment "Horticulture is the art and science of the cultivation of plants"@en ;

this is a one off to get the process right.

dr-shorthair commented 3 years ago

"You can lead a horticulture, but you can't make her think." - Dorothy Parker https://www.phrases.org.uk/meanings/418100.html

lewismc commented 3 years ago

@brandonnodnarb this is the result of executing your SPARQL query, correct? Can you link it here please?

lewismc commented 3 years ago

We need to make a decision on whether the definitions are to be structured as blank nodes (as is done with all other current definitions) or as a specified annotation property (which seems more like what you have done in this PR). You also mentioned creating a mapping file @brandonnodnarb. However, unless this is a one-time thing, I feel the mapping file approach may involve sustained overhead. Any thoughts @brandonnodnarb ? Thanks

brandonnodnarb commented 3 years ago

@lewismc yes and no. :)

I did use a SPARQL query, but it was a breathtakingly simple one that prints all SWEET URIs with their associated rdfs:comment values to a file as statements. I ran the query on a local copy of v2.3 as I wasn't sure how to query across versions via the web service(s), and knew I could do it this way fairly quickly. Once the statements were isolated in a separate file, I did a post-processing step to swap all the namespaces to current, add the relevant ttl syntax, fix spacing, etc.
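For readers who want to reproduce the post-processing step, here is a minimal sketch of what it could look like. It is not the script used for this PR. The old v2.3 base IRI in NAMESPACE_MAP, the function names, and the assumption that the query output is a (URI, comment) pair are all illustrative guesses.

```python
# Hypothetical mapping from an old v2.3 base IRI to a current prefix label.
# The IRI below is an assumption for illustration, not taken from this thread.
NAMESPACE_MAP = {
    "http://sweet.jpl.nasa.gov/2.3/humanAgriculture.owl#": "sohua:",
}


def escape_literal(text):
    """Escape backslashes, double quotes, and newlines for a Turtle string."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def to_turtle(uri, comment, lang="en"):
    """Turn one (URI, comment) pair into a single Turtle statement."""
    for old_base, prefix in NAMESPACE_MAP.items():
        if uri.startswith(old_base):
            local_name = uri[len(old_base):]
            literal = escape_literal(comment.strip())
            return f'{prefix}{local_name} rdfs:comment "{literal}"@{lang} .'
    raise ValueError(f"no namespace mapping for {uri}")


statement = to_turtle(
    "http://sweet.jpl.nasa.gov/2.3/humanAgriculture.owl#Horticulture",
    "Horticulture is the art and science of the cultivation of plants",
)
print(statement)
```

Raising on an unmapped URI, rather than passing it through, is a deliberate choice in this sketch so that a missing namespace shows up immediately instead of producing a statement against a stale IRI.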

A file now exists with turtle statements for each URI with the associated rdfs:comment value. The first 11 lines of the file are:

sohua:Horticulture rdfs:comment "Horticulture is the art and science of the cultivation of plants"@en .

sohud:LocationAllocation rdfs:comment "Spatial allocation is primarily concerned with designating what kinds of activities can or will be done where on the landscape. Land-use zoning is a typical example of a spatial allocation problem in which the landscape is divided up into a set of multiple alternative uses such as industrial, commercial, residential, etc. Allocation to a particular use usually depends on intrinsic properties of the individual parcels as well as adjacency constraints."@en .

sohud:ResourceAllocation rdfs:comment "Resource allocation has two meanings. One meaning refers to allocating a resource such as forest land to two or more designated uses. For example, forest land units could be allocated to timber production, recreation, etc. The second meaning is in the sense of allocating management resources. This second meaning is concerned with allocating time, materials, personnel, budget to landscape elements to accomplish management objectives such as protection, restoration, timber production, etc."@en .

sohueccont:PrimaryTreatment rdfs:comment "In wastewater treatment, a combination of step processes, usually physical in nature, that are designed to remove floating and settleable solids. Examples of process steps are screening and sedimentation."@en .

sohueccont:SecondaryTreatment rdfs:comment "In wastewater treatment, a combination of step processes, usually biochemical in nature, that are designed to remove primarily organic material. Examples of process steps are aeration and trickling filters."@en .

sohueccont:TertiaryTreatment rdfs:comment "Post-secondary treatment of wastewater designed to improve the quality of the water to the point where it can be put to a particular beneficial use. Generally, tertiary treatment steps remove nutrients (e.g., nitrogen and phosphorus) which are poorly removed by secondary treatment. Commonly used steps include coagulation and clarification."@en .

I posted the first one in this PR to start the discussion on how and where these should go.

As we just discussed, three options are relevant. The first involves a blank node, as per the pattern for multiple definitions. An alternative would be creating a custom annotation property to ensure the v2.3 comments are marked explicitly. Something like:

sweet:v23comment rdf:type owl:AnnotationProperty .

with actual values being:

sohua:Horticulture rdf:type owl:Class ;
                  rdfs:subClassOf sohua:Agriculture ;
                  sweet:v23comment "Horticulture is the art and science of the cultivation of plants"@en ;
                  rdfs:label "horticulture"@en .

Another option is simply creating a separate file which one would need to load explicitly to see the 'old' comments. I would think this would be put in with the mapping files and, as v2.3 is no longer relevant, could just sit there without needing a ton of editing :)

Again, the idea behind bringing these up is mostly to use them to help automate generating a Semantic Indicator Value, or SIV if you like :), in relation to other vocabulary efforts like GCMD, Wikidata, USGS Thesaurus, etc., with the idea that these old rdfs:comment values would be replaced with updated and properly cited values.

I'm inclined to create the mapping file as that seems to make the most sense, and it shouldn't need much upkeep once we have it. It also really only entails making a valid header and import statement(s) and it's good to go.

Thoughts? Preferences?

lewismc commented 3 years ago

This is brilliant @brandonnodnarb thank you for the detail. +1 for mapping file. What kind of work is involved in mapping from your .txt file (or CSV or ...) to the mapping file?

brandonnodnarb commented 3 years ago

@lewismc Each line should be a valid ttl statement. If I add a proper header, including namespaces and the import of sweetAll.ttl, we should be able to load that single file and import all of SWEET with these comments included.
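As a rough illustration of the header idea, the sketch below generates prefix declarations plus an owl:imports statement. The owl, rdf, and rdfs IRIs are the standard W3C ones. The sohua IRI, the sweetAll IRI, and the ontology IRI under example.org are assumptions made for the example, and build_header is a hypothetical helper.

```python
def build_header(ontology_iri, import_iri, prefixes):
    """Return a Turtle header: prefix declarations plus one owl:imports."""
    lines = [f"@prefix {label}: <{iri}> ." for label, iri in prefixes.items()]
    lines.append("")
    lines.append(f"<{ontology_iri}> rdf:type owl:Ontology ;")
    lines.append(f"    owl:imports <{import_iri}> .")
    return "\n".join(lines) + "\n"


PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    # Assumed SWEET namespace; check it against the current ttl files.
    "sohua": "http://sweetontology.net/humanAgriculture/",
}

header = build_header(
    ontology_iri="http://example.org/v23comments",  # hypothetical IRI
    import_iri="http://sweetontology.net/sweetAll",  # assumed sweetAll IRI
    prefixes=PREFIXES,
)
print(header)
```

The generated statements would then be appended below this header, so that loading the one file pulls in SWEET through the import and adds the comments on top.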

I will investigate and post back.

brandonnodnarb commented 3 years ago

Updating this PR. I removed the horticulture example and added three files:

  1. sweet_v23Comments.ttl contains all the rdfs:comment pulled from version 2.3
  2. sweetAll_includeV23Comments.ttl is a separate sweetAll file with an additional line to load sweet_v23Comments.ttl
  3. edited catalog-v001.xml with additional line to load sweet_v23Comments.ttl

As it's currently constructed, the rdfs:comment values from v2.3 will not load if using sweetAll.ttl or any other ttl file. However, if one loads sweetAll_includeV23Comments.ttl instead of sweetAll.ttl, then the comments from v2.3, which are in sweet_v23Comments.ttl, will be processed and loaded with the other files.

For convenience I also edited the catalog-v001.xml to point to the local sweet_v23Comments.ttl file as with the others.
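To show what the catalog entry does, here is a small sketch of resolving an import IRI to a local file through an OASIS XML catalog, which is the format catalog-v001.xml uses. The sample entry is illustrative only: the IRI chosen for sweet_v23Comments is an assumption, and resolve_local is a hypothetical helper, not part of any tool.

```python
import xml.etree.ElementTree as ET

# Namespace of OASIS XML catalogs, in ElementTree's {uri}tag notation.
CATALOG_NS = "{urn:oasis:names:tc:entity:xmlns:xml:catalog}"

# Illustrative catalog content; the name IRI below is an assumption.
SAMPLE_CATALOG = """
<catalog prefer="public" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <uri name="http://sweetontology.net/sweet_v23Comments"
         uri="sweet_v23Comments.ttl"/>
</catalog>
"""


def resolve_local(catalog_xml, iri):
    """Return the local path mapped to an IRI, or None if it is not listed."""
    root = ET.fromstring(catalog_xml.strip())
    for entry in root.iter(CATALOG_NS + "uri"):
        if entry.get("name") == iri:
            return entry.get("uri")
    return None


local_path = resolve_local(
    SAMPLE_CATALOG, "http://sweetontology.net/sweet_v23Comments"
)
print(local_path)
```

An IRI with no catalog entry returns None here, which corresponds to a tool falling back to fetching the import over the web.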

I originally thought these could go in the mappings directory, but this also makes sense to me.

@lewismc are you able to sanity check? Thoughts?

lewismc commented 3 years ago

@ESIPFed/semtech can you guys please review?