B5. change domain of gcis:hasSpatialResolution and gcis:hasTemporalResolution properties

zednis commented 9 years ago

Change Proposal	B5
Point of Contact	@justgo129
Date	2015-05-19

Proposal

Broaden or remove domain of gcis:hasSpatialResolution and gcis:hasTemporalResolution properties.

Background

gcis:hasSpatialResolution and gcis:hasTemporalResolution each have domain gcis:ModelRun, but we also use SpatialResolution and TemporalResolution for datasets.

gcis:hasSpatialResolution a owl:ObjectProperty ;
    rdfs:label "Has Spatial Resolution" ;
    rdfs:comment "An entity of model run may have a spatial resolution." ;
    rdfs:domain gcis:ModelRun ;
    rdfs:range gcis:SpatialResolution .

gcis:hasTemporalResolution a owl:ObjectProperty ;
    rdfs:label "Has Temporal Resolution" ;
    rdfs:comment "An entity of model run may have a temporal resoltuion." ;
    rdfs:domain gcis:ModelRun ;
    rdfs:range gcis:TemporalResolution .

With the current domain statements, a reasoner would infer that any individual stated to have a spatial or temporal resolution has type gcis:ModelRun. This inference is not desirable for datasets stated to have spatial or temporal resolution.
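
To make the unwanted entailment concrete, here is a minimal Turtle sketch. The ex: namespace, the individuals ex:dataset1 and ex:res1km, and the gcis: namespace IRI are assumptions for illustration and are not taken from GCIS instance data.

```turtle
@prefix gcis: <http://data.globalchange.gov/gcis.owl#> .  # assumed namespace IRI
@prefix ex:   <http://example.org/> .                      # hypothetical namespace

# Asserted: a dataset individual with a spatial resolution.
ex:dataset1 a gcis:Dataset ;
    gcis:hasSpatialResolution ex:res1km .

# Entailed under RDFS rule rdfs2, because the property's rdfs:domain
# is gcis:ModelRun:
#   ex:dataset1 a gcis:ModelRun .
#
# Entailed under RDFS rule rdfs3, because the property's rdfs:range
# is gcis:SpatialResolution:
#   ex:res1km a gcis:SpatialResolution .
```

Note that rdfs:domain does not reject the asserted triple the way a validation constraint would; it adds the extra type to the subject.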

Does this change affect the semantics of the ontology?

YES

Remove the undesirable inference that individuals with spatial or temporal resolution have type gcis:ModelRun

Describe in detail the change being requested

We have 4 options:

1. Remove the domain declaration
2. Broaden the domain to a class in the gcis ontology [1]
3. Change the domain to a class from an external ontology (e.g. SWEET, ISO, PROV-ES)
4. Update the domain to the unionOf gcis:ModelRun and gcis:Dataset [2]

[1] - gcis:ModelRun does not have a super-class from the gcis ontology
[2] - Are there any individuals not a model run or dataset that would also have spatial or temporal resolution?

Describe the reason for the change

Remove undesirable type inference on individuals stated to have a temporal or spatial resolution.

What are the anticipated impacts

no anticipated impacts on non-reasoned instance data
individuals with spatial and temporal resolution will no longer be inferred to have type gcis:ModelRun
individuals with spatial and temporal resolution will have a new type inference unless we decide to drop the domain statement altogether.

What are the risks of not performing the change?

inferences from current ontology incorrectly assign type gcis:ModelRun to dataset individuals
queries over reasoned instance data for model runs will incorrectly return non-model datasets
faceted searches over reasoned instance data will incorrectly show non-model datasets as model runs

What is the expected technical effort to support the change?

Expected effort to support the change is low.

update domain of 2 properties in ontology
update documentation of 2 properties
verify that the process producing instance data RDF from the relational database is not assigning an incorrect type

Describe alternatives considered

4 options are currently suggested.

Related change proposals

N/A

Notes

N/A

justgo129 commented 9 years ago

I like Option 3 if that's feasible.

xgmachina commented 9 years ago

Can we use owl:unionOf to add gcis:Dataset as a part of the domain of those two properties? For example:

gcis:hasSpatialResolution a owl:ObjectProperty ;
    rdfs:label "Has Spatial Resolution" ;
    rdfs:comment "An entity of model run may have a spatial resolution." ;
    rdfs:domain [ a owl:Class ; owl:unionOf ( gcis:ModelRun gcis:Dataset ) ] ;
    rdfs:range gcis:SpatialResolution .

zednis commented 9 years ago

@xgmachina, that is also a good suggestion. I will include it as an option in the proposal.

zednis commented 9 years ago

Regarding the option to use unionOf (option 4) - are there any instances that would have temporal or spatial resolution and are neither a dataset nor a model run?

justgo129 commented 9 years ago

Images and figures have lat max, lat min, long max, long min, but that's the closest. @aulenbac , please confirm. Thanks.

justgo129 commented 9 years ago

I thought extensively about the comment by @xgmachina and decided I'd still prefer to go with Option 3. This is because the definitions for the terms to which the union is "applied" may change during the course of making other changes to the ontology. It would just be one less thing to consider and potentially revisit.

zednis commented 9 years ago

We may want to consider using dcat:spatial and dcat:temporal as part of a general adoption of DCAT terms, but I believe spatial resolution and spatial coverage are two different (but related) concepts.

edit - never mind. See comment below.

justgo129 commented 9 years ago

Right, they are. I just realized that I meant to include the extent discussion somewhere else (my bad). I'm for using dcat:spatial and dcat:temporal if the formatting of their respective entries would work for us. Project Open Data uses them for datasets. Would we need to broaden the use of dcat:spatial and dcat:temporal for other items down the road (i.e., not datasets), though?

zednis commented 9 years ago

No broadening of dcat:spatial and dcat:temporal - those properties are defined for being used specifically in the context of datasets. If we wanted to use something broader we would use dct:spatial and dct:temporal.

edit - after looking at DCAT I see that there are no such properties as dcat:spatial and dcat:temporal. DCAT uses dcterms:spatial and dcterms:temporal directly.
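
A minimal sketch of what that usage looks like in Turtle. The ex: namespace and the individuals ex:dataset1, ex:southwest, and ex:period1 are hypothetical; only the dct: and dcat: terms come from the published vocabularies.

```turtle
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# A dataset described with the Dublin Core coverage properties
# that DCAT reuses.
ex:dataset1 a dcat:Dataset ;
    dct:spatial  ex:southwest ;   # spatial coverage
    dct:temporal ex:period1 .     # temporal coverage

ex:southwest a dct:Location .
ex:period1   a dct:PeriodOfTime .
```

These properties describe coverage (extent), not resolution, so they would complement gcis:hasSpatialResolution and gcis:hasTemporalResolution and would not replace them.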

justgo129 commented 9 years ago

Would dct:spatial and dct:temporal accommodate the necessary ranges? For instance, in entries for datasets in GCIS, the field "spatial extent" is defined as a "brief extension of the spatial extent, which corresponds to lat_min/lat_max, lon_min/lon_max ." For datasets, there are also fields for "lat_min," "lat_max," "lon_min," and "lon_max" for which entries exist in the db. Often when entries for these fields exist in the db, no entry is provided for "spatial extent." The same applies to "temporal_extent," "start_time," and "end_time."

bduggan commented 9 years ago

Datasets and model runs both have temporal and spatial resolutions, but the concepts are slightly different; the resolutions of model runs are parameters in addition to being attributes. There is currently no instance data for model runs in GCIS but there are some narrative descriptions embedded here and the README files here, and this yaml file.

aulenbac commented 9 years ago

"Regarding the option to use unionOf (option 4) - are there any instances that would have temporal or spatial resolution and are neither a dataset nor a model run?"

Reports, report chapters, report sections, report findings, images, figures, datasets, products derived from datasets such as GCIS arrays and tables, models, model runs, model scenarios, model experiments, model ensemble members, model outputs, indicators, field experiments, depending on the platform type potentially platforms and platform/instrument instances, and products derived from them, all have spatial extents, spatial resolutions, temporal extents, and temporal resolutions. These are important and need to be properly incorporated in this version of the ontology.

There are other important considerations as well. For example, there has been quite a bit of discussion over the last two years about representing GCIS entries for images and figures (and ...) in ISO 19115:2003 (and associated standards such as 19115-2, 19139, etc.) or, more likely, ISO 19115:2014.

zednis commented 9 years ago

I don't think it makes sense to say that a report, report chapter, report section, report finding, field experiment, platform, or instrument directly have spatial extents or spatial resolution.

A chapter or report finding may have a region as a topic, but the chapter itself does not have a spatial extent or resolution.

zednis commented 9 years ago

I think spatially-mapped data has temporal resolution. That should cover both sensor data and model output.

If the domain of this property was just Dataset and ModelRun was a subclass of Dataset then I think we would be set.

Which leads me to my next question - why is ModelRun not a subclass of Dataset?
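
A sketch of that arrangement, assuming the gcis: namespace IRI shown and assuming gcis:ModelRun really is meant to denote run output. Neither assumption is confirmed in this thread.

```turtle
@prefix gcis: <http://data.globalchange.gov/gcis.owl#> .  # assumed namespace IRI
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical: treat a model run (output) as a kind of dataset.
gcis:ModelRun rdfs:subClassOf gcis:Dataset .

# The domain then only needs to name the broader class.
gcis:hasSpatialResolution a owl:ObjectProperty ;
    rdfs:domain gcis:Dataset ;
    rdfs:range  gcis:SpatialResolution .
```

With this arrangement, an individual with a spatial resolution is inferred to be a gcis:Dataset, which holds for model runs as well because of the subclass axiom.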

zednis commented 9 years ago

The one potential snag I can see is if for some reason we want to say that a satellite sensor has spatial resolution. A satellite sensor definitely has optical/sensor resolution, but I am not sure it has spatial resolution.

A satellite can have a high-resolution optical sensor but for something to have spatial resolution I think it needs to be spatially mapped - and that only happens once the sensor data has been processed into a dataset.

bduggan commented 9 years ago

On Wednesday, May 27, Stephan Zednik wrote:

Which leads me to my next question - why is ModelRun not a subclass of Dataset?

I'd be okay with this, though I think we are conflating a run and an output of a run. I think these terms are often used interchangeably, though, so our definition should reflect that.

Brian

zednis commented 9 years ago

Currently a ModelRun is a subclass of prov:Entity, so I would say it is currently modeled as the output of a run and not as the run itself (which I would think would be modeled as a subclass of prov:Activity).

It definitely seems like there is currently some conflating of the run and the run output.
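
One way the two notions could be separated with PROV-O, as a sketch only. The ex: class and individual names are hypothetical and are not proposed GCIS terms; the prov: terms are from the W3C PROV-O vocabulary.

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# The act of running the model.
ex:ModelRunActivity rdfs:subClassOf prov:Activity .

# The data produced by a run.
ex:ModelOutput rdfs:subClassOf prov:Entity .

# An individual output linked to the run that produced it.
ex:output1 a ex:ModelOutput ;
    prov:wasGeneratedBy ex:run1 .

ex:run1 a ex:ModelRunActivity .
```

Under this split, resolution properties would attach to the output, while run parameters would attach to the activity.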

aulenbac commented 9 years ago

"I don't think it makes sense to say that a report, report chapter, report section, report finding, field experiment, platform, or instrument directly have spatial extents or spatial resolution.

A chapter or report finding may have a region as a topic, but the chapter itself does not have a spatial extent or resolution."

I appreciate your confusion but it does make sense. Reports first. These are scientific reports. Throughout the product life cycle, from initial conception through the analyses to the findings and all the steps and stages between, they are relevant for a specific geographic area for a specific period of interest; spatial extent and temporal extent. If a science team is producing a report on the climate change impacts on livestock production in the Southwest for the next 30 years, the Southwest has to be defined and all the inputs, outputs, analyses, and findings fall within and are applicable to that spatially defined "Southwest". Using a simple region name is insufficient. Different groups have different concepts of what is the Southwest. To some it is the Four Corners region. To others, it is only the states of New Mexico and Arizona. Still others include Colorado and Utah and sometimes California. USGCRP addressed the need for specificity and consistency by providing spatially explicit shapefile collections for each of the regions discussed in the NCA3. The NCA author teams used them. I personally received requests from climate modelers for these spatially explicit boundaries. As for chapters not having a spatial extent or resolution take a look at NCA3 Chapter 20 : Southwest (https://data.globalchange.gov/report/nca3/chapter/southwest). The first page of the chapter clearly shows the spatial extents for the NCA3's definition of the Southwest.

Field experiment? Absolutely has spatial extents. See the Cold Land Processes Field Experiment (CLPX) at https://data.globalchange.gov/dataset/nasa-nsidcdaac-0154 for an example. Note CLPX has specific, lat/lon boundaries defining its spatial extent, not a colloquial, regional name.

Platform? As I said above, it depends on the platform. Some scientists consider an ecological observatory a platform. Each of the National Ecological Observatory Network's (NEON's) core sites has a spatial extent by definition (http://www.neoninc.org/science-design/spatiotemporal-design). A tall tower, another type of platform, is scientifically defined to represent a specific, generally non-varying, spatial extent. A voluntary observing ship, not so much. A ship is mobile and here spatial extent is tied to the area covered by the ship tracks sampled during a given cruise. An airborne platform is similar: information derived from data derived from measurements collected by instruments mounted on the aircraft has, at the minimum, spatial extents defined by the union of the flight tracks comprising a given flight.

If we are discussing this, then we are not doing very well, are we? Is there a better way to express these concepts in the ontology?

justgo129 commented 9 years ago

Are we referring to a chapter as a portion of a written document or to the contents of that document? i.e. the "this is not a pen" example. I'd think one would have spatial extents, the other wouldn't. The GCIS Ontology suggests the former rather than the latter definition.

Steve, I agree with you. I think we should table this discussion until we have our tag up on Tuesday.

justgo129 commented 9 years ago

"Contents of that portion of the document," I mean.

aulenbac commented 9 years ago

We should continue the discussion here. I'm flying Tuesday.

aulenbac commented 9 years ago

Unfortunately, resolution is one of those overloaded terms. To me, sensor resolution is the smallest measurement the sensor can reliably capture. Sensor resolution is usually defined in engineering units. It is tied to the sensor hardware itself. When I think of spatial resolution I think of either the smallest unit data is provided in (e.g., meters, a 0.5 x 0.5 degree box, etc.) or, iff dealing with some type of scientific imagery, the distance between pixels. Spatial resolution is tied to the data or information derived from the platform-instrument-sensor-measurement chain.

bduggan commented 9 years ago

I find @aulenbac's examples convincing: just as the spatial/temporal characteristics for a dataset are not about the representation (i.e. the bits on a disk), various publications may have spatial/temporal attributes which are unique to the publication. On the relational side, there's currently some redundancy with lat/lon etc fields in multiple tables which should probably be refactored. The schema supports mapping arbitrary publications to regions, and the long term plan has always been to connect these regions to postGIS, after which the individual lat/lon fields could be removed. (Note, of course, that the semantic representation can have intermediary concepts if it makes more sense)

justgo129 commented 9 years ago

While I agree with @aulenbac and @bduggan, I am wondering if those in the Semantic Web community not involved with the production of GCIS would also broach some of the questions initially asked by @aulenbac. I am thinking that having something like "gcis:contentofWhichHasSpatialExtent" in lieu of "hasSpatialExtent" (clearly something shorter though) could alleviate some of the confusion while addressing the comment of @zednis that

"I don't think it makes sense to say that a report, report chapter, report section, report finding, field experiment, platform, or instrument directly have spatial extents or spatial resolution."

The addition of such a term would satisfy me for the purposes of spatial Extent, not so much for spatial resolution.

zednis commented 9 years ago

I think this discussion has migrated away from what has temporal/spatial resolution to what has temporal/spatial coverage.

As for spatial coverage - I think geosparql has influenced my thinking and it would probably be a good idea for the group to take a look at geosparql to:

1. use as an option for describing the geometry of spatial coverages (bounding boxes, points, polygons, etc) as opposed to our own properties
2. see if geosparql has any properties for spatial resolution and if so where those properties are applicable
3. understand the reasoning behind why/how geosparql separates geometry from features
3a. discuss whether reports/chapters/findings/etc. are valid features (Feature - a thing with a geometry)
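
For reference, the feature/geometry separation looks roughly like this in Turtle. The ex: individuals are hypothetical and the bounding-box coordinates are made up for illustration; they are not the NCA3 Southwest boundary.

```turtle
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

# The feature: the thing that has a location.
ex:southwestRegion a geo:Feature ;
    geo:hasGeometry ex:southwestBox .

# The geometry: the shape itself, serialized as WKT (longitude latitude order).
ex:southwestBox a geo:Geometry ;
    geo:asWKT "POLYGON((-125 31, -102 31, -102 42, -125 42, -125 31))"^^geo:wktLiteral .
```

Whether a report or chapter should itself be typed as a geo:Feature, or should instead point to a region that is one, is the open question in item 3a.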

Also, I will be at the OGC GeoSemantics Summit in Boulder this Wednesday. I will give the group a report by email afterwards.

zednis commented 9 years ago

Geosparql links

zednis commented 9 years ago

I will create a new ticket for the discussion around Model, ModelRun, ModelOutput, and Dataset.

xgmachina commented 9 years ago

Did we find a solution to this issue? If we still want to retain those two properties, I would suggest we remove the assertions of domain.

zednis commented 9 years ago

I am not sure we converged on a solution to the original issue in this ticket.

I concur with @xgmachina's suggestion to remove the assertion of domain for the spatial and temporal resolution properties.
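
A sketch of what the two definitions would look like with the domain assertion removed. The gcis: namespace IRI is an assumption, and the rdfs:comment lines are omitted here because their wording would need to be revised separately.

```turtle
@prefix gcis: <http://data.globalchange.gov/gcis.owl#> .  # assumed namespace IRI
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# No rdfs:domain: using the property no longer entails a type
# for the subject. The range entailment is unchanged.
gcis:hasSpatialResolution a owl:ObjectProperty ;
    rdfs:label "Has Spatial Resolution" ;
    rdfs:range gcis:SpatialResolution .

gcis:hasTemporalResolution a owl:ObjectProperty ;
    rdfs:label "Has Temporal Resolution" ;
    rdfs:range gcis:TemporalResolution .
```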

bduggan commented 9 years ago

:+1:

justgo129 commented 9 years ago

+1

justgo129 commented 9 years ago

Let's remove it and initiate the pull request.

USGCRP / gcis-ontology