inspire-eu-rdf / inspire-rdf-guidelines

INSPIRE data in RDF
http://inspire-eu-rdf.github.io/inspire-rdf-guidelines/

AddressComponents are features? #31

Open DieterDePaepe opened 7 years ago

DieterDePaepe commented 7 years ago

ad:AddressComponent and its subclasses (ad:AdminUnitName, ad:AddressAreaName, ...) are subclasses of gsp:Feature.

I find the definition of gsp:Feature itself to be lacking in that it simply refers to ISO 19107 - a paywalled specification. Either way, there are two major interpretations of what a Feature is: a geographical feature, or a broader term.

In case the first is meant, why is something that represents a name a geographical entity? That does not make sense.

In case the broader term is followed, which includes non-geographical features, a feature is simply a "record". One can then ask whether every RDF class is in fact a feature, which would render this addition useless.

DieterDePaepe commented 7 years ago

Finding the definition of SpatialObject seems to verify the first case:

"The class spatial-object represents everything that can have a spatial representation. It is superclass of feature and geometry."

cportele commented 7 years ago

The feature concept is introduced in ISO 19101 and detailed in ISO 19109 (Rules for Application Schemas). This is the basis for the use of "feature" in both OGC standards and the ISO 19100 series.

Historically, the term "feature" was sometimes used for an object with the spatial geometry property. In the ISO/OGC standards, which are also the basis for INSPIRE, it has a broader meaning:

"The classification of real-world phenomena as features depends on their significance to a particular universe of discourse."

I.e., declaring something as a feature is mainly a statement that something is important (for a certain use of spatial data). Things that are features will typically be the resources in a spatial data set that one will access / query (via a Web Feature Service, GeoSPARQL, as resources in a RESTful API, etc).

This is consistent with the definition in the INSPIRE Directive where it is sufficient to have an "indirect reference to a specific location or geographical area" to be considered spatial data. In this sense, an ad:AdminUnitName is a feature.

Note that the use of "spatial object" in the text of the INSPIRE Directive is inconsistent with the use in the ISO/OGC standards, including ISO 19107 and GeoSPARQL. "Spatial object" in INSPIRE is basically the same as "feature" in ISO 19109. In GeoSPARQL, for example, "spatial object" is a supertype of "feature" and "geometry".

PS: I know, it is a pain that the ISO standards are paywalled!

DieterDePaepe commented 7 years ago

I disagree. The historical meaning of the term has no authority here. By letting an ad:AdminUnitName be a subclass of a SpatialObject, you are saying that an ad:AdminUnitName "[...] can have a spatial representation".

As long as the address components refer to names rather than the actual things (a street, an administrative unit, ...), they cannot have a spatial representation in my opinion.
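The entailment behind this objection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the prefixed names are plain strings taken from this thread rather than resolved IRIs, the subclass axioms are the ones described in the comments above, and `superclasses` is a hypothetical helper, not part of any RDF library.

```python
# Illustrative rdfs:subClassOf axioms as (subclass, superclass) pairs.
# The prefixed names are plain strings copied from this thread, not real IRIs.
SUBCLASS_OF = {
    ("ad:AdminUnitName", "ad:AddressComponent"),
    ("ad:AddressComponent", "gsp:Feature"),
    ("gsp:Feature", "gsp:SpatialObject"),
    ("gsp:Geometry", "gsp:SpatialObject"),
}


def superclasses(cls, axioms=SUBCLASS_OF):
    """Return every class reachable from cls by following subClassOf transitively."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in axioms:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


entailed = superclasses("ad:AdminUnitName")
print(sorted(entailed))
# ['ad:AddressComponent', 'gsp:Feature', 'gsp:SpatialObject']
```

Because subclassing is transitive, every ad:AdminUnitName instance is also entailed to be a gsp:SpatialObject, so whatever the SpatialObject definition says applies to names as well.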

cportele commented 7 years ago

Yes, the historical meaning is not relevant, I only provided it as background. The main point was that AdminUnitName is a feature (in the terminology of ISO 19109) / spatial object (in the terminology of INSPIRE). This is also explicitly stated in the INSPIRE data specifications and the legal act.

But I see your point that questions the benefits/correctness of making classes that are unlikely to be linked with a property that has a geometry as its value a sub-class of the GeoSPARQL class gsp:Feature.

While GeoSPARQL references ISO 19109 with respect to the definition of "feature", it also makes the assumption that a gsp:Feature typically has a geometric property by defining the gsp:hasGeometry property. In this sense, a feature in GeoSPARQL is really restricted to geographic entities.

So, my conclusion would also be to remove that sub-class relationship from all classes that do not have a geometry property in the INSPIRE model. Thanks for raising this. It is a topic that I will bring up in the planned revision of GeoSPARQL, too.

@jechterhoff, do you see any reason against such a change?

DieterDePaepe commented 7 years ago

A different possibility would be dropping the "Name" part of each of the address components (e.g., Thoroughfare instead of ThoroughfareName). I'd have to check against the definitions, but at first sight I'd say a Thoroughfare is a Feature.

During my first experience with INSPIRE (address), I did find it very weird everything was using names instead of the actual objects, despite being defined as features. I simply assumed there was a good reason for it and learned to reason using names, but maybe there wasn't a good reason?

PS: please also suggest having better definitions in their terms, ones that do not just reference ISO. ;)

cportele commented 7 years ago

In the address schema, these are really meant to be just names. They all have a property that links them to a feature (or multiple features) in other schemas. AdminUnitName has a property adminUnit where the value is an AdministrativeUnit (in the administrative units schema), ThoroughfareName has a property transportLink where the value is any number of TransportLink (in the transport networks schema), etc.

My understanding is that this approach was used because an address dataset will typically only have the name, but not, for example, the geometry of the administrative unit (nor a link to the administrative unit in another dataset). Since the geometry must be provided for an AdministrativeUnit feature, the XxxName features were created.

However, since the current proposal is to not include multiplicity constraints in the INSPIRE RDF vocabularies (http://inspire-eu-rdf.github.io/inspire-rdf-guidelines/#ref_cr_prop_multiplicity), we could indeed drop the XxxName classes from the address namespace and directly use AdministrativeUnit, TransportLink and NamedPlace without any loss. In fact, the model would be "cleaner".
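The two modelling options can be sketched roughly as follows. This is a minimal Python illustration, not the INSPIRE schema: the field names `geometry_wkt` and `admin_unit`, the helper `to_administrative_unit`, and the sample values are all assumptions made for this example, and the additional AddressComponent properties (alternativeIdentifier, status, situatedWithin) are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdministrativeUnit:
    """The real-world feature. Geometry is optional here because the sketch
    assumes no multiplicity constraint is enforced (hypothetical field name)."""
    name: str
    geometry_wkt: Optional[str] = None


@dataclass
class AdminUnitName:
    """The name-only address component, optionally linked to the feature."""
    name: str
    admin_unit: Optional[AdministrativeUnit] = None


def to_administrative_unit(component: AdminUnitName) -> AdministrativeUnit:
    """Collapse a name component into the feature class (hypothetical helper).

    If the component already links to a feature, reuse it; otherwise create
    a feature that carries only the name and no geometry.
    """
    if component.admin_unit is not None:
        return component.admin_unit
    return AdministrativeUnit(name=component.name)


# Made-up sample values, for illustration only.
name_only = AdminUnitName(name="Gent")
linked = AdminUnitName(
    name="Gent",
    admin_unit=AdministrativeUnit(
        name="Gent",
        geometry_wkt="POLYGON((3.6 51.0, 3.8 51.0, 3.8 51.1, 3.6 51.0))",
    ),
)

print(to_administrative_unit(name_only).geometry_wkt)            # None
print(to_administrative_unit(linked).geometry_wkt is not None)   # True
```

With an optional geometry, an address dataset that only knows the name can still publish an AdministrativeUnit, which is the case the XxxName classes were introduced to cover.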

jechterhoff commented 7 years ago

@cportele As far as I can see (and I hope I'm not missing something), GeoSPARQL does not require that every gsp:Feature must have a geometry. The mere definition of gsp:hasGeometry in GeoSPARQL does not lead to such a requirement (a minimum cardinality restriction would). If there really is no such requirement in the current version of GeoSPARQL, then declaring that an ad:AdminUnitName is a gsp:Feature is correct.

A user that searches for all resources that are gsp:Features should not expect that each of them has a geometry, because 1) a geometry might not have been defined for the gsp:Feature yet, and 2) a gsp:Feature is not required to have a geometry. Still, such a search would return all <<featureType>>s defined in the INSPIRE conceptual model, but not data types, for example.
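A small Python sketch of such a search, under stated assumptions: the triples, the `ex:` names, and the set `FEATURE_CLASSES` (standing in for "classes declared as subclasses of gsp:Feature") are invented for illustration, and the helpers are plain set comprehensions rather than a GeoSPARQL implementation.

```python
RDF_TYPE = "rdf:type"
HAS_GEOMETRY = "gsp:hasGeometry"

# Assumption: these classes are declared subclasses of gsp:Feature.
# "ex:AdministrativeUnit" and "ex:SomeDataType" are hypothetical names.
FEATURE_CLASSES = {"ad:AdminUnitName", "ex:AdministrativeUnit"}

# Made-up sample data as (subject, predicate, object) triples.
TRIPLES = {
    ("ex:name1", RDF_TYPE, "ad:AdminUnitName"),
    ("ex:unit1", RDF_TYPE, "ex:AdministrativeUnit"),
    ("ex:unit1", HAS_GEOMETRY, "ex:geom1"),
    ("ex:geom1", RDF_TYPE, "gsp:Geometry"),
    ("ex:value1", RDF_TYPE, "ex:SomeDataType"),
}


def features(triples):
    """All resources typed with a class that is a subclass of gsp:Feature."""
    return {s for s, p, o in triples if p == RDF_TYPE and o in FEATURE_CLASSES}


def features_with_geometry(triples):
    """Only those features that actually carry a gsp:hasGeometry statement."""
    with_geometry = {s for s, p, _ in triples if p == HAS_GEOMETRY}
    return features(triples) & with_geometry


print(sorted(features(TRIPLES)))                # ['ex:name1', 'ex:unit1']
print(sorted(features_with_geometry(TRIPLES)))  # ['ex:unit1']
```

The first query returns the name resource even though it has no geometry, while the data type instance and the geometry itself are excluded. A client that needs geometries has to ask for them explicitly, as the second query does.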

I am not sure if the information that an XxxName is a gsp:Feature will be needed. Time will tell. Until a real use case occurs, we could remove the assertion that the XxxName classes (or, as you said, classes without any geometry property) are subclasses of gsp:Feature. Then again, there is no real harm in making such an assertion (since it would be correct).

DieterDePaepe commented 7 years ago

There is indeed no requirement that a gsp:Feature needs to have a Geometry, but there is the requirement that it should be able to have a spatial representation, which a "name" cannot.

If you want to be able to query for every <<featureType>>, then you indeed need a specific class to represent it, but gsp:Feature is not suited for that.

jechterhoff commented 7 years ago

Well, I (have to) disagree on these two statements. A gsp:Feature may, but does not need to have a geometry (or spatial extent, in whatever context). According to its definition, gsp:Feature precisely covers all instances of classes with stereotype <<featureType>> from the INSPIRE schemas.

But, as I said before, we can remove the assertion that the XxxName classes (or, as you said, classes without any geometry property) are subclasses of gsp:Feature. We can make this change in the next revision of the guidelines (and, accordingly, the ontologies) unless we receive more feedback that directs us otherwise.

Regarding the proposal from Clemens to replace the XxxName classes with their actual counterparts (like AdministrativeUnit): That is doable, if the additional properties introduced by class AddressComponent (alternativeIdentifier, status, situatedWithin) are not needed by RDF applications or are covered by the counterparts of its subtypes (like AdministrativeUnit, etc). The RDF pilots may help answer this question, as may a review of actual INSPIRE address data.

DieterDePaepe commented 7 years ago

"Well, I (have to) disagree on these two statements. A gsp:Feature may, but does not need to have a geometry (or spatial extent, in whatever context). According to its definition, gsp:Feature precisely covers all instances of classes with stereotype <<featureType>> from the INSPIRE schemas."

Hmm, let me defend my case. The gsp:Feature is defined (through SpatialObject) as "[...] represents everything that can have a spatial representation". I interpret this as: "It is possible that the feature does not have a spatial representation (Geometry) linked to it, but it should be theoretically possible to assign it one. Even if this will never actually happen, it can still be a feature, as long as the possibility is there." I feel that a name can never be (correctly) assigned a spatial representation of that name, hence my objection to the address components being gsp:Features.

What is the basis for your statement that gsp:Feature fully represents the stereotype? I am not an INSPIRE expert, so I might just be missing context.

"Regarding the proposal from Clemens to replace the XxxName classes with their actual counterparts (like AdministrativeUnit): That is doable, if the additional properties introduced by class AddressComponent (alternativeIdentifier, status, situatedWithin) are not needed by RDF applications or are covered by the counterparts of its subtypes (like AdministrativeUnit, etc). The RDF pilots may help answer this question, also the review of actual INSPIRE address data."

Basing this decision on whether RDF applications need the data now is dangerous. The expressiveness should be there even in the absence of an immediate use case.

alternativeIdentifier: "External, thematic identifier of the address component spatial object [...]" This seems to refer to the real-world object. It should not be defined on AddressComponent but on the real-world object.

status: "Validity of the address component within the life-cycle (version) of the address component spatial object." This seems to indicate a status on the name itself? The domain could be changed to GeographicalName (or a subclass) to preserve the functionality.

situatedWithin: "Another address component within which the geographic feature represented by this address component is situated." This is defined on the address components, but defers the meaning to the associated real-world object. It could be changed to use real-world objects without losing expressiveness.

jechterhoff commented 7 years ago

Here is the reasoning behind my statement that gsp:Feature covers all instances of classes with stereotype <<featureType>> (where that stereotype represents the meta class GF_FeatureType from ISO 19109 - which also provides the basis for INSPIRE feature types, see the INSPIRE Generic Conceptual Model clause 9.2 [1]):

Clause 6.2.2 "Class: geo:Feature" in the GeoSPARQL standard [2] states:

"The class geo:Feature is equivalent to the class GFI_Feature"

Clause C.2.1 "GFI_Feature" in the Observations & Measurements standard [3] states:

"The class GFI_Feature is an instance of the «metaclass» GF_FeatureType (ISO 19109). It represents the set of all feature instances."

[1] INSPIRE Generic Conceptual Model
[2] OGC GeoSPARQL - A Geographic Query Language for RDF Data
[3] OGC Abstract Specification - Geographic information - Observations and measurements

cportele commented 7 years ago

I started a discussion about the gsp:Feature topic in the W3C/OGC Spatial Data on the Web working group and the conclusion so far supports the argument of @jechterhoff that it is correct to classify every ISO 19109 feature (including every INSPIRE feature) as a gsp:Feature.

https://lists.w3.org/Archives/Public/public-sdw-wg/2017Apr/0210.html

I.e., if we keep the name classes, it would be correct with GeoSPARQL to state that these are a gsp:Feature.

In that sense, the comment in gsp:SpatialObject that it "represents everything that can have a spatial representation" without clarifying this more explicitly is probably misleading and would best be amended in a revision.

jechterhoff commented 7 years ago

A future revision of the draft vocabularies should review this issue in detail. The guidelines document references the issue in the section "Spatial object type - Alignment".