Open patrickbr opened 6 months ago
Can/shall we replace the current geo:hasGeometry with geo:hasDefaultGeometry, as we only provide a single geometry? If both are needed, we would have to provide the same information with two predicates:
osmObject geo:hasGeometry ourGeoObject .
osmObject geo:hasDefaultGeometry ourGeoObject .
Regarding the properties of geo:SpatialObject objects: the spec allows these to be implemented, but we are not required to add them all, and some could never be associated with any meaningful value, e.g. the area of a way without width, or the volume of a point.
I am not so sure: according to RFC 2119, SHALL means an absolute requirement. So AFAIK we must provide both.
@lehmann-4178656ch and I discussed this further.
I think a sane approach would be to omit the properties we cannot fill with any meaningful value to keep the dataset size manageable. For example, it seems extremely redundant to add geo:hasArea properties with a value of 0 to each node and way in the dataset.
In this spirit, I would also not add the geo:hasDefaultGeometry triple. It's just overly redundant.
Looking at the GeoSPARQL specification...
geo:hasGeometry, geo:hasDefaultGeometry, geo:hasCentroid and geo:hasBoundingBox have geo:Feature as domain and geo:Geometry as range
geo:Feature rdfs:subClassOf geo:SpatialObject
geo:Geometry rdfs:subClassOf geo:SpatialObject
geo:hasLength, geo:hasArea and all other properties "for associating Spatial Objects with scalar spatial measurements" have geo:SpatialObject as domain, NOT geo:Geometry
geo:sfContains, geo:sfIntersects and all other properties in the "Simple Features relation family" have geo:SpatialObject as domain and range, NOT geo:Geometry
This is visualized in this diagram at the beginning of section 6 and this other diagram from this paper
So:
a geo:Feature must have the following properties: geo:hasGeometry, geo:hasDefaultGeometry, geo:hasCentroid and geo:hasBoundingBox
osm:Nodes, osm:Ways, osm:Relations and osm:Areas should be of type geo:Feature and offer these properties.
all of the properties geo:hasGeometry, geo:hasDefaultGeometry, geo:hasCentroid and geo:hasBoundingBox must then point to an object of type geo:SpatialObject. These must implement geo:hasSize, geo:hasMetricSize, geo:hasLength, geo:hasMetricLength, geo:hasPerimeterLength, geo:hasMetricPerimeterLength, geo:hasArea, geo:hasMetricArea, geo:hasVolume and geo:hasMetricVolume.
sfIntersects and sfContains should be properties between geo:SpatialObjects
I agree with all of the above
we cannot write queries like
SELECT ?osm_id ?hasgeometry WHERE { osmrel:1960198 ogc:sfContains ?osm_id . ?osm_id geo:hasGeometry/geo:asWKT ?hasgeometry }
anymore. They would then look like this:
SELECT ?osm_id ?hasgeometry WHERE { osmrel:1960198 geo:hasGeometry ?geoma . ?osm_id geo:hasGeometry ?geomb . ?geoma ogc:sfContains ?geomb . ?geomb geo:asWKT ?hasgeometry }
I believe this is not the case: given that geo:Feature is rdfs:subClassOf geo:SpatialObject, and that geo:sfContains and the other Simple Features relations have geo:SpatialObject as domain and range, these relations can also link geo:Features to other geo:Features, so the old syntax is still correct.
If I understand correctly this also means that if these triples hold...
:x geo:hasGeometry :xGeom.
:y geo:hasGeometry :yGeom.
:x geo:sfContains :y.
...then also these hold...
:x geo:sfContains :yGeom.
:xGeom geo:sfContains :y.
:xGeom geo:sfContains :yGeom.
This would require doing some inference (combining geo:hasGeometry with the base relation), either materialized in the triples or done dynamically at query time.
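The inference described here can be sketched in a few lines of Python. This is only an illustration under the reading given in this comment (a relation between two features also holds between their geometries); the function name expand_relation and the tuple-based triple representation are hypothetical and not part of osm2rdf, QLever, or any GeoSPARQL library.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

HAS_GEOMETRY = "geo:hasGeometry"


def expand_relation(triples: Iterable[Triple], relation: str) -> Set[Triple]:
    """Lift `relation` between features to their geometries as well.

    Hypothetical helper: it assumes the reading from the discussion that
    a relation between two features also holds for their geometries.
    """
    triples = set(triples)
    # Map each feature to the geometries attached via geo:hasGeometry.
    geometries = {}
    for s, p, o in triples:
        if p == HAS_GEOMETRY:
            geometries.setdefault(s, set()).add(o)

    inferred = set()
    for s, p, o in triples:
        if p != relation:
            continue
        # Each side stands for itself and for each of its geometries.
        subjects = {s} | geometries.get(s, set())
        objects = {o} | geometries.get(o, set())
        for a in subjects:
            for b in objects:
                inferred.add((a, relation, b))
    return inferred


base = [
    (":x", HAS_GEOMETRY, ":xGeom"),
    (":y", HAS_GEOMETRY, ":yGeom"),
    (":x", "geo:sfContains", ":y"),
]
result = expand_relation(base, "geo:sfContains")
for triple in sorted(result):
    print(*triple)
```

Materializing the result multiplies each base triple by the number of geometry combinations, which is the size concern raised in this thread; computing it at query time trades that storage for extra joins.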
I think a sane approach would be to omit the properties we cannot fill with any meaningful value to keep the dataset size manageable. For example, it seems extremely redundant to add geo:hasArea properties with a value of 0 to each node and way in the dataset.
In this spirit, I would also not add the geo:hasDefaultGeometry triple. It's just overly redundant.
Given what you pointed out about SHALL and that 6.3 reads "Implementations shall allow the properties ... to be used in SPARQL graph patterns" this probably would break the formal full conformity with GeoSPARQL, but still, in my opinion it is an acceptable tradeoff.
Thank you all for this discussion. One way to realize redundant predicates is to just let the SPARQL engine know about 100% equivalent predicates, have the triples in the index for exactly one and then map each equivalent predicate to this one at query time.
The situation is not new, just the scale. For example, each of the 90 M Wikidata items has exactly one rdfs:label triple and a 100% equivalent (and therefore redundant) schema:name triple. We didn't care about this so far, since it's just 90 M additional triples compared to 19 B triples overall. But if these redundant triples blow up the total size of the dataset considerably, we should care.
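As a rough illustration of the idea of storing one predicate and mapping its equivalents at query time, here is a toy Python sketch. The class AliasingIndex and its methods are hypothetical and do not reflect how QLever or any other engine is implemented; the alias from geo:hasDefaultGeometry to geo:hasGeometry is an assumption that only makes sense while each feature has exactly one geometry.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


class AliasingIndex:
    """Toy triple index that stores each fact under one canonical predicate."""

    def __init__(self, aliases: Dict[str, str]) -> None:
        # aliases maps an equivalent predicate to its canonical predicate.
        self._aliases = dict(aliases)
        self._triples: Set[Triple] = set()

    def _canonical(self, predicate: str) -> str:
        return self._aliases.get(predicate, predicate)

    def add(self, s: str, p: str, o: str) -> None:
        # Redundant triples collapse onto the canonical predicate.
        self._triples.add((s, self._canonical(p), o))

    def match(self, predicate: str) -> List[Tuple[str, str]]:
        # Rewrite the queried predicate before looking it up.
        wanted = self._canonical(predicate)
        return sorted((s, o) for s, p, o in self._triples if p == wanted)

    def __len__(self) -> int:
        return len(self._triples)


# Assumption: one geometry per feature, so both predicates are equivalent.
index = AliasingIndex({"geo:hasDefaultGeometry": "geo:hasGeometry"})
index.add("osmnode:1", "geo:hasGeometry", "geom:1")
index.add("osmnode:1", "geo:hasDefaultGeometry", "geom:1")

print(len(index))
print(index.match("geo:hasDefaultGeometry"))
```

Both input triples end up as a single stored fact, yet a query using either predicate returns the same bindings.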
Similarly, for predicate paths <x>/<y>, where you never need the intermediate node (typically, a blank node), the index builder could just discard the blank node, internally create a simple predicate <x/y>, and then map the path <x>/<y> to <x/y> at query time. If a query asks for the blank node in between, we could either create it on the fly or issue an error message.
Hi @patrickbr @lehmann-4178656ch @Danysan1 @hannahbast @joka921, The desire to make your OSM representation GeoSPARQL compliant is highly appreciated!
GeoSPARQL is a voluminous and complex spec. I copy here two main GeoSPARQL experts @nicholascar @situx to correct what I write below in case I made mistakes.
Currently you have
osmnode:679109323
geo:hasGeometry osm2rdfgeom:osm_node_679109323 ;
osm2rdfgeom:convex_hull "..."^^geo:wktLiteral ;
osm2rdfgeom:envelope "..."^^geo:wktLiteral ;
osm2rdfgeom:obb "..."^^geo:wktLiteral .
osm2rdfgeom:osm_node_679109323 geo:asWKT "..."^^geo:wktLiteral .
But all these are alternative geometries, so I suggest changing it to:
osmnode:679109323
geo:hasGeometry
osmnode:679109323/geom, osmnode:679109323/convexHull, osmnode:679109323/boundingBox, osmnode:679109323/orientedBoundingBox;
geo:hasDefaultGeometry osmnode:679109323/geom;
geo:hasBoundingBox osmnode:679109323/boundingBox;
.
osmnode:679109323/geom a geo:Geometry; osm2rdf:role "geometry"; geo:asWKT "..."^^geo:wktLiteral.
osmnode:679109323/convexHull a geo:Geometry; osm2rdf:role "convexHull"; geo:asWKT "..."^^geo:wktLiteral.
osmnode:679109323/boundingBox a geo:Geometry; osm2rdf:role "boundingBox"; geo:asWKT "..."^^geo:wktLiteral.
osmnode:679109323/orientedBoundingBox a geo:Geometry; osm2rdf:role "orientedBoundingBox"; geo:asWKT "..."^^geo:wktLiteral.
Notes:
hasGeometry for all, hasDefaultGeometry for the main (detailed) geometry, hasBoundingBox for the envelope (I assume by "envelope" you mean the bounding box, right?)
geo:hasCentroid if you can compute it, but it's optional.
osm2rdf:role to allow the user to distinguish between them.
osm2rdfmember to just osm2rdf, so the same predicate can be used here and in "members"
Currently you have eg
osmnode:679109323 rdf:type osm:node
Please also add geo:Feature as type.
It's ok to keep the topological relations at the level of Features, eg:
osmrel:3766584 ogc:sfContains osmway:264339544
As you can see in C.2.3.1 "All features or geometries overlapping with another feature", the relations apply at both levels of Feature and Geometry, and by keeping them at the level of Feature, you implement only the first (most efficient) branch of the UNION.
You have materialized topological relations using an unofficial namespace like this:
@prefix ogc: <http://www.opengis.net/rdf#> .
osmrel:3766584 ogc:sfContains osmway:264339544
Please consider using geo:sfContains (the official namespace). This has pros and cons:
geo:sfContains. Eg in GraphDB, that predicate is not consulted in the database, but is passed to the geospatial index to process.
I think you should use the standard predicate geo:sfContains, but put those triples into separate dump files.
That way sem web developers can choose whether to load them to their repo, or let the repo compute the topological relations automatically.
BTW, have you implemented transitivity of sfContains?
(This section applies to all topological relations that you support, not just sfContains)
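For reference, materializing transitivity amounts to computing a transitive closure over the containment pairs. The Python sketch below is one minimal way to do that; the function name transitive_closure and the example identifiers are hypothetical, and it should only be applied to relations that really are transitive (sfContains and sfWithin are, sfIntersects is not).

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]


def transitive_closure(pairs: Iterable[Pair]) -> Set[Pair]:
    """Return all (a, c) such that a reaches c through one or more pairs."""
    successors: Dict[str, Set[str]] = defaultdict(set)
    for a, b in pairs:
        successors[a].add(b)

    closure: Set[Pair] = set()
    for start in list(successors):
        # Breadth-first search over everything reachable from `start`.
        seen: Set[str] = set()
        queue = deque(successors[start])
        while queue:
            node = queue.popleft()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            queue.extend(successors.get(node, ()))
    return closure


# Hypothetical identifiers, used only for illustration.
contains = [("osmrel:A", "osmway:B"), ("osmway:B", "osmnode:C")]
closed = transitive_closure(contains)
for a, c in sorted(closed):
    print(a, "geo:sfContains", c)
```

On real OSM data the closure can be far larger than the directly computed relations, so whether to materialize it or derive it at query time is the same size tradeoff discussed earlier in this thread.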
It's a good idea to provide measures if you can.
geo:hasMetricLength, geo:hasMetricPerimeterLength, geo:hasMetricArea
hasSize) or don't apply (hasVolume)
Measures should be attached to Features, not Geometries. Eg the area of a boundingBox is typically bigger than the area of the detailed geometry, and only the latter is interesting.
No additions from my side. I think @VladimirAlexiev explained it very well. I would also be happy to see the dataset published using the GeoSPARQL vocabulary. If you find anything you would like to express but cannot express in GeoSPARQL, we are always happy to receive a pull request or an issue in the ogc-geosparql repository.
We should be fully conformant with the GeoSPARQL standard for the types geo:SpatialObject and geo:Feature.
In particular, a geo:Feature must have the following properties: geo:hasGeometry, geo:hasDefaultGeometry, geo:hasCentroid and geo:hasBoundingBox
That is, osm:Nodes, osm:Ways, osm:Relations and osm:Areas should be of type geo:Feature and offer these properties.
As far as I understand it, all of the properties geo:hasGeometry, geo:hasDefaultGeometry, geo:hasCentroid and geo:hasBoundingBox must then point to an object of type geo:SpatialObject. These must implement geo:hasSize, geo:hasMetricSize, geo:hasLength, geo:hasMetricLength, geo:hasPerimeterLength, geo:hasMetricPerimeterLength, geo:hasArea, geo:hasMetricArea, geo:hasVolume and geo:hasMetricVolume.
So far, I don't see any problem with implementing this.
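However, AFAIK (@lehmann-4178656ch, @Danysan1, please correct me), sfIntersects and sfContains should be properties between geo:SpatialObjects. This would mean that we cannot write queries like
SELECT ?osm_id ?hasgeometry WHERE { osmrel:1960198 ogc:sfContains ?osm_id . ?osm_id geo:hasGeometry/geo:asWKT ?hasgeometry }
anymore. They would then look like this:
SELECT ?osm_id ?hasgeometry WHERE { osmrel:1960198 geo:hasGeometry ?geoma . ?osm_id geo:hasGeometry ?geomb . ?geoma ogc:sfContains ?geomb . ?geomb geo:asWKT ?hasgeometry }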
@hannahbast, @joka921, is that a problem?
See also https://github.com/ad-freiburg/qlever/issues/678#issuecomment-1867066812