Closed iverase closed 1 year ago
Pinging @elastic/es-analytics-geo
I think I overshot here because of my expectations (that orientation defines the inside of the polygon), which is not what is documented.
After looking at the implementation and reading the documentation carefully, the behaviour seems consistent. If the orientation of the polygon differs from the provided orientation, the resulting polygon is the one whose edges are shorter than half of the hemisphere.
This is the current behavior as I understand it. There are 5 factors that affect how we treat the polygon:
1) If we applied both clockwise and counterclockwise orientation to the given polygon, would the smaller polygon (the one where the length of the edges is lower than half of the hemisphere) cross the dateline? If the answer is no, we ignore all other factors and take the smaller polygon. Otherwise, the behavior depends on the remaining factors.
2) Is this polygon represented as WKT or JSON? If the polygon is represented as WKT, we only have one thing to check:
3) What is the ordering of coordinates in the shape (clockwise for outer shell by default)? We interpret the polygon using standard counterclockwise; the orientation value in the mapping is ignored. However, if the polygon is represented as JSON, we need to check two more factors:
4) What is the value of the orientation parameter in GeoJSON? If it is set to left or cw, we treat the coordinates as a clockwise polygon. Otherwise, we check the mapping:
5) What is the value of the orientation parameter in the mapping? If it is set to left or cw, we treat the coordinates as a clockwise polygon, otherwise as a counterclockwise polygon.
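To make the order of these checks easier to follow, here is a minimal Python sketch of the decision flow as described in the list above. It is not the Elasticsearch implementation. The function name, its parameters, and the returned labels are hypothetical, and it assumes that an explicit right/ccw value in the GeoJSON document also takes precedence over the mapping, which the wording above leaves open.

```python
from typing import Optional

# Orientation values treated as clockwise (the list above names "left" and "cw").
CLOCKWISE_VALUES = {"left", "cw", "clockwise"}


def resolve_interpretation(
    smaller_crosses_dateline: bool,
    fmt: str,
    doc_orientation: Optional[str] = None,
    mapping_orientation: Optional[str] = None,
) -> str:
    """Return "smaller", "cw" or "ccw" following the five factors above.

    Hypothetical helper for illustration only.
    """
    # Factor 1: if the smaller polygon does not cross the dateline, take it.
    if not smaller_crosses_dateline:
        return "smaller"
    # Factors 2 and 3: WKT is read as counterclockwise, mapping is ignored.
    if fmt == "wkt":
        return "ccw"
    # Factor 4: orientation given in the GeoJSON document itself.
    # Assumption: an explicit non-clockwise value also overrides the mapping.
    if doc_orientation is not None:
        return "cw" if doc_orientation.lower() in CLOCKWISE_VALUES else "ccw"
    # Factor 5: fall back to the orientation configured in the mapping.
    if mapping_orientation is not None and mapping_orientation.lower() in CLOCKWISE_VALUES:
        return "cw"
    return "ccw"
```

For example, a GeoJSON polygon on the dateline with no orientation of its own, indexed into a field mapped with orientation cw, would be read as clockwise under this sketch, while the same coordinates sent as WKT would be read as counterclockwise.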
Here is a script that can be used to demonstrate this logic:
DELETE test

PUT test
{
  "mappings": {
    "properties": {
      "shape": {
        "type": "geo_shape"
      },
      "anti_shape": {
        "type": "geo_shape",
        "orientation": "cw"
      }
    }
  }
}

PUT test/_doc/1
{
  "name": "WKT, dateline",
  "shape": "POLYGON ((160 0, 160 10, -160 10, -160 0, 160 0))",
  "anti_shape": "POLYGON ((160 0, 160 10, -160 10, -160 0, 160 0))"
}

PUT test/_doc/2
{
  "name": "WKT, dateline, reversed coordinates",
  "shape": "POLYGON ((160 0, -160 0, -160 10, 160 10, 160 0))",
  "anti_shape": "POLYGON ((160 0, -160 0, -160 10, 160 10, 160 0))"
}

PUT test/_doc/3
{
  "name": "Geo json, dateline, default orientation",
  "shape": {
    "type": "polygon",
    "coordinates": [[[160, 0], [160, 10], [-160, 10], [-160, 0], [160, 0]]]
  },
  "anti_shape": {
    "type": "polygon",
    "coordinates": [[[160, 0], [160, 10], [-160, 10], [-160, 0], [160, 0]]]
  }
}

PUT test/_doc/4
{
  "name": "Geo json, dateline, reversed orientation",
  "shape": {
    "type": "polygon",
    "coordinates": [[[160, 0], [160, 10], [-160, 10], [-160, 0], [160, 0]]],
    "orientation": "cw"
  },
  "anti_shape": {
    "type": "polygon",
    "coordinates": [[[160, 0], [160, 10], [-160, 10], [-160, 0], [160, 0]]],
    "orientation": "ccw"
  }
}

PUT test/_doc/5
{
  "name": "Geo json, dateline, default orientation, reversed coordinates",
  "shape": {
    "type": "polygon",
    "coordinates": [[[160, 0], [-160, 0], [-160, 10], [160, 10], [160, 0]]]
  },
  "anti_shape": {
    "type": "polygon",
    "coordinates": [[[160, 0], [-160, 0], [-160, 10], [160, 10], [160, 0]]]
  }
}

PUT test/_doc/6
{
  "name": "WKT, not on dateline",
  "shape": "POLYGON ((20 0, 20 10, -20 10, -20 0, 20 0))",
  "anti_shape": "POLYGON ((20 0, 20 10, -20 10, -20 0, 20 0))"
}

PUT test/_doc/7
{
  "name": "Geo json, not on dateline, default orientation",
  "shape": {
    "type": "polygon",
    "coordinates": [[[20, 0], [20, 10], [-20, 10], [-20, 0], [20, 0]]]
  },
  "anti_shape": {
    "type": "polygon",
    "coordinates": [[[20, 0], [20, 10], [-20, 10], [-20, 0], [20, 0]]]
  }
}

PUT test/_doc/8
{
  "name": "Geo json, not on dateline, reversed orientation",
  "shape": {
    "type": "polygon",
    "coordinates": [[[20, 0], [20, 10], [-20, 10], [-20, 0], [20, 0]]],
    "orientation": "cw"
  },
  "anti_shape": {
    "type": "polygon",
    "coordinates": [[[20, 0], [20, 10], [-20, 10], [-20, 0], [20, 0]]],
    "orientation": "ccw"
  }
}

PUT test/_doc/9
{
  "name": "Geo json, not on dateline, reversed coordinates, default orientation",
  "shape": {
    "type": "polygon",
    "coordinates": [[[20, 0], [-20, 0], [-20, 10], [20, 10], [20, 0]]]
  },
  "anti_shape": {
    "type": "polygon",
    "coordinates": [[[20, 0], [-20, 0], [-20, 10], [20, 10], [20, 0]]]
  }
}

GET test/_search?_source_excludes=*shape
{
  "query": {
    "geo_shape": {
      "shape": {
        "shape": "POINT (0 5)",
        "relation": "intersects"
      }
    }
  }
}

GET test/_search?_source_excludes=*shape
{
  "query": {
    "geo_shape": {
      "anti_shape": {
        "shape": "POINT (0 5)",
        "relation": "intersects"
      }
    }
  }
}

GET test/_search?_source_excludes=*shape
{
  "query": {
    "geo_shape": {
      "shape": {
        "shape": "POINT (179 5)",
        "relation": "intersects"
      }
    }
  }
}

GET test/_search?_source_excludes=*shape
{
  "query": {
    "geo_shape": {
      "anti_shape": {
        "shape": "POINT (179 5)",
        "relation": "intersects"
      }
    }
  }
}

GET test/_search?_source_excludes=*shape
{
  "query": {
    "bool": {
      "should": [
        {
          "geo_shape": {
            "shape": {
              "shape": "POINT (179 5)",
              "relation": "intersects"
            }
          }
        },
        {
          "geo_shape": {
            "anti_shape": {
              "shape": "POINT (179 5)",
              "relation": "intersects"
            }
          }
        }
      ]
    }
  }
}
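When reading the results of the script above, it helps to know which way each coordinate ring is wound as written. The following is a small Python sketch using the planar shoelace formula on the raw (lon, lat) pairs. The helper names are made up for this example, and the calculation deliberately ignores the dateline and the curvature of the Earth, so it only reports the winding of the numbers as typed, not how Elasticsearch ends up interpreting them.

```python
def signed_area(ring):
    """Planar shoelace area of a closed ring of (lon, lat) pairs.

    A positive result means the ring is counterclockwise as written.
    """
    return 0.5 * sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )


def winding(ring):
    """Return "ccw" or "cw" for the ring as written (planar, no wrapping)."""
    return "ccw" if signed_area(ring) > 0 else "cw"


# Rings taken from documents 1 and 2 of the script above.
doc1 = [(160, 0), (160, 10), (-160, 10), (-160, 0), (160, 0)]
doc2 = [(160, 0), (-160, 0), (-160, 10), (160, 10), (160, 0)]

print(winding(doc1))  # ccw
print(winding(doc2))  # cw
```

So the documents labelled "reversed coordinates" are the ones whose rings are clockwise in plain lon/lat terms.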
Hey @imotov just wanted to chime in with a use case where the logic you described is causing an issue:
I'm working on a search engine to find NOAA environmental data. Some of our most valuable data products are built using data from the GOES-R series satellites, which are a pair of geostationary satellites centered roughly on the east and west coasts of the US. The extent of these data products reaches from the Japanese coast to central Europe; it's a bounding box which legitimately covers more than 220° of longitude and crosses both meridians. It looks like this in counterclockwise GeoJSON:
{
  "type": "Polygon",
  "orientation": "ccw",
  "coordinates": [[
    [141.7005, -81.3282],
    [6.2995, -81.3282],
    [6.2995, 81.3282],
    [141.7005, 81.3282],
    [141.7005, -81.3282]
  ]]
}
The logic you described in #1 of your post results in the selection of the smaller, clockwise polygon, which does not cross the dateline, i.e. almost everywhere on Earth that is not visible to these satellites.
After running some experiments, it looks like we can work around the issue by translating one of the edges by 360, e.g. so the bbox runs from 141° to 366°. Still, I find it frustrating that you're choosing to index the opposite geometry of what we intend, even if we indicate its orientation explicitly.
Do you think there's any chance this logic will change in the future? Or could some kind of strict orientation mode perhaps be enabled via the field mapping?
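For anyone wanting to try the same workaround, this is roughly what the translation looks like as a Python sketch. The helper name and its west_lon parameter are made up for this example, and it has only been reasoned about against the bounding box above. Whether a given Elasticsearch version accepts longitudes beyond 180 is not something the sketch checks.

```python
def unwrap_ring(ring, west_lon):
    """Shift every longitude that lies west of west_lon by +360 degrees.

    Hypothetical helper: it turns a ring that is meant to wrap eastward
    past the dateline into one with monotonically increasing longitudes.
    """
    return [[lon + 360.0 if lon < west_lon else lon, lat] for lon, lat in ring]


# The GOES-R extent from above, as [lon, lat] pairs.
ring = [
    [141.7005, -81.3282],
    [6.2995, -81.3282],
    [6.2995, 81.3282],
    [141.7005, 81.3282],
    [141.7005, -81.3282],
]

# The eastern edge moves from about 6.3 to about 366.3 degrees.
shifted = unwrap_ring(ring, west_lon=141.7005)
```

The western edge stays at 141.7005 and only the two points on the eastern edge are moved, which matches the "141° to 366°" box mentioned above.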
Hi @mcquinne,
I am wondering why you are describing your shape as a polygon instead of as a bounding box. Is there any particular reason?
Regarding the current topology model, I agree it can be misleading. We are discussing internally how to transition to a clearer model, but of course we need to guarantee backwards compatibility. Still, your polygon is incorrect in the sense that, under a proper ellipsoidal model, the edge between two points should be defined as the shortest path between them, so you cannot have edges longer than 180 degrees.
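To put numbers on the 180 degree point, here is a short Python sketch comparing the eastward longitude span intended for that bounding box with the span a shortest-path reading would give. The function names are made up for this example, and it looks at longitude differences only, not at true geodesic lengths.

```python
def eastward_span(west_lon, east_lon):
    """Degrees of longitude covered when travelling east from west_lon."""
    return (east_lon - west_lon) % 360.0


def shortest_span(lon_a, lon_b):
    """Smaller of the two longitude spans between lon_a and lon_b."""
    span = eastward_span(lon_a, lon_b)
    return min(span, 360.0 - span)


# Intended edge of the box: east from 141.7005 across the dateline to 6.2995.
intended = eastward_span(141.7005, 6.2995)   # about 224.6 degrees
# What a shortest-path reading of the same two longitudes gives.
shortest = shortest_span(141.7005, 6.2995)   # about 135.4 degrees
```

The intended edge spans roughly 224.6 degrees, so a model that always takes the shortest path picks the complementary 135.4 degree edge instead.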
It seems the orientation setting for geo_shape has no effect. Here is an example:
My expectation is that those polygons are different and mutually exclusive, but they seem to represent the same polygon regardless of the order of the vertices.