pelias-deprecated / fences-cli

Fences CLI
missing relations #12

Open andrisi opened 9 years ago

andrisi commented 9 years ago

Perhaps there is an issue with the prep tool. Austria, Slovenia, and Brittany in France (and a lot of other relations) are missing from the Mapzen planet file, which was presumably created with this tool, and are only mentioned in the errors.json file. These relations display OK on http://polygons.openstreetmap.fr, while many others listed in the errors.json file from the Mapzen files do not.

dianashk commented 9 years ago

@andrisi, I think you're absolutely right. I'm going to investigate further later this week and will post my findings. Thanks for your feedback.

andrisi commented 9 years ago

Thanks @dianashk! It would be wonderful if you could do something about this and issue #11 before the monthly rebuild of the files. Your work helps me a lot. I'll see if I find any issues; at the moment I am trying to recreate the hierarchy of admin levels/boundaries.

dianashk commented 9 years ago

@andrisi, others have also asked for the hierarchy to be included in the data, so I'm considering the best way to approach that. Once IDs are part of the data, we can include all parents from lower admin levels in each item. If you implement something sooner, we'd love a pull request! :smiley:

andrisi commented 9 years ago

@dianashk you jumpstarted me on editing OSM. I started adding/correcting things that I need from your files but that are not your fault. :-) Regarding the hierarchy, it's not as simple as "anything-within-me-on-one-admin-level-lower" (ST_Contains), because sometimes subareas peek outside their parent's area. I added some common-sense heuristics, and plan to release the result as my first ever public GitHub project.

andrisi commented 9 years ago

I got some of the missing (erroneous) relations, like Brittany, France (102740), from http://polygons.openstreetmap.fr, but they seem to cause errors in PostGIS, so they are indeed wrong in some way. A self-join with ST_Contains causes "SQL Error: ERROR: GEOSContains: TopologyException: side location conflict at ...".

ST_IsValidReason() actually tells what the problem is: self-intersection. And ST_MakeValid() fixes it, although I don't know at what cost.
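
To illustrate the kind of defect reported here, below is a minimal sketch in plain Python (not PostGIS, and not this tool's code) that detects a ring whose non-adjacent edges cross, as in a "bowtie" shape. The function names are hypothetical, and the check only catches proper crossings; PostGIS validity checking covers more cases than this.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): -1, 0, or 1."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def segments_cross(p1, p2, p3, p4):
    """True if segment p1-p2 and segment p3-p4 cross at a single interior point."""
    return (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
            and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0)


def ring_self_intersects(ring):
    """ring: closed list of (x, y) points, with the first point repeated last."""
    edges = list(zip(ring, ring[1:]))
    n = len(edges)
    for i in range(n):
        # Start at i + 2 so edges sharing a vertex are never compared.
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edge are adjacent via the closing point
            if segments_cross(*edges[i], *edges[j]):
                return True
    return False
```

A ring such as (0,0), (2,2), (2,0), (0,2), (0,0) crosses itself, while the square (0,0), (2,0), (2,2), (0,2), (0,0) does not.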

dianashk commented 9 years ago

That's really interesting.

I realize it's not straightforward to determine parent relationships. That said, we could run the polygon inspections only when we determine that the bounding boxes overlap or nest. I imagine the resulting structure looking something like this:

{
  admin_rel: {
    admin_level_3: ['foo', 'bar', 'baz'],
    admin_level_2: ['United States of America']
  }
}

We will probably use names at first and then, once we have IDs, switch over to using just those, so users can easily look up parent geometries and tags. Would this be what you're looking to create?

I'm glad to hear you've taken to fixing the errors in OSM directly! That's one of the main goals of this project. Also, if you release a node module that does this lookup, I'd love to just use it within fences to avoid reinventing the wheel.
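
The bounding-box pre-check mentioned above could look roughly like this. It is a sketch in plain Python, assuming each boundary is available as a list of (x, y) points; `BBox`, `bbox_of`, `overlaps`, and `nests` are hypothetical names and not part of fences.

```python
from typing import Iterable, NamedTuple, Tuple


class BBox(NamedTuple):
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def bbox_of(points: Iterable[Tuple[float, float]]) -> BBox:
    """Smallest axis-aligned box covering all points."""
    xs, ys = zip(*points)
    return BBox(min(xs), min(ys), max(xs), max(ys))


def overlaps(a: BBox, b: BBox) -> bool:
    """True if the two boxes share at least one point."""
    return (a.min_x <= b.max_x and b.min_x <= a.max_x
            and a.min_y <= b.max_y and b.min_y <= a.max_y)


def nests(outer: BBox, inner: BBox) -> bool:
    """True if inner lies entirely within outer."""
    return (outer.min_x <= inner.min_x and outer.min_y <= inner.min_y
            and outer.max_x >= inner.max_x and outer.max_y >= inner.max_y)
```

Only pairs that pass `overlaps` or `nests` would then need the more expensive polygon-level comparison.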

andrisi commented 9 years ago

I think it's enough to have the parent's relation ID, no need for the names. And IDs are a must anyway. Which tool creates the errors.json? Because that actually has the relation IDs in it.

My code for building the hierarchy is in PHP, but that's no big deal: if it turns out to work well, we might reimplement it in Node.js. Basically, it first walks from admin_level 2 downwards looking for immediate children with ST_Contains, and then, for all remaining ones (about 50K out of 200K), looks for the closest parent that intersects the boundary by the largest percentage.
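
A rough sketch of that two-pass idea, in plain Python rather than PHP or PostGIS. It is not the actual implementation: axis-aligned rectangles stand in for real boundaries, all names are hypothetical, and "closest parent" is interpreted here as the nearest higher-ranking admin level that overlaps at all, with ties broken by overlap share.

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def contains(outer: Rect, inner: Rect) -> bool:
    """Stand-in for ST_Contains on rectangles."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def overlap_share(parent: Rect, child: Rect) -> float:
    """Fraction of the child's area that lies inside the parent."""
    w = min(parent[2], child[2]) - max(parent[0], child[0])
    h = min(parent[3], child[3]) - max(parent[1], child[1])
    if w <= 0 or h <= 0:
        return 0.0
    return (w * h) / ((child[2] - child[0]) * (child[3] - child[1]))


def build_hierarchy(areas: Dict[int, Tuple[int, Rect]]) -> Dict[int, Optional[int]]:
    """areas maps id -> (admin_level, rect); returns id -> parent id or None."""
    parents: Dict[int, Optional[int]] = {}
    for area_id, (level, rect) in areas.items():
        # Only areas with a smaller admin_level number can be parents.
        candidates = [(pl, pr, pid) for pid, (pl, pr) in areas.items() if pl < level]
        # Pass 1: the deepest candidate that fully contains the area.
        containing = [(pl, pid) for pl, pr, pid in candidates if contains(pr, rect)]
        if containing:
            parents[area_id] = max(containing)[1]
            continue
        # Pass 2: area pokes outside every candidate; prefer the closest
        # level, then the largest overlap share.
        scored = [(pl, overlap_share(pr, rect), pid) for pl, pr, pid in candidates]
        scored = [s for s in scored if s[1] > 0]
        parents[area_id] = max(scored)[2] if scored else None
    return parents
```

With real geometries, `contains` and `overlap_share` would be replaced by the corresponding spatial predicates and area calculations.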

andrisi commented 9 years ago

Haverford Township (rel. 3067317) causes another type of PostGIS error: "Relate Operation called with a LWGEOMCOLLECTION type. This is unsupported.". Instead of being just a MULTIPOLYGON(..., it's a GEOMETRYCOLLECTION(MULTIPOLYGON(... - because there is a single LINESTRING in it, in addition to the POLYGON.

The (temporary) solution for this is outlined at: http://postgis.17.x6.nabble.com/How-to-trim-a-GeometryCollection-to-get-a-MultiPolygon-td3556351.html
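
As an illustration of the trimming step (not the approach from the linked thread), here is a minimal Python sketch that keeps only the polygonal parts of a geometry. It assumes the geometry is available as a GeoJSON-like dict; `polygons_only` is a hypothetical name. Within PostGIS itself, ST_CollectionExtract(geom, 3) serves the same purpose.

```python
def polygons_only(geometry):
    """Return a MultiPolygon holding just the polygons found in geometry.

    geometry is a GeoJSON-like dict. Stray points and linestrings, such as
    the single LINESTRING described above, are dropped.
    """
    polygons = []

    def collect(geom):
        kind = geom["type"]
        if kind == "Polygon":
            polygons.append(geom["coordinates"])
        elif kind == "MultiPolygon":
            polygons.extend(geom["coordinates"])
        elif kind == "GeometryCollection":
            for member in geom["geometries"]:
                collect(member)
        # Any other geometry type is intentionally ignored.

    collect(geometry)
    return {"type": "MultiPolygon", "coordinates": polygons}
```

Dropping the non-polygon members silently is a design choice for this sketch; a real pipeline might prefer to log them so the underlying OSM data can be corrected.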

dianashk commented 9 years ago

@andrisi, FYI a new extract of this data has been published. There are now OSM IDs in every record, and in most errors. I still haven't had much time to look into the missing relations, so you'll still see those.

andrisi commented 8 years ago

Thanks @dianashk! I only got to use the files now. IDs are fine. The Bahamas and Cuba are missing from the planet file, perhaps others too.