Open knaveofdiamonds opened 1 year ago
Yeah seems there's some checks missing that give more context and avoid nasty surprises. Can't tell either if it's spatialite or geos causing problems, but since we moved to geos c api, I doubt it's that. Would you be up for investigating this a bit more?
Finding out min versions for dependencies would be an involved process, ideally we'd have CI check that, but that still requires lots of work, most likely building a range of versions from source and exercising all the code.
I'm currently fiddling around with almalinux8 which has ancient geos 3.7 but spatialite 5.0. It'll take a longer while until I'd go ahead with building admins for the whole planet there, but maybe I remember to check for that.
At any rate, investigating what you found would still be worthwhile regardless of min versions.
I would caution that half the time one admin or another is actually screwed up in osm itself. this is why in practice we don't rebuild the admins every time we create data. if we did routing would be broken half the time. if you are saying that the osm data is good but the code doesn't handle it that's another matter but I'd like to see info backing that up before heading out on the goose chase as it were 😄
if you are saying that the osm data is good but the code doesn't handle it
yes, the OSM data is good - I actually tried with another version of the admin polygons sourced from big query and had the same issues loading into sqlite: so to be clear - I don't think there is any problem with the Valhalla code & how it builds the polygons etc. just a dependency issue. Also, the same OSM data works fine on the dependency versions provided in Ubuntu 22.04 with no valhalla code changes.
Would you be up for investigating this a bit more?
I'm happy to look at detecting the problem and logging out a warning message - hopefully this would help anyone else who encounters the problem by making it really clear the admin database is likely broken when this happens. I'm assuming there is no easy way of specifying the dependencies as we just rely on whatever the OS package manager provides for these.
I'm assuming there is no easy way of specifying the dependencies as we just rely on whatever the OS package manager provides for these.
We could build our own dependencies with the versions we want as part of our CMake build or use third-party package managers more widely, e.g. conan/vcpkg. That does add quite a bit more maintenance hassle to the project and is smth I'd personally like to avoid. EDIT: frankly crap like protobuf was causing lots of headache recently, so doing our own would've certainly been more efficient than working around their fuckups.
I've been toying with the idea of packaging Valhalla & and its dependencies separately though: https://github.com/gis-ops/valhalla-osrm-libs. It's a coincidence that I happened to write that up today:) As someone who's using Valhalla (& OSRM) in all kinds of settings/environments, this idea appeals a lot me.
On Ubuntu 20.04, or probably more relevantly with spatialite 4.3.0 and/or libgeos 3.8.0 valhalla_build_admins cannot cope with all the country geometries (on a planet latest file downloaded around 2023-08-1). The impact is that some countries (USA, Ghana, a couple more) have their geometry end up being null, which means that the wrong countries are assigned to states when we're assigning the parent admin to the state (the ST_COVERS function returns the correct country, but also all the countries with null geometry).
The impact of all the above is that this "roundabout always returning 2nd exit" issue https://github.com/valhalla/valhalla/issues/2320 will happen even if you build the admins database from the full planet file on Ubuntu 20.04.
This might not be worth fixing, but is it worth doing some sort of check that the geom column entries aren't null after building all the admins, and erroring loudly if they are? I didn't get as far as figuring out whether the issue was due to spatialite or geos version, but might also be worth noting minimum versions for them?
Whilst I was researching this, I also noted that apparently the
rowid
column values aren't stable unless you include an explicitINTEGER PRIMARY KEY ASC
column, which is aliased to the rowid - see bullet point 6 in Quirks https://www.sqlite.org/rowidtable.html - this wasn't the cause of any issues, but easily might be I think, as we use them as effectively foreign keys.