Closed tomtaylor closed 4 years ago
OK, so I've inspected the geojson in the sqlite database more thoroughly and it looks like the places that haven't been imported have is_alt set to true:
# Holme Valley, works fine
sqlite> SELECT id, source, is_alt FROM geojson WHERE id = 1360754629;
1360754629|quattroshapes|0
# Huddersfield, doesn't
sqlite> SELECT id, source, is_alt FROM geojson WHERE id = 101750573;
101750573|quattroshapes|1
101750573|quattroshapes_pg|1
# Nuneaton, doesn't
sqlite> SELECT id, source, is_alt FROM geojson WHERE id = 101750471;
101750471|quattroshapes_pg|1
101750471|whosonfirst|1
# Sittingbourne, doesn't
sqlite> SELECT id, source, is_alt FROM geojson WHERE id = 101853501;
101853501|quattroshapes|1
101853501|quattroshapes_pg|1
101853501|whosonfirst|1
# Hackney, works fine
sqlite> SELECT id, source, is_alt FROM geojson WHERE id = 1158857273;
1158857273|gbr-datalondon|0
It looks like this is expected behaviour with the whosonfirst importer.
I'm now thinking this might be an issue with how the SQLite distribution is generated... @missinglink it looks like you might be working on something related?
Sounds a lot like the bug I fixed yesterday. https://github.com/pelias/wof/pull/13
Please try downloading the SQLite database again and checking the same IDs; you should find exactly one record with is_alt=0 per ID.
Thanks @missinglink - I don't think whosonfirst-data-admin-gb-latest.db.bz2 has updated yet. I still get the same results with the new file. Is this still rolling out or did something go awry?
Can you please post a shasum of the database file and paste a query that shows no is_alt=0? I'll have a look tomorrow.
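For anyone wanting to check a whole database rather than individual IDs, here is a minimal sketch of a query that lists every ID with no is_alt=0 record. It assumes only the `geojson` table and the `id`, `source` and `is_alt` columns seen in the queries in this thread; the function names and the in-memory demo table are hypothetical, with the demo rows copied from the output pasted earlier.

```python
import sqlite3


def ids_without_primary(conn):
    """Return IDs in the geojson table that have no row with is_alt = 0."""
    query = """
        SELECT id
        FROM geojson
        GROUP BY id
        HAVING SUM(CASE WHEN is_alt = 0 THEN 1 ELSE 0 END) = 0
        ORDER BY id
    """
    return [row[0] for row in conn.execute(query)]


def build_demo_db():
    """Build a tiny in-memory stand-in for the real database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE geojson (id INTEGER, source TEXT, is_alt INTEGER)")
    conn.executemany(
        "INSERT INTO geojson VALUES (?, ?, ?)",
        [
            (1360754629, "quattroshapes", 0),    # Holme Valley
            (101750573, "quattroshapes", 1),     # Huddersfield
            (101750573, "quattroshapes_pg", 1),  # Huddersfield
            (1158857273, "gbr-datalondon", 0),   # Hackney
        ],
    )
    return conn


if __name__ == "__main__":
    print(ids_without_primary(build_demo_db()))  # prints [101750573]
```

To run it against the real file, pass `sqlite3.connect("whosonfirst-data-admin-gb-latest.db")` instead of the demo connection.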
Agh damn I think you're right https://github.com/whosonfirst-data/whosonfirst-data-admin-gb/blob/master/data/101/853/501/101853501-alt-whosonfirst.geojson
I'll figure out a fix
Sure thing, thanks for that.
> sqlite shasum -a 256 whosonfirst-data-admin-gb-latest.db.bz2
044dc0e263647a487dc192740f7619ee1536c8cf3f8c927a1d7f09e862cb0c09 whosonfirst-data-admin-gb-latest.db.bz2
> sqlite shasum -a 256 whosonfirst-data-admin-gb-latest.db
d6a43a27bc6fd6412400d3b679e5c1a417b58fd7fc59a9cf14c05531c00c992b whosonfirst-data-admin-gb-latest.db
> sqlite sqlite3 whosonfirst-data-admin-gb-latest.db
SQLite version 3.28.0 2019-04-15 14:49:49
Enter ".help" for usage hints.
sqlite> SELECT id, source, is_alt FROM geojson WHERE id = 101750573;
101750573|quattroshapes|1
101750573|quattroshapes_pg|1
sqlite>
Fix merged in https://github.com/pelias/wof/pull/16, data files are being regenerated by @pelias-bot
Great, thank you!
shasum -a 256 whosonfirst-data-admin-gb-latest.db
14d758d982e0d2661563ce761fc7d079df981a4eee1cf11d694fa28dbebf4e69 whosonfirst-data-admin-gb-latest.db
sqlite3 whosonfirst-data-admin-gb-latest.db 'SELECT id, source, alt_label, is_alt FROM geojson WHERE id = 101750573;'
101750573|quattroshapes||0
101750573|quattroshapes|quattroshapes|1
101750573|quattroshapes_pg|quattroshapes_pg|1
Looks like it was fixed. The files are generated alphabetically and it's up to 'H', so they'll all get uploaded in the next couple of hours.
curl 'https://data.geocode.earth/wof/dist/sqlite/whosonfirst-data-admin-gb-latest.db.bz2' | lbunzip2 | tee >(shasum -a 256) > whosonfirst-data-admin-gb-latest.db
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 367M 100 367M 0 0 13.4M 0 0:00:27 0:00:27 --:--:-- 15.8M
14d758d982e0d2661563ce761fc7d079df981a4eee1cf11d694fa28dbebf4e69 -
sqlite3 whosonfirst-data-admin-gb-latest.db 'SELECT id, source, alt_label, is_alt FROM geojson WHERE id = 101750573;'
101750573|quattroshapes||0
101750573|quattroshapes|quattroshapes|1
101750573|quattroshapes_pg|quattroshapes_pg|1
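If lbunzip2 isn't available, the decompress-and-checksum part of the pipeline above can be approximated with Python's standard library. This is only a sketch: `decompress_and_hash` is a hypothetical helper name, and it takes any iterable of compressed byte chunks (for example, reads from the downloaded .bz2 file). It starts a fresh decompressor whenever one bzip2 stream ends, on the assumption that the archive might contain several concatenated streams.

```python
import bz2
import hashlib
import io


def decompress_and_hash(chunks, out):
    """Decompress bzip2 chunks into `out`; return the SHA-256 hex digest
    of the decompressed bytes."""
    decompressor = bz2.BZ2Decompressor()
    digest = hashlib.sha256()
    for chunk in chunks:
        while chunk:
            data = decompressor.decompress(chunk)
            digest.update(data)
            out.write(data)
            if decompressor.eof:
                # End of one bzip2 stream: carry leftover bytes into a new one.
                chunk = decompressor.unused_data
                decompressor = bz2.BZ2Decompressor()
            else:
                chunk = b""
    return digest.hexdigest()


if __name__ == "__main__":
    # Self-contained demo on in-memory data rather than the real download.
    payload = b"example payload " * 1000
    compressed = bz2.compress(payload)
    pieces = [compressed[i:i + 64] for i in range(0, len(compressed), 64)]
    sink = io.BytesIO()
    checksum = decompress_and_hash(pieces, sink)
    print(checksum == hashlib.sha256(payload).hexdigest())
```

For the real file, something like `iter(lambda: src.read(1 << 20), b"")` over the opened .bz2 would supply the chunks, and the returned digest can be compared with the shasum output above.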
Thanks for the bug report. The store.sqlite3.gz file we are hosting will also need regenerating, so I'll kick that off now; it takes hours to complete.
If the problem is solved for you, please close the GitHub issue. FYI, we recently started an OpenCollective; we are hoping to use the funds to hire someone part time to keep the community assets/code up to date.
This issue should now be resolved?
Please reopen if you find it's not fixed.
I've just set up a test Pelias installation locally using Docker. I'm using this pelias.json to load in the whole of the UK, and running the following commands:
(I don't need street/address geocoding.)
Most of the places I'd expect have loaded in fine, but some are missing. For example: Nuneaton, Huddersfield, Sittingbourne. They all exist in my local WOF sqlite database, but aren't present in the Elasticsearch index. They work fine on the geocode.earth online tool.
Take Huddersfield. It's not in the Elasticsearch index:
While a sibling locality, Holme Valley, has loaded just fine:
I've run pelias import wof multiple times, with no errors produced. I've also tried flushing the index. Is there a way of debugging why they might not be getting loaded?