whosonfirst / go-whosonfirst-tile38

Go package for working with Who's On First documents and the Tile38 datastore
BSD 3-Clause "New" or "Revised" License

Update to index existential flags #14

Closed by thisisaaronland 6 years ago

thisisaaronland commented 6 years ago

Replaces https://github.com/whosonfirst/go-whosonfirst-tile38/issues/12 and https://github.com/whosonfirst/go-whosonfirst-tile38/issues/11

Depends on https://github.com/whosonfirst/go-whosonfirst-tile38/issues/13

thisisaaronland commented 6 years ago

This appears to mostly be working, although anecdotally mz:is_current is not being indexed correctly: a NEARBY query filtered on it returns no points, and two records cannot be found by ID at all. It is unclear whether these two things are related.

NEARBY whosonfirst WHERE mz:is_current -1 -1 POINTS POINT 45.52861 -73.575554 50000
{"ok":true,"fields":["wof:id","wof:placetype_id","wof:parent_id","mz:is_current","mz:is_deprecated","mz:is_ceased","mz:is_superseded","mz:is_superseding"],"points":[],"count":0,"cursor":0,"elapsed":"293.325385ms"}

GET whosonfirst 1108936295#whosonfirst-data WITHFIELDS POINT
 (error) id not found
GET whosonfirst 1108960939#whosonfirst-data WITHFIELDS POINT 
 (error) id not found

thisisaaronland commented 6 years ago

?method=whosonfirst.places.search&api_key=mapzen-xxxxxxx&placetype=venue&iso=CA&is_current=-1&extras=geom:,mz:&page=1&per_page=100

{
            "wof:id": 975052231,
            "wof:parent_id": "-1",
            "wof:name": "Badinotti Netting",
            "wof:placetype": "venue",
            "wof:country": "CA",
            "wof:repo": "whosonfirst-data-venue-ca",
            "geom:bbox": "-79.7197952271,44.4464492798,-79.7197952271,44.4464492798",
            "geom:longitude": -79.719795,
            "geom:type": "Point",
            "geom:area": 0,
            "geom:latitude": 44.446449,
            "geom:area_square_m": "0.0",
            "mz:categories": [],
            "mz:hierarchy_label": "1",
            "mz:is_current": "-1",
            "mz:filesize": "0",
            "mz:uri": "https://whosonfirst.mapzen.com/data/975/052/231/975052231.geojson"
        }

NEARBY whosonfirst LIMIT 1 WHERE mz:is_current -1 -1 POINTS POINT 44.446449 79.719795 50000
{"ok":true,"fields":["wof:id","wof:placetype_id","wof:parent_id","mz:is_current","mz:is_deprecated","mz:is_ceased","mz:is_superseded","mz:is_superseding"],"points":[],"count":0,"cursor":0,"elapsed":"205.283175ms"}

thisisaaronland commented 6 years ago

This might be a bug in go-whosonfirst-geojson-v2:

grep mz:is_current /usr/local/data/whosonfirst-data/data/110/893/629/5/1108936295.geojson
    "mz:is_current":-1,

./bin/wof-tile38-index -debug /usr/local/data/whosonfirst-data/data/110/893/629/5/1108936295.geojson 
2017/08/02 17:47:31 SET whosonfirst-neighbourhood 1108936295#whosonfirst-data FIELD wof:id 1108936295 FIELD wof:placetype_id 102312319 FIELD wof:parent_id 1108973843 FIELD mz:is_current 0 FIELD mz:is_deprecated 0 FIELD mz:is_ceased 0 FIELD mz:is_superseded 0 FIELD mz:is_superseding 0 ...
2017/08/02 17:47:31 SET whosonfirst-neighbourhood 1108936295#meta STRING {"wof:name":"Lakewood","wof:country":"CA"}
2017/08/02 17:47:31 SET 1108936295#whosonfirst-data
2017/08/02 17:47:31 SET 1108936295#meta

thisisaaronland commented 6 years ago

Nope, it's an indexing bug here:

go-whosonfirst-geojson-v2> ./bin/wof-geojson-existential /usr/local/data/whosonfirst-data/data/110/893/629/5/1108936295.geojson
2017/08/02 17:52:43 is current false (certainty: false)

go-whosonfirst-geojson-v2> ./bin/wof-geojson-existential /usr/local/data/whosonfirst-data/data/856/698/67/85669867.geojson
2017/08/02 17:53:46 is current false (certainty: true)
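
For reference, the two outputs above are consistent with a three-state flag: a raw value of -1 means "unknown", which reads as not current but without certainty, while a definite "not current" reads as false with certainty. A minimal Go sketch of that mapping follows. The type and function names are hypothetical and are not the go-whosonfirst-geojson-v2 API, and the assumption that a certain "false" corresponds to a raw 0 is inferred rather than shown in the logs.

```go
package main

// ExistentialFlag is a hypothetical three-state flag: Status is the boolean
// answer and Known records whether that answer is certain.
type ExistentialFlag struct {
	Status bool
	Known  bool
}

// FlagFromRaw interprets a raw mz:is_current style value (1, 0 or -1).
// Treating every other value as "unknown" is an assumption of this sketch.
func FlagFromRaw(raw int) ExistentialFlag {
	switch raw {
	case 1:
		return ExistentialFlag{Status: true, Known: true}
	case 0:
		return ExistentialFlag{Status: false, Known: true}
	default:
		return ExistentialFlag{Status: false, Known: false}
	}
}

// FieldValue converts the flag back into the integer that would be stored
// as a Tile38 FIELD: 1 for true, 0 for a certain false, -1 for unknown.
func (f ExistentialFlag) FieldValue() int {
	if !f.Known {
		return -1
	}
	if f.Status {
		return 1
	}
	return 0
}
```

With this shape, code that looks only at Status would write 0 for a record whose raw value is -1, which is one way to end up with the "mz:is_current 0" seen in the earlier -debug output.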

thisisaaronland commented 6 years ago

Better...

./bin/wof-tile38-index  -verbose /usr/local/data/whosonfirst-data/data/110/893/629/5/1108936295.geojson
2017/08/02 18:22:02 SET whosonfirst 1108936295#whosonfirst-data FIELD wof:id 1108936295 FIELD wof:placetype_id 102312319 FIELD wof:parent_id 1108973843 FIELD mz:is_current -1 FIELD mz:is_deprecated 0 FIELD mz:is_ceased 0 FIELD mz:is_superseded 0 FIELD mz:is_superseding 0 ...
2017/08/02 18:22:02 SET whosonfirst 1108936295#meta STRING {"wof:name":"Lakewood","wof:country":"CA"}
2017/08/02 18:22:02 SET 1108936295#whosonfirst-data
2017/08/02 18:22:02 SET 1108936295#meta

127.0.0.1:9851> GET whosonfirst 1108936295#whosonfirst-data WITHFIELDS POINT
{"ok":true,"point":{"lat":53.45863504476891,"lon":-113.45035692454066},"fields":{"mz:is_current":-1,"wof:id":1108936295,"wof:parent_id":1108973843,"wof:placetype_id":102312319},"elapsed":"47.119µs"}

127.0.0.1:9851> NEARBY whosonfirst WHERE mz:is_current -1 -1 POINTS POINT 53.458 -113.450 5000
{"ok":true,"fields":["wof:id","wof:placetype_id","wof:parent_id","mz:is_current","mz:is_deprecated","mz:is_ceased","mz:is_superseded","mz:is_superseding"],"points":[{"id":"1108936295#whosonfirst-data","point":{"lat":53.45863504476891,"lon":-113.45035692454066},"fields":[1108936295,102312319,1108973843,-1,0,0,0,0]}],"count":1,"cursor":0,"elapsed":"44.966085ms"}
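
The query above matches a single field value by passing the same number as both the minimum and the maximum of the WHERE range. As a rough sketch, a command string of that shape could be assembled in Go as follows. The NearbyQuery type and its fields are made up for illustration and do not come from this package.

```go
package main

import (
	"strconv"
	"strings"
)

// NearbyQuery is a hypothetical holder for the parts of the Tile38 NEARBY
// command used in this thread.
type NearbyQuery struct {
	Collection string  // Tile38 key, for example "whosonfirst"
	Field      string  // field to filter on, for example "mz:is_current"
	Value      int     // exact value to match (used as both min and max)
	Lat, Lon   float64 // centre of the search
	Meters     int     // search radius
}

// Command renders the query as a single command string.
func (q NearbyQuery) Command() string {
	v := strconv.Itoa(q.Value)
	parts := []string{
		"NEARBY", q.Collection,
		"WHERE", q.Field, v, v, // min == max selects exactly one value
		"POINTS", "POINT",
		strconv.FormatFloat(q.Lat, 'f', -1, 64),
		strconv.FormatFloat(q.Lon, 'f', -1, 64),
		strconv.Itoa(q.Meters),
	}
	return strings.Join(parts, " ")
}
```

Keeping latitude and longitude as named fields, rather than positional arguments, makes it harder to swap them or drop a sign when copying coordinates by hand.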

Still unclear why this didn't get indexed the first time around. The mz:is_current stuff should not have triggered any errors...

thisisaaronland commented 6 years ago

Right, well there was this bug... https://github.com/whosonfirst/go-whosonfirst-geojson-v2/commit/959ba9325de62facb3641d71f24d065beb9318b1

thisisaaronland commented 6 years ago

More more better...

./bin/wof-geojson-existential /usr/local/data/whosonfirst-data/data/110/896/093/9/1108960939.geojson
2017/08/02 18:39:18 is current:true certainty: true raw:1
2017/08/02 18:39:18 is deprecated:false raw:
2017/08/02 18:39:18 is ceased:false raw:uuuu
2017/08/02 18:39:18 is superseded:false raw:[]
2017/08/02 18:39:18 is superseding:true raw:[
        85897635
    ]
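
Read together, the raw values above suggest how each flag might be derived: an empty or "uuuu" date counts as not set, and a list counts as set when it has at least one entry. The sketch below encodes that reading in Go. The function names are hypothetical, and the rules are an interpretation of this output rather than the package's actual logic.

```go
package main

// dateIsSet reports whether a raw date string carries a value.
// Treating "" and "uuuu" as "not set" is an assumption based on the log output.
func dateIsSet(raw string) bool {
	return raw != "" && raw != "uuuu"
}

// listIsSet reports whether a list of IDs has at least one entry.
func listIsSet(ids []int64) bool {
	return len(ids) > 0
}

// fieldValue maps a boolean flag to the 0/1 integer used in a Tile38 FIELD.
func fieldValue(b bool) int {
	if b {
		return 1
	}
	return 0
}

// DerivedFlags returns the deprecated, ceased, superseded and superseding
// flags, in that order, as 0/1 field values.
func DerivedFlags(deprecated, ceased string, supersededBy, supersedes []int64) [4]int {
	return [4]int{
		fieldValue(dateIsSet(deprecated)),
		fieldValue(dateIsSet(ceased)),
		fieldValue(listIsSet(supersededBy)),
		fieldValue(listIsSet(supersedes)),
	}
}
```

Called with the raw values printed above (an empty deprecated date, a ceased date of "uuuu", an empty superseded list and a superseding list holding 85897635), this yields 0, 0, 0 and 1.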

./bin/wof-tile38-index -verbose /usr/local/data/whosonfirst-data/data/110/896/093/9/1108960939.geojson
2017/08/02 18:38:02 SET whosonfirst 1108960939#whosonfirst-data FIELD wof:id 1108960939 FIELD wof:placetype_id 102312321 FIELD wof:parent_id -1 FIELD mz:is_current 1 FIELD mz:is_deprecated 0 FIELD mz:is_ceased 0 FIELD mz:is_superseded 0 FIELD mz:is_superseding 1 ...
2017/08/02 18:38:02 SET whosonfirst 1108960939#meta STRING {"wof:name":"Westmount Square","wof:country":"CA"}
2017/08/02 18:38:02 SET 1108960939#whosonfirst-data
2017/08/02 18:38:02 SET 1108960939#meta

GET whosonfirst 1108960939#whosonfirst-data WITHFIELDS POINT
{"ok":true,"point":{"lat":45.487150654283745,"lon":-73.5880874125232},"fields":{"mz:is_current":1,"mz:is_superseding":1,"wof:id":1108960939,"wof:parent_id":-1,"wof:placetype_id":102312321},"elapsed":"63.833µs"}

127.0.0.1:9851> NEARBY whosonfirst WHERE mz:is_current 1 1 POINTS POINT 45.487 -73.588 5000
{"ok":true,"fields":["wof:id","wof:placetype_id","wof:parent_id","mz:is_current","mz:is_deprecated","mz:is_ceased","mz:is_superseded","mz:is_superseding"],"points":[{"id":"1108960939#whosonfirst-data","point":{"lat":45.487150654283745,"lon":-73.5880874125232},"fields":[1108960939,102312321,-1,1,0,0,0,1]}],"count":1,"cursor":0,"elapsed":"106.887667ms"}

thisisaaronland commented 6 years ago

Re-indexing dev and (eventually-new) prod servers...