whosonfirst / whosonfirst-www-spelunker

A simple Flask-based spelunker for poking around Who's On First data
BSD 3-Clause "New" or "Revised" License

cannot see id 1141909453 in spelunker #128

Closed missinglink closed 6 years ago

missinglink commented 6 years ago

heya, I'm getting a 404 error when trying to load 1141909453 in the spelunker

missinglink commented 6 years ago

also 1141907907

missinglink commented 6 years ago

also 1141906837

nvkelso commented 6 years ago

Looks like these are all related to the same PRs:

Over to @thisisaaronland to investigate why the features are on GitHub but not on S3 or in ES for the Spelunker.

thisisaaronland commented 6 years ago

Which PR is this? There was one recently that triggered a gazillion updates, which makes go-whosonfirst-updated (well, Redis, really) sad:

https://github.com/whosonfirst/go-whosonfirst-updated/issues/16

thisisaaronland commented 6 years ago

The data appears to have been replicated everywhere except ES so that narrows it down a bit...

nvkelso commented 6 years ago

The links above are for the history of those files. Not sure why the PR numbers aren't auto linking.

thisisaaronland commented 6 years ago

Something is causing the ES indexing process (or at least certain batches) to fail to index. Still investigating...

thisisaaronland commented 6 years ago

Something about these records makes ES unhappy...

/usr/local/bin/wof-es-index-files --index spelunker --host 127.0.0.1 --verbose ./data/114/190/790/7/1141907907.geojson
...
DEBUG:urllib3.util.retry:Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1
DEBUG:urllib3.connectionpool:http://127.0.0.1:9200 "POST /spelunker/locality/1141907907 HTTP/1.1" 400 279
DEBUG:mapzen.whosonfirst.elasticsearch:Finished call to 'mapzen.whosonfirst.elasticsearch.do_index' after 0.019(s), this was the 1st time calling it.
DEBUG:urllib3.util.retry:Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1
DEBUG:urllib3.connectionpool:http://127.0.0.1:9200 "POST /spelunker/locality/1141907907 HTTP/1.1" 400 279
DEBUG:mapzen.whosonfirst.elasticsearch:Finished call to 'mapzen.whosonfirst.elasticsearch.do_index' after 5.029(s), this was the 2nd time calling it.
DEBUG:urllib3.util.retry:Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1
DEBUG:urllib3.connectionpool:http://127.0.0.1:9200 "POST /spelunker/locality/1141907907 HTTP/1.1" 400 279
DEBUG:mapzen.whosonfirst.elasticsearch:Finished call to 'mapzen.whosonfirst.elasticsearch.do_index' after 10.038(s), this was the 3rd time calling it.
DEBUG:urllib3.util.retry:Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1
DEBUG:urllib3.connectionpool:http://127.0.0.1:9200 "POST /spelunker/locality/1141907907 HTTP/1.1" 400 279
DEBUG:mapzen.whosonfirst.elasticsearch:Finished call to 'mapzen.whosonfirst.elasticsearch.do_index' after 15.046(s), this was the 4th time calling it.
DEBUG:urllib3.util.retry:Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1
DEBUG:urllib3.connectionpool:http://127.0.0.1:9200 "POST /spelunker/locality/1141907907 HTTP/1.1" 400 279
DEBUG:mapzen.whosonfirst.elasticsearch:Finished call to 'mapzen.whosonfirst.elasticsearch.do_index' after 20.055(s), this was the 5th time calling it.
ERROR:root:failed to index http://127.0.0.1:9200/spelunker/locality/1141907907: RetryError[<Future at 0x7f5e3fdd5350 state=finished raised Exception>]

This also happens for 1141909453
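For what it's worth, the log above shows the same POST coming back 400 five times, roughly five seconds apart, before the RetryError. A 400 means ES rejected the document itself, so waiting and resending the same body won't change the answer. Below is a minimal sketch of a retry wrapper that gives up immediately on client errors and only retries everything else. It is not the code in mapzen.whosonfirst.elasticsearch; `ClientError` and `index_with_retry` are hypothetical names made up for illustration.

```python
import time


class ClientError(Exception):
    """Hypothetical exception: the server answered with a 4xx status."""

    def __init__(self, status, body=""):
        super().__init__(f"HTTP {status}: {body}")
        self.status = status
        self.body = body


def index_with_retry(do_index, attempts=5, wait=5.0, sleep=time.sleep):
    """Call do_index() until it succeeds, retrying only non-client errors.

    do_index is any zero-argument callable that raises ClientError on a
    4xx response and some other exception on transient failures.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return do_index()
        except ClientError:
            # The request itself is bad; resending it cannot help.
            raise
        except Exception as e:
            last_error = e
            if attempt < attempts:
                sleep(wait)
    raise last_error
```

Failing fast this way would also surface the response body (the part that actually says what is wrong) on the first attempt instead of after 20 seconds.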

thisisaaronland commented 6 years ago

I hhhhhhhhhhaaaaaaaaaaattttttttteeeeeeessssssssss you Elasticsearch, I hates you...

'{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"failed to parse [ne:MAX_POP10]"}],"type":"mapper_parsing_exception","reason":"failed to parse [ne:MAX_POP10]","caused_by":{"type":"number_format_exception","reason":"For input string: \\"0.0\\""}},"status":400}'

thisisaaronland commented 6 years ago

because we can't have nice things...

https://github.com/whosonfirst/py-mapzen-whosonfirst-search/commit/f7deaf392a81d21f3875d302d0adbe46090b12ba#diff-c1683b61159fb490d1baf4bc31d42bb0R410
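The error above says ES could not parse the string "0.0" for ne:MAX_POP10 (a number_format_exception), which suggests the field is mapped as an integer type while the record carries a float-looking string. One way to work around that is to coerce such values before indexing. This is only a sketch under that assumption and does not reproduce the linked commit; `coerce_int_property` is a hypothetical helper name.

```python
def coerce_int_property(props, key):
    """Return a copy of props with props[key] coerced to an int.

    Values like "0.0", "12", 3.0 or 7 become ints. Values that cannot be
    interpreted as a finite number are dropped so ES never sees them.
    """
    out = dict(props)
    value = out.get(key)
    if value is None:
        return out
    try:
        # float() accepts "0.0"; int() then truncates toward zero.
        out[key] = int(float(value))
    except (TypeError, ValueError, OverflowError):
        # Not a usable number (e.g. "n/a", "nan", "inf"): drop the key.
        del out[key]
    return out
```

For example, `coerce_int_property({"ne:MAX_POP10": "0.0"}, "ne:MAX_POP10")` yields a dict whose ne:MAX_POP10 value is the integer 0. Whether dropping unparseable values is acceptable depends on how the field is used downstream.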

missinglink commented 5 years ago

Hi @thisisaaronland, @stepps00, this bug seems to have returned :( Is it possible that the spelunker hasn't been synced since that record was added in November?

thisisaaronland commented 5 years ago

I'll take a look shortly.

stepps00 commented 5 years ago

Looks like a portion of the GeoNames locality imports have not made their way into the Spelunker yet. I've also opened https://github.com/whosonfirst/whosonfirst-www-spelunker/issues/141 to track.

missinglink commented 5 years ago

Got caught by the same issue again today :( https://spelunker.whosonfirst.org/id/1327145573/

That ID is found in my sqlite database dated 12 Feb but not found in the Spelunker.
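A rough way to find other records in the same state is to walk the IDs in the SQLite database and ask the Spelunker for each one. The sketch below assumes the database has a table (here `geojson`) with an integer `id` column, which may not match your distribution, and it uses the https://spelunker.whosonfirst.org/id/<id>/ URL pattern from the link above. The function names are made up for illustration.

```python
import sqlite3
import urllib.error
import urllib.request

# URL pattern taken from the Spelunker link earlier in this thread.
SPELUNKER_URL = "https://spelunker.whosonfirst.org/id/{id}/"


def http_status(url):
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=10) as rsp:
            return rsp.status
    except urllib.error.HTTPError as e:
        return e.code


def missing_from_spelunker(conn, table="geojson", status=http_status):
    """Yield IDs that are in the SQLite table but 404 in the Spelunker.

    ASSUMPTION: `table` has an integer `id` column. The table name is
    interpolated directly, so only pass trusted values.
    """
    for (wof_id,) in conn.execute(f"SELECT id FROM {table} ORDER BY id"):
        if status(SPELUNKER_URL.format(id=wof_id)) == 404:
            yield wof_id
```

Usage would look something like `list(missing_from_spelunker(sqlite3.connect("whosonfirst.db")))`, where the filename is a placeholder. This makes one request per record, so for a large database it is worth limiting the query or adding a delay.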