whosonfirst / go-whosonfirst-pip-v2

An in-memory point-in-polygon (reverse geocoding) package for GeoJSON data, principally Who's On First data.
BSD 3-Clause "New" or "Revised" License

error: invalid character 'v' looking for beginning of value #15

Closed HIRANO-Satoshi closed 6 years ago

HIRANO-Satoshi commented 6 years ago

I would like to report that I got the following error during indexing.

./bin/wof-pip-server -port 8300 -mode directory ~/whosonfirst-data/data
12:25:55.267986 [wof-pip-server] STATUS listening on localhost:8300
12:25:56.272296 [wof-pip-server] STATUS indexing 3023 records indexed
12:25:57.271904 [wof-pip-server] STATUS indexing 6554 records indexed
12:25:58.273400 [wof-pip-server] STATUS indexing 10200 records indexed
12:25:59.276565 [wof-pip-server] STATUS indexing 13596 records indexed
......
12:34:36.252812 [wof-pip-server] STATUS indexing 778120 records indexed
12:34:37.251508 [wof-pip-server] STATUS indexing 778120 records indexed
error: invalid character 'v' looking for beginning of value
12:34:37.980572 [wof-pip-server] STATUS finished indexing

I used the latest data and pip-server-v2.

whosonfirst-data commit 1fa48fd5b9b875fa016a86729d7c99a85e916a95
go-whosonfirst-pip-v2 commit d646dc151a6d8c52a395a4ba4dd430d590f3a417

OS: Mac OS X 10.12.6
go: go version go1.9.2 darwin/amd64

thisisaaronland commented 6 years ago

Thanks, I will investigate. That suggests that one of the underlying GeoJSON records was improperly encoded.

HIRANO-Satoshi commented 6 years ago

Thanks for your reply.

Sorry, I needed to run the following Git LFS commands in ~/whosonfirst-data:

git lfs fetch
git lfs checkout
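
For readers wondering why the error mentions 'v': until the real content is fetched, Git LFS leaves a small text pointer in place of each large file, and that pointer begins with the word "version". The sketch below assumes the indexer decodes each file with Go's encoding/json, which the wording of the error suggests. The names `samplePointer` and `parseFeature` are hypothetical and are not part of go-whosonfirst-pip-v2, and the oid and size values are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// samplePointer mimics the text Git LFS leaves in the working tree in place
// of a large file that has not been fetched yet. The oid and size values
// are placeholders, not taken from any real record.
const samplePointer = "version https://git-lfs.github.com/spec/v1\n" +
	"oid sha256:0000000000000000000000000000000000000000000000000000000000000000\n" +
	"size 123456789\n"

// parseFeature (hypothetical helper) tries to decode raw bytes as a JSON
// object, roughly as a GeoJSON reader would.
func parseFeature(raw []byte) error {
	var feature map[string]interface{}
	return json.Unmarshal(raw, &feature)
}

func main() {
	err := parseFeature([]byte(samplePointer))
	// The decoder stops at the first byte, the 'v' of "version".
	fmt.Println("error:", err)
}
```

Run as written, this prints the same message seen in the indexing log above.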

thisisaaronland commented 6 years ago

Ugh... yes, "git lfs". This is a perennial problem and one we're not entirely sure how to deal with, outside of better (more prominent) documentation. The issue is two-fold:

  1. GitHub (or any other hosted service provider) will always have a ceiling on file sizes, at least for the foreseeable future
  2. Any given WOF record will have a "ground truth" geometry (in addition to any other simplified versions, etc.) which, like Mandelbrot's coastline of Britain, will only grow in size as measuring instruments get more precise.

Currently this is only a problem for New Zealand (and the meta files, discussed below), which has a > 100MB geometry, but since it could potentially happen for any country we have chosen to force the issue now. I apologize that you've gotten caught up in the ongoing work to make this easier than it is. You're not the only one... :-P

So in the meantime, yes: you need to remember to do the git lfs dance. If you can suggest a place in the documentation where it would be useful for us to bring this to people's attention, that would be very helpful.

As mentioned, "meta" files are being phased out of the core (Git) distributions in favour of cross-platform binary tools for generating them on demand, or via Git hooks when a repo is updated. This has been rolled out for all the repos except whosonfirst-data, which does both for now:

https://github.com/whosonfirst-data/whosonfirst-data/blob/master/meta/README.md