pelias / whosonfirst

Importer for Who's on First gazetteer
MIT License

Sqlite import crashes in docker / k8s #452

Closed jdomag closed 5 years ago

jdomag commented 5 years ago

Every time I run the importer with sqlite=true, I see a lot of errors like this:

{"level":"error","message":"doc generator error: invalid document type, expecting: string got: Neuburg am Inn","label":"whosonfirst","timestamp":"2019-06-06T10:13:13.748Z"}                                                                  
{"level":"error","message":"{\n  \"id\": 101913425,\n  \"name\": \"Neuburg a. Inn\",\n  \"name_aliases\": [\n    \"Neuburg am Inn\"\n  ],\n  \"name_langs\": {\n    \"hy\": [\n      \"Իննի Նոյբուրգ\"\n    ],\n    \"eu\": [\n      \"Neuburg am Inn\"\n    ],\n    \"zh\": [\n      \"因河畔诺伊布尔格\",\n      \"因河畔诺伊堡\"\n    ],\n    \"de\": [\n      [\n        \"Neuburg am Inn\"\n      ],\n      \"Neuburg am Inn\"\n    ],\n    \"nl\": [\n      \"Neuburg am Inn\"\n    ],\n    \"en\": [\n      \"Neuburg am Inn\"\n    ],\n    \"eo\": [\n      \"Neuburg am Inn\"\n    ],\n    \"fa\": [\n      \"نویبورگ ام این\"\n    ],\n    \"fi\": [\n      \"Neuburg am Inn\"\n    ],\n    \"fr\": [\n      \"Neuburg am Inn\"\n    ],\n    \"hu\": [\n      \"Neuburg am Inn\"\n    ],\n    \"id\": [\n      \"Neuburg am Inn\"\n    ],\n    \"it\": [\n      \"Neuburg am Inn\"\n    ],\n    \"kk\": [\n      \"Нойбург-на-Инне\"\n    ],\n    \"ms\": [\n      \"Neuburg am Inn\"\n    ],\n    \"pl\": [\n      \"Neuburg am Inn\"\n    ],\n    \"pt\": [\n      \"Neuburgo\"\n    ],\n    \"ro\": [\n      \"Neuburg am Inn\"\n    ],\n    \"ru\": [\n      \"Нойбург-на-Инне\"\n    ],\n    \"sr\": [\n      \"Нојбург ам Ин\"\n    ],\n    \"uk\": [\n      \"Нойбург-ам-Інн\"\n    ],\n    \"uz\": [\n      \"Neuburg am Inn\"\n    ],\n    \"vi\": [\n      \"Neuburg am Inn\"\n    ],\n    \"vo\": [\n      \"Neuburg am Inn\"\n    ]\n  },\n  \"place_type\": \"locality\",\n  \"lat\": 48.506881,\n  \"lon\": 13.449444,\n  \"bounding_box\": \"13.3451363185,48.4776949752,13.4580396724,48.5566289876\",\n  \"population\": 4313,\n  \"hierarchies\": [\n    {\n      \"continent_id\": 102191581,\n      \"country_id\": 85633111,\n      \"county_id\": 1377675609,\n      \"localadmin_id\": 1377684733,\n      \"locality_id\": 101913425,\n      \"macrocounty_id\": 404227559,\n      \"region_id\": 85682571\n    }\n  ]\n}","label":"whosonfirst","timestamp":"2019-06-06T10:13:13.748Z"}

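In the record logged with this error, the "de" entry of name_langs holds a nested array next to a plain string, which appears to be what triggers the "expecting: string" message. The sketch below is not the importer's own code. It is a hypothetical helper (findNonStringNames) that lists name_langs entries that are not plain strings, run against a trimmed copy of the logged record:

```javascript
// Hypothetical helper, not part of pelias/whosonfirst:
// list every name_langs entry that is not a plain string.
function findNonStringNames(record) {
  const problems = [];
  for (const [lang, names] of Object.entries(record.name_langs || {})) {
    const list = Array.isArray(names) ? names : [names];
    list.forEach((name, index) => {
      if (typeof name !== 'string') {
        problems.push({ lang, index, value: name });
      }
    });
  }
  return problems;
}

// Trimmed copy of the record from the error log (most languages omitted).
const record = {
  id: 101913425,
  name: 'Neuburg a. Inn',
  name_langs: {
    de: [['Neuburg am Inn'], 'Neuburg am Inn'],
    en: ['Neuburg am Inn'],
    pt: ['Neuburgo']
  }
};

console.log(JSON.stringify(findNonStringNames(record)));
// prints [{"lang":"de","index":0,"value":["Neuburg am Inn"]}]
```

Only the "de" entry is reported, which matches the language that has the extra level of nesting in the log.
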
And eventually the whole import process crashes:


ocaladmin":4647,"level":"info","message":"","label":"dbclient-whosonfirst","timestamp":"2019-06-06T10:13:47.915Z"}
{"level":"info","message":"Loading whosonfirst-data-admin-pl-latest.db database from /data/whosonfirst/sqlite","label":"whosonfirst","timestamp":"2019-06-06T10:13:50.995Z"}
{"paused":true,"transient":5,"current_length":0,"indexed":93000,"batch_ok":186,"batch_retries":0,"failed_records":0,"neighbourhood":68452,"country":2,"region":32,"locality":17269,"persec":2700,"county":783,"macrocounty":19,"borough":12,"localadmin":6431,"level":"info","message":"","label":"dbclient-whosonfirst","timestamp":"2019-06-06T10:13:58.511Z"}
{"paused":true,"transient":5,"current_length":0,"indexed":115500,"batch_ok":231,"batch_retries":0,"failed_records":0,"neighbourhood":68552,"country":2,"region":32,"locality":39669,"persec":2250,"county":783,"macrocounty":19,"borough":12,"localadmin":6431,"level":"info","message":"","label":"dbclient-whosonfirst","timestamp":"2019-06-06T10:14:09.114Z"}
{"level":"info","message":"Loading whosonfirst-data-latest.db database from /data/whosonfirst/sqlite","label":"whosonfirst","timestamp":"2019-06-06T10:14:11.089Z"}

/code/pelias/whosonfirst/src/components/sqliteStream.js:13
      .exec('CREATE INDEX IF NOT EXISTS spr_obsolete ON spr (is_deprecated, is_superseded)')
       ^
SqliteError: database disk image is malformed
    at new SQLiteStream (/code/pelias/whosonfirst/src/components/sqliteStream.js:13:8)
    at sqliteStream.append (/code/pelias/whosonfirst/src/readStream.js:82:12)
    at CombinedStream._realGetNext (/code/pelias/whosonfirst/node_modules/combined-stream/lib/combined_stream.js:104:3) 
    at CombinedStream._getNext (/code/pelias/whosonfirst/node_modules/combined-stream/lib/combined_stream.js:82:12)
    at SQLiteStream.emit (events.js:187:15)
    at endReadableNT (_stream_readable.js:1094:12)
    at process._tickCallback (internal/process/next_tick.js:63:19)

missinglink commented 5 years ago

I suspect that this is a download issue and not an import issue.

Can you please manually check the integrity of the sqlite database on the command-line:

sqlite3 whosonfirst-data-latest.db 'pragma quick_check'

This command takes a while to run, but it should report an error if your database is in fact malformed.
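
If you want to run the same check from a script before starting the import, here is a minimal sketch. It assumes the sqlite3 command-line tool is on the PATH. The function names quickCheck and isHealthy are made up for illustration and are not part of the importer. SQLite prints the single line "ok" when quick_check finds no problems.

```javascript
const { execFileSync } = require('child_process');

// quick_check reports a single "ok" row when it finds no problems.
function isHealthy(quickCheckOutput) {
  return quickCheckOutput.trim() === 'ok';
}

// Runs the same command as above and returns whatever sqlite3 printed.
// Assumption: the sqlite3 CLI is installed and on the PATH.
function quickCheck(dbPath) {
  try {
    return execFileSync('sqlite3', [dbPath, 'pragma quick_check'], {
      encoding: 'utf8'
    });
  } catch (err) {
    // A non-zero exit (or a missing binary) ends up here;
    // return the error text so the caller can log it.
    return String(err.stderr || err.message);
  }
}

// Usage:
// const out = quickCheck('whosonfirst-data-latest.db');
// if (!isHealthy(out)) {
//   console.error('database looks malformed:', out);
//   process.exit(1);
// }
```

Anything other than "ok" is treated as a failure here, which is deliberately strict.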

missinglink commented 5 years ago

If the file you are using is whosonfirst-data-latest.db, then you should expect a file size of around 28GB.
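
A size check is much faster than a full integrity check, so it can serve as a first filter for truncated downloads. A rough sketch follows. The helper names and the 90% threshold are assumptions chosen for illustration, and 28GB is read here as 28e9 bytes, which may not be exact.

```javascript
const fs = require('fs');

const GB = 1e9; // assumption: decimal gigabytes

// Flags a file that is clearly smaller than expected.
// minRatio is an arbitrary illustrative threshold, not a documented value.
function looksTruncated(actualBytes, expectedBytes, minRatio = 0.9) {
  return actualBytes < expectedBytes * minRatio;
}

// Reads the size of the file on disk and applies the check above.
function checkDownload(dbPath, expectedBytes) {
  const { size } = fs.statSync(dbPath);
  return { size, truncated: looksTruncated(size, expectedBytes) };
}

// Usage:
// const result = checkDownload('whosonfirst-data-latest.db', 28 * GB);
// if (result.truncated) {
//   console.error(`only ${(result.size / GB).toFixed(1)}GB on disk, download again`);
// }
```

A file that passes this check can still be corrupt, so it does not replace the quick_check.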

jdomag commented 5 years ago

I was finally able to download it after assigning 14GB of RAM to the k8s importer. It eventually consumed 12.5GB.