pelias / docker

Run the Pelias geocoder in docker containers, including example projects.

Planet build import all fails #217

Open agseekda opened 4 years ago

agseekda commented 4 years ago

Hi,

We are trying to set up a full planet build. All steps succeed until we run "pelias import all", which fails with:

ERROR: Elasticsearch index pelias does not exist
You must use the pelias-schema tool (https://github.com/pelias/schema/) to create the index first
For full instructions on setting up Pelias, see http://pelias.io/install.html
/code/pelias/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39
        throw new Error(`elasticsearch index ${config.schema.indexName} does not exist`);
        ^

Error: elasticsearch index pelias does not exist
    at existsCallback (/code/pelias/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39:15)
    at respond (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:368:9)
    at /code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:396:7
    at Timeout.<anonymous> (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:429:7)
    at listOnTimeout (internal/timers.js:531:17)
    at processTimers (internal/timers.js:475:7)

The smaller portland-metro build works fine.

missinglink commented 4 years ago

What was the output when you ran pelias elastic create?

agseekda commented 4 years ago

I don't have it recorded now but it didn't show any errors and was successful.

pelias elastic wait
waiting for elasticsearch service to come up
Elasticsearch up!

seems fine.

curl http://localhost:9200/_cat/indices
green open pelias -iaupK1tRJasKckpkLxETg 12 0 0 0 3.3kb 3.3kb

Also looks OK except that it didn't write anything into it.

mubaldino commented 4 years ago

Exactly! I've been struggling with this one as well! My experience:

The Docker invocation of pelias import works for every other importer, EXCEPT wof.

I tried bash -x pelias import wof, which revealed the actual docker-compose command. I ran that directly in the terminal, and it works fine:

docker-compose run  whosonfirst ./bin/start

Thoughts: either the elasticsearch URL or network is off, or pelias.json is not properly mapped into the wof docker image. Neither theory makes much sense, though, since all other importer routines work fine and I'm using mostly defaults, with nothing really to override.

xiaofengilove commented 3 years ago

I have a similar problem: the error says a file is not found, although the folder actually does exist.

$ pelias import all
Creating pelias_whosonfirst_run ... done
/code/pelias/whosonfirst/node_modules/pelias-blacklist-stream/parser.js:11
    throw new Error( 'file not found' );
    ^

Error: file not found
    at load (/code/pelias/whosonfirst/node_modules/pelias-blacklist-stream/parser.js:11:11)
    at /code/pelias/whosonfirst/node_modules/pelias-blacklist-stream/loader.js:26:48
    at Array.map (<anonymous>)
    at loader (/code/pelias/whosonfirst/node_modules/pelias-blacklist-stream/loader.js:26:31)
    at blacklistStream (/code/pelias/whosonfirst/node_modules/pelias-blacklist-stream/index.js:22:15)
    at fullImport (/code/pelias/whosonfirst/src/importStream.js:16:11)
    at /code/pelias/whosonfirst/import.js:36:3
    at Interface.<anonymous> (/code/pelias/whosonfirst/src/bundleList.js:120:5)
    at Interface.emit (events.js:228:7)
    at Interface.close (readline.js:402:8)

darmentrout commented 3 years ago

I'm getting a similar error to @xiaofengilove.

Creating extract at /data/placeholder/wof.extract
converting /data/openstreetmap/centralohio.osm.pbf to /data/polylines/extract.0sv
/code/pelias/placeholder/node_modules/pelias-blacklist-stream/parser.js:11
    throw new Error( 'file not found' );
    ^

Error: file not found
    at load (/code/pelias/placeholder/node_modules/pelias-blacklist-stream/parser.js:11:11)
    at /code/pelias/placeholder/node_modules/pelias-blacklist-stream/loader.js:26:48
    at Array.map (<anonymous>)
    at loader (/code/pelias/placeholder/node_modules/pelias-blacklist-stream/loader.js:26:31)
    at Object.<anonymous> (/code/pelias/placeholder/prototype/wof.js:6:60)
    at Module._compile (internal/modules/cjs/loader.js:955:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:991:10)
    at Module.load (internal/modules/cjs/loader.js:811:32)
    at Function.Module._load (internal/modules/cjs/loader.js:723:14)
    at Module.require (internal/modules/cjs/loader.js:848:19)
wrote polylines extract
-rw-r--r--. 1 1001 1001 1.7M Oct 19 18:14 /data/polylines/extract.0sv

/code/pelias/whosonfirst/node_modules/pelias-blacklist-stream/parser.js:11
    throw new Error( 'file not found' );
    ^

Error: file not found
    at load (/code/pelias/whosonfirst/node_modules/pelias-blacklist-stream/parser.js:11:11)
    at /code/pelias/whosonfirst/node_modules/pelias-blacklist-stream/loader.js:26:48
    at Array.map (<anonymous>)
    at loader (/code/pelias/whosonfirst/node_modules/pelias-blacklist-stream/loader.js:26:31)
    at blacklistStream (/code/pelias/whosonfirst/node_modules/pelias-blacklist-stream/index.js:22:15)
    at fullImport (/code/pelias/whosonfirst/src/importStream.js:16:11)
    at /code/pelias/whosonfirst/import.js:36:3
    at getDBList (/code/pelias/whosonfirst/src/bundleList.js:56:3)
    at Object.getList [as generateBundleList] (/code/pelias/whosonfirst/src/bundleList.js:61:12)
    at Object.<anonymous> (/code/pelias/whosonfirst/import.js:15:9)

darmentrout commented 3 years ago

@xiaofengilove Placing an empty file in the blacklist folder named osm.txt resolved the parser errors for me.
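
A minimal Node.js sketch of that workaround, for anyone who prefers to script it. The location of the blacklist folder depends on your setup and is not given in this thread, so `blacklistDir` is an assumption you need to fill in, and `ensureEmptyFile` is a hypothetical helper name:

```javascript
const fs = require('fs');
const path = require('path');

// Create an empty file if it is missing, without truncating an existing one.
// `blacklistDir` is an assumption: pass whatever directory your setup uses
// as the blacklist folder.
function ensureEmptyFile(blacklistDir, fileName) {
  fs.mkdirSync(blacklistDir, { recursive: true });
  const filePath = path.join(blacklistDir, fileName);
  // Flag 'a' opens for appending and creates the file when it does not exist.
  fs.closeSync(fs.openSync(filePath, 'a'));
  return filePath;
}
```

Calling `ensureEmptyFile('<your blacklist folder>', 'osm.txt')` leaves an existing osm.txt untouched and only creates it when absent.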

ipaleka commented 3 years ago

I got the error from the first post when running Pelias in Docker alongside an Elasticsearch instance on the host that was already listening on port 9200. My docker-compose.yml has the following part:


  elasticsearch:
    image: pelias/elasticsearch:7.5.1
    container_name: pelias_elasticsearch
    restart: always
    ports: [ "9400:9200", "9600:9300" ]
    volumes:
      - "${DATA_DIR}/elasticsearch:/usr/share/elasticsearch/data"

However, the pelias command and docker/cmd/elastic.sh still access port 9200, so I manually changed every occurrence of port 9200 in the elastic.sh script to 9400, and the imports started working afterward.
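
To make the mapping concrete: in docker-compose short syntax, "9400:9200" means HOST:CONTAINER, so Elasticsearch still listens on 9200 inside the container while anything running on the host has to use 9400. A small illustrative sketch (the `hostPortFor` helper is hypothetical and only handles plain "HOST:CONTAINER" entries):

```javascript
// Find the host-side port published for a given container port, given
// docker-compose short-syntax entries of the form "HOST:CONTAINER".
function hostPortFor(portMappings, containerPort) {
  for (const mapping of portMappings) {
    const parts = String(mapping).split(':');
    if (parts.length !== 2) continue; // skip anything that is not "HOST:CONTAINER"
    const [host, container] = parts.map(Number);
    if (container === containerPort) return host;
  }
  return null; // the container port is not published
}

const ports = ['9400:9200', '9600:9300'];
console.log(hostPortFor(ports, 9200)); // 9400
```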

dan83g commented 2 years ago

Hi, the same error, only when importing whosonfirst data:

debug: [whosonfirst] Loading 'ocean' of whosonfirst-data-admin-latest.db database from /data/whosonfirst/sqlite
debug: [whosonfirst] Loading 'marinearea' of whosonfirst-data-admin-latest.db database from /data/whosonfirst/sqlite
debug: [whosonfirst] Loading 'continent' of whosonfirst-data-admin-latest.db database from /data/whosonfirst/sqlite
debug: [whosonfirst] Loading 'empire' of whosonfirst-data-admin-latest.db database from /data/whosonfirst/sqlite
debug: [whosonfirst] Loading 'country' of whosonfirst-data-admin-latest.db database from /data/whosonfirst/sqlite
debug: [whosonfirst] Loading 'dependency' of whosonfirst-data-admin-latest.db database from /data/whosonfirst/sqlite
debug: [whosonfirst] Loading 'disputed' of whosonfirst-data-admin-latest.db database from /data/whosonfirst/sqlite
debug: [whosonfirst] Loading 'macroregion' of whosonfirst-data-admin-latest.db database from /data/whosonfirst/sqlite
debug: [whosonfirst] Loading 'region' of whosonfirst-data-admin-latest.db database from /data/whosonfirst/sqlite
ERROR: Elasticsearch index pelias does not exist
You must use the pelias-schema tool (https://github.com/pelias/schema/) to create the index first
For full instructions on setting up Pelias, see http://pelias.io/install.html
/code/pelias/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39
        throw new Error(`elasticsearch index ${config.schema.indexName} does not exist`);
        ^

Error: elasticsearch index pelias does not exist
    at existsCallback (/code/pelias/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39:15)
    at respond (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:368:9)
    at /code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:396:7
    at Timeout.<anonymous> (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:429:7)
    at listOnTimeout (internal/timers.js:554:17)
    at processTimers (internal/timers.js:497:7)

dan83g commented 2 years ago

I just commented out the line throw new Error(`elasticsearch index ${config.schema.indexName} does not exist`); in node_modules/pelias-dbclient/src/configValidation.js, and the import process started. I think it is related to a connection timeout to Elasticsearch. I hope this is useful.

michaelkirk commented 1 year ago

I'm hitting an identical looking error with a planet build, and similarly, everything seems to work fine with the smaller portland build.

Given how long it takes before the command errors (more than 2 minutes), maybe (just a guess) the root problem is that it hits the 120s timeout, which is then surfaced in a confusing way.

output
# Sanity check that I'd created the index:

$ time pelias elastic create                      
--------------                                                                                            
 create index 
--------------

[resource_already_exists_exception] index [pelias/FuIGtngnT5SiGwgBUl5Nyw] already exists, with { index_uuid="FuIGtngnT5SiGwgBUl5Nyw" & index="pelias" } 

real    0m13.242s                                                                                         
user    0m0.102s
sys     0m0.018s

$ time pelias import all
ERROR: Elasticsearch index pelias does not exist
You must use the pelias-schema tool (https://github.com/pelias/schema/) to create the index first         
For full instructions on setting up Pelias, see http://pelias.io/install.html
/code/pelias/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39                          
        throw new Error(`elasticsearch index ${config.schema.indexName} does not exist`);                 
        ^

Error: elasticsearch index pelias does not exist                                                          
    at existsCallback (/code/pelias/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39:15)
    at respond (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:368:9)           
    at /code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:396:7
    at Timeout.<anonymous> (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:429:7)
    at listOnTimeout (internal/timers.js:554:17)                                                          
    at processTimers (internal/timers.js:497:7)                                                           

real    2m16.389s                                                                                         
user    0m0.084s                                                                                          
sys     0m0.029s                                                                                          

$ time pelias import all                          
ERROR: Elasticsearch index pelias does not exist                                                                                                                                                                
You must use the pelias-schema tool (https://github.com/pelias/schema/) to create the index first
For full instructions on setting up Pelias, see http://pelias.io/install.html
/code/pelias/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39
        throw new Error(`elasticsearch index ${config.schema.indexName} does not exist`);
        ^

Error: elasticsearch index pelias does not exist
    at existsCallback (/code/pelias/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39:15)
    at respond (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:368:9)
    at /code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:396:7
    at Timeout.<anonymous> (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:429:7)
    at listOnTimeout (internal/timers.js:554:17)
    at processTimers (internal/timers.js:497:7)

real    2m49.125s
user    0m0.071s
sys     0m0.044s

This is running from within a checkout of https://github.com/pelias/docker, [^slightly modified] to use a different planet build and disable interpolation, but all the software versions are as specified as of https://github.com/pelias/docker/commit/6c50a65a397443a319d689e609d63e9d286632b5

While trying to debug this issue a couple weeks ago, I was running the command repeatedly, and it eventually just worked. But it's generally quite reproducible for me (unfortunately 😆), so let me know if there's any debugging information I can get to you.

[slightly modified]
git diff 6c50a65a397443a319d689e609d63e9d286632b5
diff --git a/projects/planet/.env b/projects/planet/.env                                                  
index d7f792c..5953470 100644                                                                             
--- a/projects/planet/.env
+++ b/projects/planet/.env                                                                                
@@ -1,3 +1,3 @@
 COMPOSE_PROJECT_NAME=pelias
-DATA_DIR=/tmp/pelias/data                                                                                
 ENABLE_GEONAMES=true                                                                                     
+DATA_DIR=./data
diff --git a/projects/planet/pelias.json b/projects/planet/pelias.json
index b3865b4..dc09cfe 100644                                                                             
--- a/projects/planet/pelias.json                                                                         
+++ b/projects/planet/pelias.json
@@ -27,7 +27,6 @@
     "services": {                                                                                        
       "placeholder": { "url": "http://placeholder:4100" },                                               
       "pip": { "url": "http://pip:4200" },                                                               
-      "interpolation": { "url": "http://interpolation:4300" },                                           
       "libpostal": { "url": "http://libpostal:4400" }                                                    
     }
   },
@@ -41,12 +40,12 @@                                                                                       
     },                                                                                                   
     "openstreetmap": {                                                                                   
       "download": [
-        { "sourceURL": "https://planet.openstreetmap.org/pbf/planet-latest.osm.pbf" }                    
+        { "sourceURL": "https://daylight-map-distribution.s3.us-west-1.amazonaws.com/release/v1.19/planet-v1.19.osm.pbf" }                                                                       
       ],
       "leveldbpath": "/tmp",
       "datapath": "/data/openstreetmap",                                                                 
       "import": [{                                                                                       
-        "filename": "planet-latest.osm.pbf"                                                              
+        "filename": "planet-v1.19.osm.pbf"                                                               
       }]
     },
     "openaddresses": {

michaelkirk commented 1 year ago

A corroborating data point: after having the planet-sized import fail a couple dozen times with the default 2-minute timeout, I specified a timeout of 10 minutes and was able to complete the import on the first try.

pelias config:

{
  "esclient": {
    "requestTimeout": "600000",
    ...
  },
  ...
}
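
For anyone scripting this change, here is a minimal Node.js sketch that sets the same key on a pelias.json-style object without touching the rest of the config. The `withRequestTimeout` helper and the `someExistingKey` placeholder are hypothetical; only `esclient.requestTimeout` and the value "600000" come from the config above:

```javascript
// Return a copy of a pelias.json-style config object with
// esclient.requestTimeout set, leaving all other keys untouched.
function withRequestTimeout(config, timeoutMs) {
  return {
    ...config,
    esclient: { ...(config.esclient || {}), requestTimeout: timeoutMs }
  };
}

// `someExistingKey` is a placeholder for whatever your esclient section already contains.
const original = { esclient: { someExistingKey: 'kept' } };
const patched = withRequestTimeout(original, '600000');
console.log(JSON.stringify(patched, null, 2));
```

In a real setup you would read pelias.json with `JSON.parse(fs.readFileSync(...))`, apply the helper, and write the result back.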

ianthetechie commented 1 year ago

Really strange. I am able to confirm the issue as well on a more-or-less fresh clone (removed interpolation but that's about it). As noted by others above, curl instantly shows that the index is indeed there.

For some reason, it just takes a LONG time to do something at the start of the wof import. I can confirm that bumping the timeout to 10mins as @michaelkirk does above appears to resolve the issue. We aren't running on quite as beefy a server as the core team seems to be, but we have 12 cores and 64GB RAM, and it's essentially idle except for one node thread until getting over the initial hump. I'm commenting here since this is the more active thread, but I suspect the issue is with the wof importer rather than anything specific to the docker repo.

marq24 commented 1 year ago

Even with the latest pelias/docker version (as of July 2023), I still run into this issue: the wof import for a planet setup fails with 'ERROR: Elasticsearch index pelias does not exist'.

After I added the 10-minute timeout to my pelias.json (thanks @michaelkirk!), the importer started after a short while. An error is still reported, but after that output the wof import appears to run fine:

USER@SERVER ../osm/bin/pelias-docker/projects/planet (git)-[master] % pelias import all
Creating pelias2_whosonfirst_run ... done
Elasticsearch ERROR: 2023-07-23T07:43:19Z
  Error: Request error, retrying
  HEAD http://elasticsearch:9200/pelias => socket hang up
      at Log.error (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/log.js:239:56)
      at checkRespForFailure (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:298:18)
      at HttpConnector.<anonymous> (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/connectors/http.js:171:7)
      at ClientRequest.wrapper (/code/pelias/whosonfirst/node_modules/lodash/lodash.js:4991:19)
      at ClientRequest.emit (node:events:513:28)
      at Socket.socketCloseListener (node:_http_client:467:11)
      at Socket.emit (node:events:525:35)
      at TCP.<anonymous> (node:net:301:12)

xiaofengilove commented 1 year ago

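This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends.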

gmarti commented 1 year ago

I think the issue is here: https://github.com/pelias/dbclient/blob/master/src/configValidation.js#L34. If there is an error, it logs that the index doesn't exist, and the actual error is silenced, so nothing is printed about it.
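
A self-contained sketch of that pattern, not the actual pelias-dbclient code: the client below is a hypothetical stand-in whose exists() call always fails, and both check functions are illustrative. It shows how ignoring the error argument makes a failed request look like a missing index, and one possible way to surface the real cause:

```javascript
// Hypothetical stand-in for an Elasticsearch client whose exists() request fails.
// The callback is invoked synchronously to keep the sketch short.
function makeFailingClient(message) {
  return {
    indices: {
      exists(params, callback) {
        callback(new Error(message), undefined);
      }
    }
  };
}

// The pattern described above: `err` is never inspected, so a failed request
// looks the same as a missing index.
function checkIndexIgnoringError(client, indexName, done) {
  client.indices.exists({ index: indexName }, (err, exists) => {
    if (!exists) {
      return done(new Error(`elasticsearch index ${indexName} does not exist`));
    }
    done(null);
  });
}

// One possible alternative: report the request error before concluding anything.
function checkIndexReportingError(client, indexName, done) {
  client.indices.exists({ index: indexName }, (err, exists) => {
    if (err) {
      return done(new Error(`could not check index ${indexName}: ${err.message}`));
    }
    if (!exists) {
      return done(new Error(`elasticsearch index ${indexName} does not exist`));
    }
    done(null);
  });
}

const client = makeFailingClient('Request Timeout after 120000ms');
checkIndexIgnoringError(client, 'pelias', (e) => console.log(e.message));
checkIndexReportingError(client, 'pelias', (e) => console.log(e.message));
```

With the first function, a timeout is reported as "does not exist"; with the second, the timeout message is visible.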

xiaofengilove commented 1 year ago

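This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends.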

michaelkirk commented 3 months ago

I think the issue is that here https://github.com/pelias/dbclient/blob/master/src/configValidation.js#L34

Thanks @gmarti - I opened up a PR for better logging at https://github.com/pelias/dbclient/pull/129

It also confirmed that indeed I was hitting a timeout.

I'm not sure why the request takes so long.

I am continuing to work around it by setting a luxurious timeout for planet imports: "requestTimeout": "600000"