pelias / docker

Run the Pelias geocoder in docker containers, including example projects.
MIT License

Empty Result Problem #358

Closed akifaktas closed 1 month ago

akifaktas commented 1 month ago

Hi everyone. I have just joined the group. Pelias seemed useful to me, so I tried to set it up for the entire planet. I installed it on a virtual machine with 38 CPUs (118 MHz), 150 GB RAM, and 900 GB of disk space. The installation took about 3 days, after which I was seeing the following messages in the terminal:

Thu Sep 12 06:27:49 AM UTC 2024 /data/tiger/downloads/tl_2021_72067_addrfeat.zip
Thu Sep 12 06:28:00 AM UTC 2024 /data/tiger/downloads/tl_2021_51147_addrfeat.zip
Thu Sep 12 06:28:13 AM UTC 2024 /data/tiger/downloads/tl_2021_39035_addrfeat.zip

interpolating vertices
real 0.28
user 0.25
sys 0.04
archiving address database
generating meta file
Build completed!

pelias import all
WARN[0000] /.../docker/projects/planet/docker-compose.yml: the attribute version is obsolete, it will be ignored, please remove it to avoid potential confusion
ERROR: Elasticsearch index pelias does not exist
You must use the pelias-schema tool (https://github.com/pelias/schema/) to create the index first
For full instructions on setting up Pelias, see http://pelias.io/install.html
/code/pelias/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39
throw new Error(`elasticsearch index ${config.schema.indexName} does not exist`);
^
Error: elasticsearch index pelias does not exist
at existsCallback (/code/pelias/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39:15)
at respond (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:368:9)
at /code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:396:7
at Timeout.<anonymous> (/code/pelias/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:429:7)
at listOnTimeout (node:internal/timers:559:17)
at processTimers (node:internal/timers:502:7)

pelias compose up
WARN[0000] /.../docker/projects/planet/docker-compose.yml: the attribute version is obsolete, it will be ignored, please remove it to avoid potential confusion
[+] Running 14/15
✔ Container pelias_whosonfirst Started 2.3s
✔ Container pelias_openaddresses Started 2.0s
⠏ Container pelias_fuzzy_tester Starting 2.9s
✔ Container pelias_elasticsearch Running 0.0s
✔ Container pelias_openstreetmap Started 1.1s
✔ Container pelias_interpolation Started 2.7s
✔ Container pelias_transit Started 2.6s
✔ Container pelias_schema Started 2.4s
✔ Container pelias_api Started 2.6s
✔ Container pelias_csv_importer Started 1.6s
✔ Container pelias_geonames Started 2.0s
✔ Container pelias_placeholder Started 1.5s
✔ Container pelias_polylines Started 2.6s
✔ Container pelias_libpostal Started 2.7s
✔ Container pelias_pip-service Started 2.9s
Error response from daemon: error while creating mount source path '/.../docker/projects/planet/test_cases': mkdir /.../docker/projects/planet/test_cases: file exists
pelias test run
WARN[0000] /.../docker/projects/planet/docker-compose.yml: the attribute version is obsolete, it will be ignored, please remove it to avoid potential confusion
Error response from daemon: error while creating mount source path '/.../docker/projects/planet/test_cases': mkdir /.../docker/projects/planet/test_cases: file exists
By the way, disk usage is as follows:
df -h --block-size=G
Filesystem 1G-blocks Used Available Use% Mounted on
udev 74G 0G 74G 0% /dev
tmpfs 15G 1G 15G 1% /run
/dev/sda2 885G 338G 503G 41% /
tmpfs 74G 0G 74G 0% /dev/shm
tmpfs 1G 0G 1G 0% /run/lock
tmpfs 74G 0G 74G 0% /sys/fs/cgroup
/dev/loop0 1G 1G 0G 100% /snap/core20/1828
/dev/loop1 1G 1G 0G 100% /snap/lxd/24061
/dev/loop2 1G 1G 0G 100% /snap/snapd/18357
tmpfs 15G 0G 15G 0% /run/user/1000
/dev/loop3 1G 1G 0G 100% /snap/snapd/21759
/dev/loop4 1G 1G 0G 100% /snap/core20/2318
/dev/loop5 1G 1G 0G 100% /snap/lxd/29619

It looks like about 338 GB of data was written during the installation.

Afterwards, I made the following API call: :4000/v1/autocomplete?text=Singapore

Unfortunately, the result I received was as follows:

{"geocoding":{"version":"0.2","attribution":"http://myurl:4000/attribution","query":{"text":"Singapore","parser":"pelias","parsed_text":{"subject":"Singapore","locality":"Singapore"},"size":10,"layers":["venue","street","country","macroregion","region","county","localadmin","locality","borough","neighbourhood","continent","empire","dependency","macrocounty","macrohood","microhood","disputed","postalcode","ocean","marinearea"],"private":false,"lang":{"name":"Turkish","iso6391":"tr","iso6393":"tur","via":"header","defaulted":false},"querySize":20},"warnings":["performance optimization: excluding 'address' layer"],"engine":{"name":"Pelias","author":"Mapzen","version":"1.0"},"timestamp":1726155507342},"type":"FeatureCollection","features":[]}

So, the service is running, but it seems to return empty results. What could be causing this issue, and how can I resolve it? Thank you in advance for your support.

missinglink commented 1 month ago

The ERROR: Elasticsearch index pelias does not exist error message is concerning. Did you run pelias elastic create?

You must use the pelias-schema tool (https://github.com/pelias/schema/) to create the index first

akifaktas commented 1 month ago

The contents of my .sh file are as follows. So yes, the Elasticsearch create command should have run.

set -x

# change directory to the where you would like to install Pelias
# cd /path/to/install

# clone this repository
git clone https://github.com/pelias/docker.git && cd docker

# install pelias script
# this is the _only_ setup command that should require `sudo`
sudo ln -s "$(pwd)/pelias" /usr/local/bin/pelias

# cd into the project directory
cd projects/planet

# create a directory to store Pelias data files
# see: https://github.com/pelias/docker#variable-data_dir
# note: use 'gsed' instead of 'sed' on a Mac
mkdir ./data
sed -i '/DATA_DIR/d' .env
echo 'DATA_DIR=./data' >> .env

# run build
pelias compose pull
pelias elastic start
pelias elastic wait
pelias elastic create
pelias download all
pelias prepare all
pelias import all
pelias compose up

# optionally run tests
pelias test run

missinglink commented 1 month ago

For bash scripts I'd recommend `set -euxo pipefail`, which will exit on failure; just setting `-x` will not terminate the script if a command exits with a non-zero status code.
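A minimal demonstration of the difference, using `false` as a stand-in for any failing step:

```shell
#!/usr/bin/env bash
# Compare plain `set -x` with strict mode: under -x a failing command is
# traced but the script keeps going; under -euxo pipefail it stops there.
plain=$(bash -c 'set -x; false; echo reached' 2>/dev/null)
strict=$(bash -c 'set -euxo pipefail; false; echo reached' 2>/dev/null || true)

echo "with set -x           : ${plain:-script aborted}"
echo "with set -euxo pipefail: ${strict:-script aborted}"
```

With plain `-x` the script reaches the final echo despite the failure; with strict mode it aborts at `false`, so a failed `pelias` step can't be silently skipped.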

That said, it very well might have succeeded, do you have the logs produced with -x?

Have a look inside your data dir ./data to see which directories are contained and their relative sizes. I would expect the elasticsearch directory to be quite large with many objects.
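Something like `du -sh ./data/* | sort -hr` shows the relative sizes at a glance. A self-contained sketch against a mock data dir (the elasticsearch/ and openstreetmap/ directory names mirror what Pelias creates; the file sizes are made up):

```shell
#!/usr/bin/env bash
# List the subdirectories of a (mock) Pelias data dir, largest first.
# In a healthy planet build the elasticsearch directory should dominate.
set -euo pipefail
mock=$(mktemp -d)
mkdir -p "$mock/elasticsearch" "$mock/openstreetmap"
head -c 1048576 /dev/zero > "$mock/elasticsearch/blob"   # ~1 MiB stand-in
head -c 4096    /dev/zero > "$mock/openstreetmap/blob"   # ~4 KiB stand-in
report=$(du -s "$mock"/* | sort -nr)   # biggest directory first
echo "$report"
rm -rf "$mock"
```

Run the same `du -s ./data/* | sort -nr` against the real data dir; if elasticsearch/ is tiny or missing, the import never wrote documents.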

missinglink commented 1 month ago

If you failed to import the documents to Elasticsearch then all is not lost; it sounds like you've already done all the lengthy prepare steps, so you won't need to do them again.

Try running these commands to check the status of the Elasticsearch index:

pelias elastic start
pelias elastic wait
pelias elastic status
pelias elastic info
pelias elastic stats

akifaktas commented 1 month ago

After you mentioned it, I checked and found the situation below. How can I overcome this?

pelias elastic stats

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [source] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "pelias",
        "node" : "bT2YMU7CT2y2zc2qt9e32A",
        "reason" : {
          "type" : "illegal_argument_exception",
          "reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [source] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
        }
      }
    ],
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [source] in order to load field data by uninverting the inverted index. Note that this can use significant memory.",
      "caused_by" : {
        "type" : "illegal_argument_exception",
        "reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [source] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
      }
    }
  },
  "status" : 400
}
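This fielddata error likely means the pelias index exists but was auto-created by Elasticsearch's dynamic mapping (e.g. an importer wrote to it before the schema was applied) rather than by pelias/schema. A hedged recovery sketch, assuming the standard pelias CLI subcommands (`drop`, `create`, `import`); the DRY_RUN guard here is only so the sequence can be reviewed before anything destructive runs:

```shell
#!/usr/bin/env bash
# Sketch: drop the wrongly-mapped index, recreate it from pelias/schema,
# and re-run only the import phase (downloads and prepare output are reused).
# DRY_RUN=1 (the default here) just prints each command instead of running it.
set -euo pipefail
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

run pelias elastic drop     # delete the index with the wrong mappings
run pelias elastic create   # recreate it using the Pelias schema
run pelias import all       # re-import; prepared data is not rebuilt
```

Set DRY_RUN=0 to actually execute; note `pelias elastic drop` deletes all indexed documents, so only the import phase (not the 3-day prepare) has to repeat.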