SmalsResearch / NominatimWrapper

This tool can be seen as a wrapper around Nominatim (the OpenStreetMap geocoder). It sends addresses to Nominatim (a local or public instance) and, for those that return no results, tries several ways of "cleaning" them so that Nominatim can accept them.
MIT License

Installation 403 #5

Open ghost opened 1 year ago

ghost commented 1 year ago

I'm trying to install an instance of this project with a full planet download, using this config:

version: "3"
services:
  nominatim:
    container_name: nominatim
    image: mediagis/nominatim:4.2
    restart: always
    ports:
        - "8080:8080"
    environment:
            # see https://github.com/mediagis/nominatim-docker/tree/master/3.7#configuration for more options
#          PBF_URL: https://download.geofabrik.de/europe/monaco-latest.osm.pbf
#          REPLICATION_URL: https://download.geofabrik.de/europe/monaco-updates/
         PBF_URL: https://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/pbf/planet-latest.osm.pbf
         REPLICATION_URL: https://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/replication/day/
         NOMINATIM_PASSWORD: very_secure_password_1234
         IMPORT_WIKIPEDIA: "true"
         IMPORT_STYLE: address
    volumes:
         - nominatim-data:/var/lib/postgresql/12/main
    shm_size: 1gb

  photon:
    build:
        context: .
        dockerfile: Docker/Dockerfile_photon
        args:
         - photon_data=Docker/photon.tar.gz
        # network: host
    ports:
      - "2322:2322"

  libpostal:
    build:
        context: .
        dockerfile: Docker/Dockerfile_libpostal
        # network: host
    ports:
            - "7000:7000"
    environment:
      - NB_LPOST_WORKERS=1

  wrapper:
    build:
        context: .
        dockerfile: Docker/Dockerfile_wrapper
        # network: host
    ports:
      - "5000:5000"
    environment:
      - NB_WORKERS=8
      - PHOTON_HOST=photon:2322
      - LPOST_HOST=libpostal:7000
      - OSM_HOST=nominatim:8080
      - LOG_LEVEL=low
      - TIMING=no
      - FASTMODE=yes
      - HTTPS=yes
volumes:
    nominatim-data:

After running ./full_build.sh, which succeeds, I ran docker-compose -f docker-compose-full.yml up and I am getting these 403s:

100   153  100   153    0     0   1425      0 --:--:-- --:--:-- --:--:--  1429
nominatim    | curl: (22) The requested URL returned error: 403
nominatim    | + tailpid=0
nominatim    | + replicationpid=0
nominatim    | + trap stopServices SIGTERM TERM INT
nominatim    | + /app/config.sh
nominatim    | + id nominatim
nominatim    | user nominatim already exists
nominatim    | + echo 'user nominatim already exists'
nominatim    | + IMPORT_FINISHED=/var/lib/postgresql/14/main/import-finished
nominatim    | + '[' '!' -f /var/lib/postgresql/14/main/import-finished ']'
nominatim    | + /app/init.sh
nominatim    | + OSMFILE=/nominatim/data.osm.pbf
nominatim    | + CURL=("curl" "-L" "-A" "${USER_AGENT}" "--fail-with-body")
nominatim    | + '[' true = true ']'
nominatim    | Downloading Wikipedia importance dump
nominatim    | + echo 'Downloading Wikipedia importance dump'
nominatim    | + curl -L -A mediagis/nominatim-docker:4.2.3 --fail-with-body https://nominatim.org/data/wikimedia-importance.sql.gz -o /nominatim/wikimedia-importance.sql.gz
nominatim    |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
nominatim    |                                  Dload  Upload   Total   Spent    Left  Speed
100   153  100   153    0     0   1164      0 --:--:-- --:--:-- --:--:--  1167
nominatim    | curl: (22) The requested URL returned error: 403
nominatim    | + tailpid=0
nominatim    | + replicationpid=0
nominatim    | + trap stopServices SIGTERM TERM INT
nominatim    | + /app/config.sh
nominatim    | + id nominatim
nominatim    | + echo 'user nominatim already exists'
nominatim    | + IMPORT_FINISHED=/var/lib/postgresql/14/main/import-finished
nominatim    | + '[' '!' -f /var/lib/postgresql/14/main/import-finished ']'
nominatim    | + /app/init.sh
nominatim    | user nominatim already exists
nominatim    | + OSMFILE=/nominatim/data.osm.pbf
nominatim    | + CURL=("curl" "-L" "-A" "${USER_AGENT}" "--fail-with-body")
nominatim    | Downloading Wikipedia importance dump
nominatim    | + '[' true = true ']'
nominatim    | + echo 'Downloading Wikipedia importance dump'
nominatim    | + curl -L -A mediagis/nominatim-docker:4.2.3 --fail-with-body https://nominatim.org/data/wikimedia-importance.sql.gz -o /nominatim/wikimedia-importance.sql.gz
nominatim    |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
nominatim    |                                  Dload  Upload   Total   Spent    Left  Speed
100   153  100   153    0     0   1409      0 --:--:-- --:--:-- --:--:--  1416
nominatim    | curl: (22) The requested URL returned error: 403

Is this normal?

ghost commented 1 year ago

I fixed it by adding this to the nominatim environment in the Docker Compose file:

USER_AGENT: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36"
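
To check whether the 403 really depends on the User-Agent header, the same URL can be requested with two different values and the status codes compared. Below is a minimal Python sketch using only the standard library. The helper names `build_request` and `probe_status` are hypothetical, the URL and default agent string are copied from the log above, and the sketch sends HEAD rather than GET to avoid downloading the file, so a server that treats the two methods differently may give a different answer than curl did.

```python
import urllib.error
import urllib.request

# URL and default agent string copied from the container log above.
WIKI_URL = "https://nominatim.org/data/wikimedia-importance.sql.gz"
DEFAULT_AGENT = "mediagis/nominatim-docker:4.2.3"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a HEAD request carrying the given User-Agent (hypothetical helper)."""
    return urllib.request.Request(
        url, headers={"User-Agent": user_agent}, method="HEAD"
    )


def probe_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server answers with."""
    try:
        with urllib.request.urlopen(
            build_request(url, user_agent), timeout=timeout
        ) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers are raised as HTTPError; the code is what we want.
        return err.code


# Example (needs network access):
# for agent in (DEFAULT_AGENT, "Mozilla/5.0"):
#     print(agent, "->", probe_status(WIKI_URL, agent))
```

If the two agents yield different codes, the server is filtering on User-Agent, which would be consistent with the workaround above.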

But now, when running Docker Compose, I get a Photon error as the download starts:

photon_1     | 2023-06-12 09:21:18,142 [main] INFO  de.komoot.photon.elasticsearch.Server - started elastic search node
photon_1     | 2023-06-12 09:21:18,142 [main] INFO  de.komoot.photon.App - Make sure that the ES cluster is ready, this might take some time.
photon_1     | 2023-06-12 09:21:18,148 [main] INFO  de.komoot.photon.App - ES cluster is now ready.
photon_1     | Exception in thread "main" [photon] IndexNotFoundException[no such index]
photon_1     |  at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:187)
photon_1     |  at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:123)
photon_1     |  at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteSingleIndex(IndexNameExpressionResolver.java:244)
photon_1     |  at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.<init>(TransportSingleShardAction.java:146)
photon_1     |  at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.<init>(TransportSingleShardAction.java:123)
photon_1     |  at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:95)
photon_1     |  at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:59)
photon_1     |  at org.elasticsearch.action.support.TransportAction.doExecute(TransportAction.java:146)
photon_1     |  at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:170)
photon_1     |  at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:142)
photon_1     |  at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:84)
photon_1     |  at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83)
photon_1     |  at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72)
photon_1     |  at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:408)
photon_1     |  at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:80)
photon_1     |  at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:54)
photon_1     |  at de.komoot.photon.elasticsearch.Server.loadFromDatabase(Server.java:245)
photon_1     |  at de.komoot.photon.elasticsearch.Server.updateIndexSettings(Server.java:193)
photon_1     |  at de.komoot.photon.App.main(App.java:45)
100  375M  100  375M    0     0  7927k      0  0:00:48  0:00:48 --:--:-- 8007k
nominatim    | + '[' '' = true ']'
nominatim    | + '[' -f '' ']'
nominatim    | + echo 'Skipping optional GB postcode import'
nominatim    | + '[' '' = true ']'
nominatim    | + '[' -f '' ']'
nominatim    | + echo 'Skipping optional US postcode import'
nominatim    | + '[' '' = true ']'
nominatim    | + '[' -f '' ']'
nominatim    | + echo 'Skipping optional Tiger addresses import'
nominatim    | + '[' https://download.bbbike.org/osm/planet/planet-latest.osm.pbf '!=' '' ']'
nominatim    | + echo Downloading OSM extract from https://download.bbbike.org/osm/planet/planet-latest.osm.pbf
nominatim    | + curl -L -A 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36' --fail-with-body https://download.bbbike.org/osm/planet/planet-latest.osm.pbf -C - --create-dirs -o /nominatim/data.osm.pbf
nominatim    | Skipping optional GB postcode import
nominatim    | Skipping optional US postcode import
nominatim    | Skipping optional Tiger addresses import
nominatim    | Downloading OSM extract from https://download.bbbike.org/osm/planet/planet-latest.osm.pbf
nominatim    | ** Resuming transfer from byte position 18782384128
nominatim    |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
nominatim    |                                  Dload  Upload   Total   Spent    Left  Speed

And now it's frozen on the last line, and I'm not sure whether it's actually downloading anything.
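
One way to tell a stalled transfer from a slow one is to sample the size of the target file twice and see whether it grew. The following is a rough Python sketch, assuming the file (/nominatim/data.osm.pbf in the log above) is reachable from wherever the script runs, for example inside the container. `growth_bytes` and `to_gib` are hypothetical helpers, not part of Nominatim or the Docker image.

```python
import os
import time


def growth_bytes(path: str, interval_s: float = 5.0) -> int:
    """Return how many bytes the file at `path` grew during `interval_s` seconds."""
    before = os.path.getsize(path)
    time.sleep(interval_s)
    return os.path.getsize(path) - before


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


# Resume offset reported by curl in the log above, expressed in GiB.
RESUME_OFFSET_GIB = round(to_gib(18782384128), 2)

# Example (path is an assumption, adjust to where the file is visible):
# print(growth_bytes("/nominatim/data.osm.pbf", interval_s=10.0))
```

A result of 0 over several samples suggests the download is not progressing, while any positive value means curl is still writing even though its progress line has not refreshed.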

ghost commented 1 year ago

For testing purposes, I replaced the PBF with one covering a smaller area. I see the same output as before and the import proceeds, but now I get the following error:

nominatim    | 2023-06-12 10:04:41    Processed 4465889 ways in 19s - 235k/s
nominatim    | 2023-06-12 10:04:41    Processed 96504 relations in 4s - 24k/s
nominatim    | 2023-06-12 10:04:41  Done postprocessing on table 'planet_osm_nodes' in 0s
nominatim    | 2023-06-12 10:04:41  Done postprocessing on table 'planet_osm_ways' in 0s
nominatim    | 2023-06-12 10:04:41  Done postprocessing on table 'planet_osm_rels' in 0s
nominatim    | 2023-06-12 10:04:41  osm2pgsql took 56s overall.
nominatim    | 2023-06-12 10:04:41: Importing wikipedia importance data
nominatim    | 2023-06-12 10:06:09: Importing secondary importance raster data
nominatim    | 2023-06-12 10:06:09: Secondary importance file not imported. Falling back to default ranking.
nominatim    | 2023-06-12 10:06:09: Create functions (1st pass)
nominatim    | 2023-06-12 10:06:09: Create tables
nominatim    | 2023-06-12 10:06:12: Create functions (2nd pass)
nominatim    | 2023-06-12 10:06:12: Create table triggers
nominatim    | 2023-06-12 10:06:12: Create partition tables
nominatim    | 2023-06-12 10:06:13: Create functions (3rd pass)
nominatim    | 2023-06-12 10:06:13: Initialise tables
nominatim    | 2023-06-12 10:06:13: Load data into placex table
nominatim    | Traceback (most recent call last):
nominatim    |   File "/usr/local/bin/nominatim", line 14, in <module>
nominatim    |     exit(cli.nominatim(module_dir='/usr/local/lib/nominatim/module',
nominatim    |   File "/usr/local/lib/nominatim/lib-python/nominatim/cli.py", line 264, in nominatim
nominatim    |     return parser.run(**kwargs)
nominatim    |   File "/usr/local/lib/nominatim/lib-python/nominatim/cli.py", line 126, in run
nominatim    |     return args.command.run(args)
nominatim    |   File "/usr/local/lib/nominatim/lib-python/nominatim/clicmd/setup.py", line 121, in run
nominatim    |     database_import.load_data(args.config.get_libpq_dsn(), num_threads)
nominatim    |   File "/usr/local/lib/nominatim/lib-python/nominatim/tools/database_import.py", line 193, in load_data
nominatim    |     conn = DBConnection(dsn)
nominatim    |   File "/usr/local/lib/nominatim/lib-python/nominatim/db/async_connection.py", line 74, in __init__
nominatim    |     self.connect(cursor_factory=cursor_factory)
nominatim    |   File "/usr/local/lib/nominatim/lib-python/nominatim/db/async_connection.py", line 99, in connect
nominatim    |     self.wait()
nominatim    |   File "/usr/local/lib/nominatim/lib-python/nominatim/db/async_connection.py", line 128, in wait
nominatim    |     wait_select(self.conn)
nominatim    |   File "/usr/lib/python3/dist-packages/psycopg2/extras.py", line 762, in wait_select
nominatim    |     state = conn.poll()
nominatim    | psycopg2.OperationalError: FATAL:  sorry, too many clients already

vberten commented 1 year ago

Hello Romeo,

ghost commented 1 year ago

I had already installed Nominatim manually and successfully, following https://nominatim.org/release-docs/latest/admin/Installation/, but wanted to try out this project for its Libpostal integration. I know what to expect in terms of the size and the time this installation can take, but I do need a worldwide setup.

Would I be the first one to attempt a full planet on this project?

vberten commented 1 year ago

Hi Romeo,

Sorry for my late reply. I haven't received feedback from anyone who has installed a full planet setup, but NominatimWrapper is quite independent of Nominatim: it only needs to be able to connect to a Nominatim server. For convenience, I provide a "full_build.sh" script which builds both the Nominatim container and the 3 "internal" components of NominatimWrapper (photon, libpostal and the wrapper itself), but if you already have a running Nominatim server, you'd better follow https://github.com/SmalsResearch/NominatimWrapper#appart-nominatim.
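
Before pointing the wrapper at an existing server, it can help to confirm that the server answers a plain search request. The following is a minimal Python sketch, assuming the server exposes Nominatim's standard /search endpoint with the `q`, `format` and `limit` parameters. The helpers `search_url` and `first_hit` are hypothetical, and the host value is an assumption based on the 8080 port mapping in the config above.

```python
import json
import urllib.parse
import urllib.request


def search_url(host: str, query: str, https: bool = False) -> str:
    """Build a Nominatim /search URL for a free-form query, asking for JSON."""
    scheme = "https" if https else "http"
    params = urllib.parse.urlencode({"q": query, "format": "json", "limit": 1})
    return f"{scheme}://{host}/search?{params}"


def first_hit(host: str, query: str, timeout: float = 10.0):
    """Return the first result as a dict, or None if nothing matched."""
    with urllib.request.urlopen(search_url(host, query), timeout=timeout) as resp:
        results = json.load(resp)
    return results[0] if results else None


# Example (needs a reachable Nominatim server; host is an assumption):
# print(first_hit("localhost:8080", "Monaco"))
```

If this returns a result, the value passed as `host` is a reasonable candidate for the wrapper's OSM_HOST setting.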

By the way, if you just want to test the Libpostal implementation (which is fully independent of the coverage of the Nominatim server), you can run "docker-compose -f docker-compose.yml build libpostal"

Best,

V.