Open aferrar opened 7 months ago
Can you provide steps to reproduce including how you generated or where you got the data you imported?
Skip the AI generated output please.
Certainly, thank you for the reply and sorry for the AI stuff. I think this would summarize it nicely, please let me know if I can provide any additional information and thank you again for the support and this awesome project!
Please see below:
docker run -d --name='Nominatim' --net='bridge' -e TZ="America/Time_Zone" -e HOST_OS="Unraid" -e HOST_HOSTNAME="Tower" -e HOST_CONTAINERNAME="Nominatim" -e 'PBF_PATH'='/gis/osm/planet/planet-latest.osm.pbf' <-- Taken from here -e 'NOMINATIM_TOKENIZER'='icu' -e 'NOMINATIM_DATABASE_DSN'='pgsql:dbname=nominatim;host=192.168.1.2;user=postgres;password=password;port=25432' -e 'PGHOST'='192.168.1.2' -e 'PGDATABASE'='nominatim' -e 'PGUSER'='postgres' -e 'PGPASSWORD'='password' -e 'PGPORT'='25432' -e 'REPLICATION_URL'='https://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/replication/day/' -e 'IMPORT_US_POSTCODES'='true' -e 'IMPORT_WIKIPEDIA'='true' -e 'IMPORT_TIGER_ADDRESSES'='true' -l net.unraid.docker.managed=dockerman -l net.unraid.docker.webui='http://[IP]:[PORT:1865]' -l net.unraid.docker.icon='https://avatars.githubusercontent.com/u/19649492?s=48&v=4' -p '1865:8080/tcp' -v '/mnt/user/gis/':'/gis':'rw' -v '/mnt/user/gis/nominatim/flatnode':'/nominatim/flatnode':'rw' --shm-size=64G 'mediagis/nominatim:4.3'
This error shouldn't happen and it's the first time I've seen a projection issue anywhere in Nominatim. What this import step does is read rows from the place
table and inserts them into the placex
table. In the database there's a trigger attached to the placex
table that runs some logic to reformat the place, lookup it's hierarchy. If you noticed that your CPU(s) go to 100% that's the logic that causes this. You can find it in https://github.com/osm-search/Nominatim/blob/master/lib-sql/functions/placex_triggers.sql
But both place
and placex
table only have geometry columns with SRID 4326 set. You can verify that by looking at the table schemas
\d place
Table "public.place"
Column | Type | Collation | Nullable | Default
-------------+-------------------------+-----------+----------+---------
osm_type | character(1) | | not null |
osm_id | bigint | | not null |
[...]
geometry | geometry(Geometry,4326) | | not null |
\d placex
Table "public.placex"
Column | Type | Collation | Nullable | Default
-----------------+-----------------------------+-----------+----------+---------
place_id | bigint | | not null |
parent_place_id | bigint | | |
linked_place_id | bigint | | |
[...]
geometry | geometry(Geometry,4326) | | not null |
[...]
centroid | geometry(Geometry,4326) | | |
I think we can rule out that you manually changed the table schema. Does a 5350 SRID even exist? I can't find any information and SELECT * FROM spatial_ref_sys WHERE srid = 5350;
returns empty for me.
What do the following queries return for you?
select count(*) from place; -- expect over 300 million
select count(*) from placex; -- unclear, depends where the import stopped
Possible the input data is corrupted. As first step check if the file size matches that on https://planet.openstreetmap.org/pbf/ As second step you can also compare the md5 checksum against the .pbf.md5
file. For example:
md5sum planet-240219.osm.pbf
fa4fce44b6de5b9d9fad9ad6e0bbf354
curl -L https://planet.openstreetmap.org/pbf/planet-240219.osm.pbf.md5
fa4fce44b6de5b9d9fad9ad6e0bbf354 planet-240219.osm.pbf
It's not impossible, just unlikely, that the database is corrupted somehow. I mean the data on disk.
Thank you for this awesome response. I've checked and 5350 isn't a valid SRID. I ran your count on the place table and returned 0 records. I assume placex will also yield zero. I'm currently running md5sum on my local pbf file and will compare it to the external one I curled from.
It could be the data got corrupted while curling? At this point, after confirming the checksum (and if it's valid), should I just blow away the nominatim db in my postgres instance and just try re-running the server? It took 36 hours to get to this error so it's quite a pain troubleshooting if it'll take the same amount of time on each troubleshooting step.
Thank you again, really appreciate your help!
Edit: Looks like a checksum mismatch so I assume the data got corrupted when I curled. I'll have to figured out what caused that. Is there a recommended curl command I should execute for the pbf download?
Indeed looks like the file got corrupted so best to start complete fresh. You said Monaco (I'm guessing 0.1% of the world) imported fine but I would try another country or US state (1-5% of the world) first before trying the world. While curl
or the internet connection wouldn't been at fault it's also possible it's the harddrive.
Describe the bug
While attempting to load world osm data into the
placex
table of a Nominatim docker installation using an external postgresql database, the process fails due to a mismatch between the SRID of the geometry being loaded and the SRID expected by the column in the table. The error reported isGeometry SRID (5350) does not match column SRID (4326)
, indicating that the geometries being imported are in a different spatial reference system than what theplacex
table expects.Traceback (most recent call last): File "/usr/local/bin/nominatim", line 12, in
exit(cli.nominatim(module_dir='/usr/local/lib/nominatim/module',
File "/usr/local/lib/nominatim/lib-python/nominatim/cli.py", line 225, in nominatim
return get_set_parser().run(**kwargs)
File "/usr/local/lib/nominatim/lib-python/nominatim/cli.py", line 121, in run
return args.command.run(args)
File "/usr/local/lib/nominatim/lib-python/nominatim/clicmd/setup.py", line 121, in run
database_import.load_data(args.config.get_libpq_dsn(), num_threads)
File "/usr/local/lib/nominatim/lib-python/nominatim/tools/database_import.py", line 222, in load_data
conn.wait()
File "/usr/local/lib/nominatim/lib-python/nominatim/db/async_connection.py", line 128, in wait
wait_select(self.conn)
File "/usr/lib/python3/dist-packages/psycopg2/extras.py", line 762, in wait_select
state = conn.poll()
psycopg2.errors.InvalidParameterValue: Geometry SRID (5350) does not match column SRID (4326)
CONTEXT: PL/pgSQL function placex_insert() line 15 at assignment
To Reproduce
docker run -d --name='Nominatim' --net='bridge' -e TZ="America/Time_Zone" -e HOST_OS="Unraid" -e HOST_HOSTNAME="Tower" -e HOST_CONTAINERNAME="Nominatim" -e 'PBF_PATH'='/gis/osm/planet/planet-latest.osm.pbf' <-- Taken from here -e 'NOMINATIM_TOKENIZER'='icu' -e 'NOMINATIM_DATABASE_DSN'='pgsql:dbname=nominatim;host=192.168.1.2;user=postgres;password=password;port=25432' -e 'PGHOST'='192.168.1.2' -e 'PGDATABASE'='nominatim' -e 'PGUSER'='postgres' -e 'PGPASSWORD'='password' -e 'PGPORT'='25432' -e 'REPLICATION_URL'='https://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/replication/day/' -e 'IMPORT_US_POSTCODES'='true' -e 'IMPORT_WIKIPEDIA'='true' -e 'IMPORT_TIGER_ADDRESSES'='true' -l net.unraid.docker.managed=dockerman -l net.unraid.docker.webui='http://[IP]:[PORT:1865]' -l net.unraid.docker.icon='https://avatars.githubusercontent.com/u/19649492?s=48&v=4' -p '1865:8080/tcp' -v '/mnt/user/gis/':'/gis':'rw' -v '/mnt/user/gis/nominatim/flatnode':'/nominatim/flatnode':'rw' --shm-size=64G 'mediagis/nominatim:4.3'
Expected behavior
The expected behavior was for the data to be loaded successfully into the
placex
table without encountering a SRID mismatch error. Geometries should align with the expected SRID (4326) of theplacex
table's geometry column.Screenshots & Logs
Not applicable as no screenshots or additional logs were provided. The error message detailed in the bug report serves as the primary diagnostic output.
Desktop / Server (please complete the following information):
Additional context
The error indicates a likely issue with the data being imported or a possible misconfiguration of the database schema. Further investigation into the source data's SRID and the database schema setup for the
spatial_reference
table show no existing 5350 SRID.Update
Thanks to mtmail, we were able to determine the issue was a corrupted pbf file. Thank you, mtmail!