Closed rustprooflabs closed 1 year ago
I can reproduce with the following, assuming I have the --osm-date
available locally.
export POSTGRES_USER=postgres
export POSTGRES_PASSWORD=mysecretpassword
docker run --name pgosm -d --rm \
-v ~/pgosm-data:/app/output \
-v /etc/localtime:/etc/localtime:ro \
-e POSTGRES_PASSWORD=$POSTGRES_PASSWORD \
-p 5433:5432 \
-d rustprooflabs/pgosm-flex:0.7.0.rc.1 \
-c max_connections=100 # 100 is default. Works for initial import, fails for subsequent import. See https://github.com/openstreetmap/osm2pgsql/discussions/1650
# Initial import works
docker exec -it \
pgosm python3 docker/pgosm_flex.py \
--ram=8 \
--region=north-america/us \
--subregion=district-of-columbia \
--pgosm-date 2022-12-30 \
--replication
# Subsequent refresh fails
docker exec -it \
pgosm python3 docker/pgosm_flex.py \
--ram=8 \
--region=north-america/us \
--subregion=district-of-columbia \
--replication
The second docker exec
command fails with
2023-01-18 19:44:27,991:INFO:pgosm-flex:helpers:2023-01-18 19:44:27 Processed 3472 nodes in 0s - 3k/s
2023-01-18 19:44:27,992:INFO:pgosm-flex:helpers:2023-01-18 19:44:27 Processed 1096 ways in 1s - 1k/s
2023-01-18 19:44:27,992:INFO:pgosm-flex:helpers:2023-01-18 19:44:27 Processed 176 relations in 0s - 176/s
2023-01-18 19:44:28,576:INFO:pgosm-flex:helpers:2023-01-18 19:44:28 ERROR: Connecting to database failed: connection to server at "localhost" (127.0.0.1), port 5432 failed: FATAL: sorry, too many clients already
2023-01-18 19:44:28,576:INFO:pgosm-flex:helpers:.
2023-01-18 19:44:28,581:INFO:pgosm-flex:helpers:Traceback (most recent call last):
2023-01-18 19:44:28,581:INFO:pgosm-flex:helpers:File "/usr/local/bin/osm2pgsql-replication", line 556, in <module>
2023-01-18 19:44:28,582:INFO:pgosm-flex:helpers:sys.exit(main())
2023-01-18 19:44:28,582:INFO:pgosm-flex:helpers:File "/usr/local/bin/osm2pgsql-replication", line 550, in main
2023-01-18 19:44:28,582:INFO:pgosm-flex:helpers:return args.handler(conn, args)
2023-01-18 19:44:28,582:INFO:pgosm-flex:helpers:File "/usr/local/bin/osm2pgsql-replication", line 410, in update
2023-01-18 19:44:28,582:INFO:pgosm-flex:helpers:subprocess.run(osm2pgsql, check=True)
2023-01-18 19:44:28,582:INFO:pgosm-flex:helpers:File "/usr/lib/python3.9/subprocess.py", line 528, in run
2023-01-18 19:44:28,582:INFO:pgosm-flex:helpers:raise CalledProcessError(retcode, process.args,
2023-01-18 19:44:28,582:INFO:pgosm-flex:helpers:subprocess.CalledProcessError: Command '['/usr/local/bin/osm2pgsql', '--append', '--slim', '--prefix', 'planet_osm', '--output=flex', '--style=./run.lua', '--slim', '-d', 'postgresql://postgres:mysecretpassword@localhost:5432/pgosm?application_name=pgosm-flex', '-d', 'postgresql://postgres:mysecretpassword@localhost:5432/pgosm?application_name=pgosm-flex', '/tmp/tmp5ggs5l4u/osm2pgsql_diff.osc.gz']' returned non-zero exit status 1.
2023-01-18 19:44:28,611:WARNING:pgosm-flex:pgosm_flex:Failure. Return code: 1
2023-01-18 19:44:28,611:INFO:pgosm-flex:pgosm_flex:Skipping pg_dump
2023-01-18 19:44:28,611:WARNING:pgosm-flex:pgosm_flex:PgOSM Flex completed with errors. Details in output
The info in osm.pgosm_flex
would indicate success :(
SELECT id, imported, osm_date, pgosm_flex_version,
osm2pgsql_mode
FROM osm.pgosm_flex
;
id|imported |osm_date |pgosm_flex_version|osm2pgsql_mode|
--+-----------------------------+----------+------------------+--------------+
1|2023-01-18 19:43:17.405 -0700|2022-12-30|0.7.0.rc.1-e4eec24|create |
2|2023-01-18 19:44:26.132 -0700|2023-01-18|0.7.0.rc.1-e4eec24|append |
It doesn't look like the run.lua
file can easily catch / report this error. It starts with...
-- Loads the `conf` var from layerset INI file
require "layerset"
require "style.pgosm-meta"
if conf['layerset']['amenity'] then
print('Including amenity')
require "style.amenity"
end
...
When the whole processing fails, the new row (from the first require) does get added. I don't think there is an easy way for style.pgosm-meta
to be aware of the Postgres connection failure, and I don't think we can guarantee that failure will always generate the row incorrectly reporting success.
However, this error should be easy enough to catch.
ERROR: Connecting to database failed: connection to server at "localhost" (127.0.0.1), port 5432 failed: FATAL: sorry, too many clients already
Then do a check on the timestamp of the latest row in osm.pgosm_flex
to ensure it's within the last 30 seconds. (That's to be paranoid in case there are cases where the new row does not get created. I don't want to update an old meta row to be a failure when a new import failed...)
We'll need a new column in osm.pgosm_flex
. Maybe success BOOLEAN
. Or, completed TIMESTAMPTZ
. The former requires an update on either success or failure (depending on approach). The latter would only need an update on success. If it's NULL
the import either a) hasn't completed or b) failed.
Not a trivial fix, not going to try to fix before 0.7.0.
What version of PgOSM Flex are you using?
0.7.0.beta.1
Docker image
0.7.0.beta.1
What did you do exactly?
I ran
docker exec ... --replication
to update a database that was configured withmax_connections=50
. Per https://github.com/openstreetmap/osm2pgsql/discussions/1650, this failed. The result was a failed update, yet theosm.pgosm_flex
table reported the import.What did you expect to happen?
The row with
id = 4
should either: a) Not exist; or b) Report the failureWhat did happen instead?
There is no indication other than the terminal output that anything went wrong.