osm2pgsql-dev / osm2pgsql

OpenStreetMap data to PostgreSQL converter
https://osm2pgsql.org
GNU General Public License v2.0

Postgre: Canceling autovacuum task #1300

Closed: matzesoft closed this issue 3 years ago

matzesoft commented 3 years ago

Hey, I am trying to import some OSM data into my PostgreSQL 13 database named osm_bw. (I am definitely a beginner with PostgreSQL, so go easy on me if I make any stupid mistakes.)

Command: $ osm2pgsql -c -G -s -U matze -d osm_bw ./baden-wuerttemberg-latest.osm.pbf
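For anyone skimming: here is roughly what those flags do, per the osm2pgsql documentation (worth cross-checking against your own version's --help output, since options have shifted between releases):

    # -c : create mode, build new tables from scratch (as opposed to --append)
    # -G : generate multi-geometry features for relations
    # -s : slim mode, keep node/way data in the database instead of only in RAM
    # -U / -d : connect as user matze to database osm_bw
    osm2pgsql -c -G -s -U matze -d osm_bw ./baden-wuerttemberg-latest.osm.pbf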

Output:

osm2pgsql SVN version 0.84.0 (64bit id space)

Using projection SRS 900913 (Spherical Mercator)
Setting up table: planet_osm_point
NOTICE:  table "planet_osm_point_tmp" does not exist, skipping
Setting up table: planet_osm_line
NOTICE:  table "planet_osm_line_tmp" does not exist, skipping
Setting up table: planet_osm_polygon
NOTICE:  table "planet_osm_polygon_tmp" does not exist, skipping
Setting up table: planet_osm_roads
NOTICE:  table "planet_osm_roads_tmp" does not exist, skipping
Using built-in tag processing pipeline
Allocating memory for dense node cache
Allocating dense node cache in one big chunk
Allocating memory for sparse node cache
Sharing dense sparse
Node-cache: cache=800MB, maxblocks=102400*8192, allocation method=11
Mid: pgsql, scale=100 cache=800
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels

Reading in file: ./baden-wuerttemberg-latest.osm.pbf
Processing: Node(45680k 285.5k/s) Way(7564k 48.80k/s) Relation(96640 493.06/s)  parse time: 511s

Node stats: total(45680616), max(8038150161) in 160s
Way stats: total(7564522), max(862838767) in 155s
Relation stats: total(96642), max(11784370) in 196s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads

Going over pending ways...
    5489404 ways are pending

Using 1 helper-processes
Helper process 0 out of 1 initialised          
Process 0 finished processing 5489404 ways in 1002 sec

All child processes exited

5489404 Pending ways took 1002s at a rate of 5478.45/s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads

Going over pending relations...
    0 relations are pending

Using 1 helper-processes
Process 0 finished processing 0 relations in 0 sec

All child processes exited

Sorting data and creating indexes for planet_osm_point
node cache: stored: 45680616(100.00%), storage efficiency: 51.11% (dense blocks: 6147, sparse nodes: 41540920), hit rate: 100.00%
Sorting data and creating indexes for planet_osm_roads
Sorting data and creating indexes for planet_osm_line
Sorting data and creating indexes for planet_osm_polygon
Stopping table: planet_osm_nodes
Stopping table: planet_osm_ways
Building index on table: planet_osm_ways (fastupdate=off)
Stopping table: planet_osm_rels
Building index on table: planet_osm_rels (fastupdate=off)
Stopped table: planet_osm_nodes in 0s
Analyzing planet_osm_point finished
Analyzing planet_osm_roads finished
Analyzing planet_osm_polygon finished
Analyzing planet_osm_line finished
Copying planet_osm_roads to cluster by geometry finished
Creating geometry index on  planet_osm_roads
Stopped table: planet_osm_rels in 10s
Copying planet_osm_point to cluster by geometry finished
Creating geometry index on  planet_osm_point
Copying planet_osm_line to cluster by geometry finished
Creating geometry index on  planet_osm_line
Creating osm_id index on  planet_osm_point
Creating indexes on  planet_osm_point finished
All indexes on  planet_osm_point created  in 62s
Completed planet_osm_point
Copying planet_osm_polygon to cluster by geometry finished
Creating geometry index on  planet_osm_polygon
Creating osm_id index on  planet_osm_polygon
Creating indexes on  planet_osm_polygon finished
All indexes on  planet_osm_polygon created  in 247s
Completed planet_osm_polygon
Stopped table: planet_osm_ways in 285s

But after that the process got stuck: I even let my PC run for two days, but still nothing happened. So I dug deeper into what the problem might be and started by using the .bz2 format instead of .pbf, but got the same result there. After that I tried a different map to test whether my data was the problem. I picked a quite small map, thinking it would make testing faster, and to my surprise I didn't run into any errors or problems with that map at all. So I decided to check the database and took a look into the postgresql.log file:

2020-10-23 16:50:00.070 CEST [43706] LOG:  checkpoints are occurring too frequently (23 seconds apart)
2020-10-23 16:50:00.070 CEST [43706] HINT:  Consider increasing the configuration parameter "max_wal_size".
2020-10-23 16:50:22.859 CEST [43706] LOG:  checkpoints are occurring too frequently (22 seconds apart)
2020-10-23 16:50:22.859 CEST [43706] HINT:  Consider increasing the configuration parameter "max_wal_size".
2020-10-23 17:14:11.169 CEST [43706] LOG:  checkpoints are occurring too frequently (13 seconds apart)
2020-10-23 17:14:11.169 CEST [43706] HINT:  Consider increasing the configuration parameter "max_wal_size".
2020-10-23 17:14:27.053 CEST [43706] LOG:  checkpoints are occurring too frequently (16 seconds apart)
2020-10-23 17:14:27.053 CEST [43706] HINT:  Consider increasing the configuration parameter "max_wal_size".
2020-10-23 17:14:30.498 CEST [44698] ERROR:  canceling autovacuum task
2020-10-23 17:14:30.498 CEST [44698] CONTEXT:  while scanning block 105813 of relation "public.planet_osm_polygon"
        automatic vacuum of table "osm_bw.public.planet_osm_polygon"
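Worth knowing when reading this log: "canceling autovacuum task" is PostgreSQL's normal behavior when a foreground session needs a lock that an autovacuum worker is holding; the server cancels the background vacuum rather than block the client, so the message is usually a symptom of heavy write activity, not the cause of a hang. To see whether an import is genuinely stuck or just waiting on something, you can ask PostgreSQL what every backend in the database is doing from a second terminal (standard pg_stat_activity columns on PostgreSQL 13):

    psql -d osm_bw -c "SELECT pid, state, wait_event_type, wait_event, left(query, 60) AS query FROM pg_stat_activity WHERE datname = 'osm_bw';"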

I went into the postgresql.conf file and tried some different values for max_wal_size. But even raising it from the default of 1GB to 10GB didn't make a difference. So I went completely crazy and set the value to 10000GB and... TADA! No more hints about checkpoints occurring too frequently, and I thought I had made it - well, no. The error "canceling autovacuum task" still happened after "Completed planet_osm_polygon" was printed.
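(Side note for anyone repeating this: instead of hand-editing postgresql.conf, the same change can be made from psql with ALTER SYSTEM; max_wal_size only needs a configuration reload, not a restart:)

    # run as a PostgreSQL superuser
    psql -d osm_bw -c "ALTER SYSTEM SET max_wal_size = '10GB';"
    psql -d osm_bw -c "SELECT pg_reload_conf();"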

And this is my final problem. I searched for solutions but didn't find much I could do. Any tips or help are appreciated. Thanks in advance!

joto commented 3 years ago

First: The osm2pgsql version you are using is waaay too old. Please use the current release, 1.3.0.
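A quick way to confirm which binary is actually being run (recent releases support --version; the ancient builds only print their version in the startup banner):

    which osm2pgsql
    osm2pgsql --version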

Second: You need to tune your database. The section in the manual isn't finished yet, but this should give you some ideas.
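To make "tune your database" concrete, here is a sketch of the kind of settings usually adjusted for a large one-off import. The values are illustrative only, not taken from the manual; sensible numbers depend on available RAM and disk:

    # run as a PostgreSQL superuser; shared_buffers needs a server restart to take effect
    psql -d osm_bw <<'SQL'
    ALTER SYSTEM SET shared_buffers = '1GB';
    ALTER SYSTEM SET work_mem = '50MB';
    ALTER SYSTEM SET maintenance_work_mem = '1GB';   -- speeds up the index builds at the end
    ALTER SYSTEM SET checkpoint_timeout = '30min';   -- fewer "checkpoints too frequent" warnings
    ALTER SYSTEM SET max_wal_size = '10GB';
    SQL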

matzesoft commented 3 years ago

Thanks! Now I was able to import my data correctly. I still got some hints about checkpoints occurring too frequently, but the process finished anyway.

I had followed a tutorial by TileMill which included a download link for installing osm2pgsql, so I was sure I had the newest version and didn't check back.

About the second point: I actually looked into the docs, but as you already noticed, they are kinda confusing at the moment. So good luck with the new website!