osm2pgsql-dev / osm2pgsql

OpenStreetMap data to PostgreSQL converter
https://osm2pgsql.org
GNU General Public License v2.0
1.5k stars 474 forks source link

Index only mode #799

Open rshadow opened 6 years ago

rshadow commented 6 years ago

Please add -- index-only mode to existing --append and --create modes.

I try load osm planet. The data has been uploaded. But at the stage of indexation, there was a сrash. Probably out of memory.

It would be desirable not to load again all the data (a few days), to complete the stage of indexing.

Log:

sudo -u tirex osm2pgsql --create                                                        \
        --keep-coastlines                                                                                       \
        --output pgsql --database gis                         \
        --hstore-all --hstore-add-index                                 \
        --slim --cache 32110 --number-processes 7 \
        --cache-strategy optimized                                  \
        --flat-nodes /var/lib/flat_nodes/nodes.cache                                  \
        --style "/srv/osm/config/carto/openstreetmap-carto.style"                                                               \
        "/srv/osm/src/planet-latest.osm.pbf"
osm2pgsql version 0.92.0 (64 bit id space)

Using built-in tag processing pipeline
Using projection SRS 3857 (Spherical Mercator)
Setting up table: planet_osm_point
Setting up table: planet_osm_line
Setting up table: planet_osm_polygon
Setting up table: planet_osm_roads
Allocating memory for dense node cache
Allocating dense node cache in one big chunk
Allocating memory for sparse node cache
Sharing dense sparse
Node-cache: cache=32110MB, maxblocks=513760*65536, allocation method=11
Mid: loading persistent node cache from /var/lib/flat_nodes/nodes.cache
Allocated space for persistent node cache file
Maximum node in persistent node cache: 0
Mid: pgsql, scale=100 cache=32110
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Reading in file: /srv/osm/src/planet-latest.osm.pbf
Using PBF parser.
Processing: Node(4160579k 2999.7k/s) Way(449114k 9.02k/s) Relation(1411380 33.78/s)
Standard exception processing relation id=2509838: TopologyException: side location conflict at 2029381.6000000001 6575291.75
Processing: Node(4160579k 2999.7k/s) Way(449114k 9.02k/s) Relation(3583520 48.28/s)
Standard exception processing relation id=5630039: TopologyException: side location conflict at 1987071.54 6601270.3499999996
Processing: Node(4160579k 2999.7k/s) Way(449114k 9.02k/s) Relation(4980590 55.02/s)
Standard exception processing relation id=7301465: TopologyException: side location conflict at 8275559.8899999997 6486836.4299999997
Processing: Node(4160579k 2999.7k/s) Way(449114k 9.02k/s) Relation(5337580 56.38/s)  parse time: 145842s
Node stats: total(4160579385), max(5199168802) in 1387s
Way stats: total(449114431), max(536786282) in 49789s
Relation stats: total(5337586), max(7692433) in 94666s
Maximum node in persistent node cache: 5199888383
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Mid: loading persistent node cache from /var/lib/flat_nodes/nodes.cache
Maximum node in persistent node cache: 5199888383
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using built-in tag processing pipeline
Mid: loading persistent node cache from /var/lib/flat_nodes/nodes.cache
Maximum node in persistent node cache: 5199888383
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using built-in tag processing pipeline
Mid: loading persistent node cache from /var/lib/flat_nodes/nodes.cache
Maximum node in persistent node cache: 5199888383
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using built-in tag processing pipeline
Mid: loading persistent node cache from /var/lib/flat_nodes/nodes.cache
Maximum node in persistent node cache: 5199888383
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using built-in tag processing pipeline
Mid: loading persistent node cache from /var/lib/flat_nodes/nodes.cache
Maximum node in persistent node cache: 5199888383
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using built-in tag processing pipeline
Mid: loading persistent node cache from /var/lib/flat_nodes/nodes.cache
Maximum node in persistent node cache: 5199888383
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using built-in tag processing pipeline
Mid: loading persistent node cache from /var/lib/flat_nodes/nodes.cache
Maximum node in persistent node cache: 5199888383
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using built-in tag processing pipeline

Going over pending ways...
        316151498 ways are pending

Using 7 helper-processes

Finished processing 316151498 ways in 56237 s

316151498 Pending ways took 56237s at a rate of 5621.77/s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads

Going over pending relations...
        0 relations are pending

Using 7 helper-processes
Finished processing 0 relations in 0 s

Committing transaction for planet_osm_point
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_line
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_polygon
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_roads
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_point
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_line
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_polygon
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_roads
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_point
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_line
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_polygon
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_roads
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_point
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_line
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_polygon
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_roads
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_point
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_line
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_polygon
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_roads
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_point
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_line
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_polygon
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_roads
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_point
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_line
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_polygon
WARNING:  there is no transaction in progress
Committing transaction for planet_osm_roads
WARNING:  there is no transaction in progress
Sorting data and creating indexes for planet_osm_point
Sorting data and creating indexes for planet_osm_line
Sorting data and creating indexes for planet_osm_polygon
Sorting data and creating indexes for planet_osm_roads
Stopping table: planet_osm_ways
Building index on table: planet_osm_ways
Stopping table: planet_osm_nodes
Stopped table: planet_osm_nodes in 0s
Stopping table: planet_osm_rels
Building index on table: planet_osm_rels
Killed
lonvia commented 6 years ago

I'd say that this is something where a pull request would be welcome.

pnorman commented 6 years ago

I'm iffy on this idea, because running the index steps requires a decent amount of knowledge of how far its got and what osm2pgsql does. We'll also have people run it on an import which died in pending ways, which won't error out, but will not produce a correct result.

lonvia commented 6 years ago

I'm all for making it easy for newbies but not at the expense of making life harder for advanced user than necessary. There are ways to mitigate misuse: mark the option as expert-only and/or add a info line to the output that tells when it is safe to use it. If pending ways does not error out properly then this something that should be fixed on its own, independently of this issue.

pnorman commented 6 years ago

If pending ways does not error out properly then this something that should be fixed on its own, independently of this issue.

It properly errors out, but if osm2pgsql terminates unexpectedly (e.g. as above) the database remains. This is true in all phases, and there's no possible way to clean up when osm2pgsql gets OOM killed. In any stage past the very start, all the tables exist, and the SQL that's run at the end will run without error. It just won't produce sensible output.

molofeev commented 6 years ago

same problem, my osm2pgsql output:

Node stats: total(4235496629), max(5290747187) in 33793s
Way stats: total(458618737), max(547543126) in 1300890s
Relation stats: total(5442482), max(7820612) in 741417s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using built-in tag processing pipeline
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using built-in tag processing pipeline
Killed

I want run only index mode and don't wont waiting 24 days again I need start import from 95 line in osm2pgsql file, its posible?

ghost commented 6 years ago

These timing are too high and you don't use lua filtering. Anyway, it is needed to exclude useless data. When you run with flatnodes, don't spend the memory needed by Postgres, use the default or less.

eugen-nw commented 4 years ago

How hard would it be to add a "please reindex" switch that ignores any *.pbf file argument and only rebuilds the indices?

soberdor commented 4 years ago

Yes please, this really looks like an all to natural feature - start over the part of the process that fails most frequently, instead of losing hours or even days of successfully performed steps.