openstreetmap / trac-tickets

Archived Trac Tickets
1 stars 1 forks source link

Osmosis crashes when loading planet.osm #4597

Open openstreetmap-trac opened 3 years ago

openstreetmap-trac commented 3 years ago

Reporter: marcelo[at]picoli.com.br [Submitted to the original trac issue database at 10.15am, Thursday, 27th September 2012]

I've been trying to load the full planet.osm data file into postgreSQL for more than two months now... and ran into a completely new problem.

The first time I tried was with a planet.osm from the beginning of August/2012, and it loaded all the nodes, ways and relations, but when it was loading the current tables, a power failure resetted the machine and I was left with only the history tables loaded. Tried loading them by hand since the history tables were the same as the current tables (no changes were applied). Had other problems not related, and gave up.

I decided to download a new version of the planet.osm, this one is from 02/Sep/2012. Unpacked everything, deleted all the tables, and tried:

osmosis -v --read-xml file=g:\OSM\planet-latest.osm --buffer bufferCapacity=100000 --log-progress --write-apidb host=localhost database=postgres user= password=

Osmosis version was 0.40.1, and postgreSQL version was 9.1

It ran for about 6 days, loaded all nodes, and at about 90% of the ways, it crashed. The message for this error was not saved, but the most critical part ("cannot have morre than 2^32-1 commands in a transaction") is the same.

Then, postgreSQL 9.2 was released, and I decided to upgrade both postgreSQL (to 9.2) and osmosis (to 0.41 - the latest at the time). Also got and used the latest schema version. Ended doing a clean postgreSQL install, and tried again with the command:

osmosis -v --fast-read-xml file=g:\OSM\planet-latest.osm --buffer bufferCapacity=100000 --log-progress --write-apidb host=localhost:5433 database=postgres user= password=

After one week, crashed with the same error (the full message is attached):

ERRO: no pode ter mais do que 2^32-1 comandos em uma transao Onde: comando SQL "SELECT 1 FROM ONLY "osm"."ways" x WHERE "way_id" OPERATOR(pg_catalog.=) $1 AND "version" OPERATOR(p g_catalog.=) $2 FOR SHARE OF x"

Translation:

ERROR: Cannot have more than 2^32-1 commands in a transaction Where: SQL command is "SELECT 1 FROM ONLY "osm"."ways" x WHERE "way_id" OPERATOR(pg_catalog.=) $1 AND "version" OPERATOR(p g_catalog.=) $2 FOR SHARE OF x"

Something changed (or grew) beyond some internal limit in postgreSQL and now it is impossible to load the full planet.osm data file.

I do not have the time available to retry loading an older planet.osm and see if the error appears again.

I believe that osmosis shouldn't encapsulate ALL the commands for the FULL planet.osm load in a single transaction (or at least I should have the option of not doing it). I also believe that committing the nodes, then the ways and finally the relations would be only a temporary measure, since the database WILL grow to more than 2^32-1 nodes and (given enough time) to more than 2^32-1 ways. Or, maybe postgreSQL should start using a 64 bit value for the commands in a transaction...

openstreetmap-trac commented 3 years ago

Author: brett[at]bretth.com [Added to the original trac issue at 10.52am, Saturday, 27th October 2012]

I believe the right fix for this is to modify the implementation to import data using the PostgreSQL COPY command. The COPY command is used to load the pgsnapshot schema and is far more efficient. This is not a trivial undertaking and may not get done for some time unless somebody else steps in to assist.

openstreetmap-trac commented 3 years ago

Author: marcelo[at]picoli.com.br [Added to the original trac issue at 10.36am, Monday, 29th October 2012]

I tried to load a PBF planet.osm file from the first days of August, 2012, and the crash did not happen at all. Main tables loaded succesfully, now loading the current tables. So, indeed, there is some kind of limit that was crossed between August, 2012 and September, 2012.

I also believe that the COPY command would not solve the problem, at least not without the use of intermediate TEXT ou CSV files, because of the keys that relate one kind of object to the other. However, I am not a PostgreSQL specialist.

openstreetmap-trac commented 3 years ago

Author: marcelo[at]picoli.com.br [Added to the original trac issue at 8.59pm, Saturday, 22nd December 2012]

Downloaded the source code for osmosis directly from the git repository, and changed the file responsible for the database context:

Compiled successfully, now trying to load a planet.osm file from 11/Dec/2012. Nodes and node tags are appearing instantly on the tables (before the changes, they only appeared after all the history tables were loaded - with the old planet.osm file described before).

I'll post an update in a few days, with the success (or failure) status.

P.S.: I KNOW it is not a good practice to do this, but if it works, I'll be able to try a more refined solution. Nothing was committed to the git repository as of now.

P.P.S.: Performance seems to be noticeably lower than before the changes.

openstreetmap-trac commented 3 years ago

Author: marcelo[at]picoli.com.br [Added to the original trac issue at 1.13am, Friday, 4th January 2013]

Success! The history tables are loaded, and currently a few percent into loading the current tables. Using the same planet.osm that gave me the error before, but no error now.

Therefore it is confirmed that the problem is directly linked to the usage of a single transaction encapsulating all the operations of the loading of the full planet.osm file.

In my opinion, the problem here may be treated in two distinct ways:

openstreetmap-trac commented 3 years ago

Author: marcelo[at]picoli.com.br [Added to the original trac issue at 11.01am, Tuesday, 26th March 2013]

Current tables are fully loaded now (in truth, for more than one month now, OSM was not high in my priorities for the last months), now updating the database, catching up with many months of delay. Using the "stock" version of osmosis (with transactions enabled).

Ran into other (serious) problems with the "populateCurrentTables" option, but this is material for another, separated, ticket.

Therefore no doubt exists that the problem is solvable by the (temporary) disabling of transactions in osmosis through changes in the source code when loading the full planet.osm file.

openstreetmap-trac commented 3 years ago

Author: baditaflorin [Added to the original trac issue at 2.04pm, Tuesday, 15th September 2015]

This bug still exists ?

I am waiting for six days, and i think i will have to wait some more

If i will wait, i would like to know if it will finish or not

http://www.openstreetmap.org/user/baditaflorin/diary/35844#comments

openstreetmap-trac commented 3 years ago

Author: marcelo[at]picoli.com.br [Added to the original trac issue at 2.57pm, Tuesday, 15th September 2015]

Replying to [comment:6 baditaflorin]:

This bug still exists ?

I am waiting for six days, and i think i will have to wait some more

If i will wait, i would like to know if it will finish or not

http://www.openstreetmap.org/user/baditaflorin/diary/35844#comments

Sorry, I do not know if the bug still persists. Work (and family) demands have left me with far too little time available for OSM-related tasks.

openstreetmap-trac commented 3 years ago

Author: scai [Added to the original trac issue at 11.33am, Thursday, 22nd December 2016]

According to https://help.openstreetmap.org/questions/53591/error-occurs-when-use-osmosis-command-to-import-planet-latestosmpbf-file the problem still exists with the latest version 0.45.