openstreetmap / openstreetmap-website

The Rails application that powers OpenStreetMap
https://www.openstreetmap.org/
GNU General Public License v2.0

Create a rake task for populating the database #282

Open gravitystorm opened 11 years ago

gravitystorm commented 11 years ago

Currently the installation notes spend about 90 lines explaining how to populate the database from an extract. A substantial portion of this concerns osmosis options and resetting sequences.

This could be simplified by creating a rake task that calls osmosis with the configured database parameters and takes care of the sequences.
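A minimal sketch of the command-building half of such a task, not the project's actual rake task. The helper name `osmosis_import_args` and the config keys are assumptions for illustration; the osmosis options are the ones that appear in the commands quoted later in this thread.

```ruby
# Hypothetical helper: builds the osmosis argument list for an import.
# The config keys (:host, :database, :username, :password) are assumptions
# modelled on a typical Rails database configuration hash.
def osmosis_import_args(config, extract_path)
  [
    "osmosis",
    "--read-pbf", "file=#{extract_path}",
    "--write-apidb",
    "host=#{config.fetch(:host, 'localhost')}",
    "database=#{config.fetch(:database)}",
    "user=#{config.fetch(:username)}",
    "password=#{config.fetch(:password)}",
    "validateSchemaVersion=no"
  ]
end

# Passing the list to Kernel#system as separate arguments avoids shell quoting:
#   system(*osmosis_import_args(config, "extract.osm.pbf"))
```

Returning an argument list rather than a single string keeps the helper easy to test and sidesteps escaping problems with passwords or file names that contain spaces.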

tomhughes commented 11 years ago

Well, populating the database should very much be relegated to an "optional extras" section of the documentation - most people just looking to install the rails code in order to develop it should never need to do any sort of import.

So I don't object to this idea, but really the easiest solution for most people is not to bother, which also means you don't have to explain how to install osmosis, making things even simpler!

gravitystorm commented 11 years ago

Oh indeed - I'm just aiming to reduce the length of the 'optional extras' section too!

pnorman commented 11 years ago

Once the schema is set up it is "just" one osmosis command to load in data. Perhaps just redirect to the osmosis install instructions and give the command?

gravitystorm commented 11 years ago

The original instructions have lots of resetting of sequences - does osmosis now handle all of these?

pnorman commented 11 years ago

That's for loading data then editing it locally, which is more specialized. If all you need to do is load some data to have something to view, it's pretty easy. Perhaps have a rake task to set sequences to MAX(id) or whatever is appropriate, and run the task after loading data?
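The sequence-setting step could be sketched roughly as follows. The helper name `sequence_reset_sql` and the example table names are hypothetical, and the `<table>_id_seq` naming is only PostgreSQL's default for serial columns; the real apidb schema may name its tables and sequences differently.

```ruby
# Hypothetical helper: returns one SQL statement per table that moves the
# table's id sequence just past the highest id currently stored.
# With is_called = false, the next nextval() returns exactly the value given,
# so an empty table restarts at 1.
def sequence_reset_sql(tables)
  tables.map do |table|
    "SELECT setval('#{table}_id_seq', " \
      "(SELECT COALESCE(MAX(id), 0) + 1 FROM #{table}), false);"
  end
end

# Example with made-up table names:
#   puts sequence_reset_sql(%w[example_nodes example_ways])
```

A rake task could run each generated statement over the existing database connection after the import finishes.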

danstowell commented 10 years ago

Just to note that I'm currently hoping to try some tweaks to the /browse/ pages, which do require a data import, or else there are no items to browse. (Especially since, by default, the search box's results come from remote.)

At present there is no documentation (since the doc has gone from wiki into the CONFIG.md, which tells me literally nothing except this issue number), and no script. We are stuck! Please, something... (even if it's just a link to deprecated wiki instructions...)

tomhughes commented 10 years ago

@danstowell I usually just jump into the editor and draw something when I want to test something like that - that way I can be sure it will have the tags I need for whatever I'm testing.

But sure, I'm not going to refuse a patch that provides (sensible) instructions on how to load some data. I believe it is a non-trivial thing to do however, so they will need to be well written.

pnorman commented 10 years ago

> Just to note that I'm currently hoping to try some tweaks to the /browse/ pages, which do require a data import, or else there are no items to browse. (Especially since, by default, the search box's results come from remote.)

Are you wanting to import data, or import then edit data? The first is relatively easy - create the database then use the osmosis --write-apidb task to import data (keeping in mind that --write-apidb is amazingly slow). To do it properly (not giving superuser to every postgres account involved) can be a bit annoying.

If the latter, then you need to muck about with sequences so when you add a node it gets a suitable ID.

Having set up an apidb database a few times for testing, I can say that it was never properly documented anywhere.

danstowell commented 10 years ago

I only need to import data. (I need fairly rich data, so drawing a few ways myself isn't quite enough for me.)

Just for the record (for anyone drafting a rake task!), here's what I've done which seems to have given me a usable read copy of the apidb:

This does not reset the sequences, as mentioned above in this thread. However it doesn't seem to need any weird privileges in the postgres user accounts. But then, since I don't know the database internals I don't know if the sequences issue is the only quirk of the (development) database I've landed myself with.

msergiu80 commented 10 years ago

Hi everyone, I am pretty new to this, so sorry if I ask dumb questions. I set up a tile server on a machine and I would like to edit that map, not OpenStreetMap, with the Rails Port. How can I do that? I saw that application.yml points to the URL www.openstreetmap.org, so is there any way I can edit my map that's at www.embedded-systems.ro:1981 ? Thank you in advance.

tomhughes commented 10 years ago

@msergiu80 Your comment has nothing to do with this thread and in any case is not appropriate here - please ask on the dev or rails-dev mailing lists, or on #osm-dev on IRC if you need help using the code.

msergiu80 commented 10 years ago

@tomhughes Hi Tom, I tested the installation of the "rails port" until the point of "adding geographical data" when a yet-to-be-written script link was there to click on :) I thought the question was in the right place as I would like to populate my database with a map that I can work on from the web interface editor. I'll try to get help where you've pointed me :) Thanks.

hanchao commented 7 years ago

@danstowell Error occurred. Any help?

[hanchao@map ~]$ osmosis --truncate-apidb host="******" database="******" user="******" password="******" validateSchemaVersion="no"
Sep 19, 2016 5:14:58 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Osmosis Version 0.45
Sep 19, 2016 5:14:58 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Preparing pipeline.
Sep 19, 2016 5:14:58 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Launching pipeline execution.
Sep 19, 2016 5:14:58 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline executing, waiting for completion.
Sep 19, 2016 5:14:59 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline complete.
Sep 19, 2016 5:14:59 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Total execution time: 674 milliseconds.

[hanchao@map ~]$ osmosis --read-pbf file=china-latest.osm.pbf --write-apidb host="******" database="******" user="******" password="****" validateSchemaVersion="no"
Sep 19, 2016 5:15:11 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Osmosis Version 0.45
Sep 19, 2016 5:15:11 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Preparing pipeline.
Sep 19, 2016 5:15:11 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Launching pipeline execution.
Sep 19, 2016 5:15:11 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline executing, waiting for completion.
Sep 19, 2016 5:16:18 PM org.openstreetmap.osmosis.core.pipeline.common.ActiveTaskManager waitForCompletion
SEVERE: Thread for task 1-read-pbf failed
org.openstreetmap.osmosis.core.OsmosisRuntimeException: Unable to insert user with id 3642735 into the database.
        at org.openstreetmap.osmosis.apidb.v0_6.impl.UserManager.insertUser(UserManager.java:143)
        at org.openstreetmap.osmosis.apidb.v0_6.impl.UserManager.addOrUpdateUser(UserManager.java:191)
        at org.openstreetmap.osmosis.apidb.v0_6.ApidbWriter.process(ApidbWriter.java:1098)
        at crosby.binary.osmosis.OsmosisBinaryParser.parseDense(OsmosisBinaryParser.java:138)
        at org.openstreetmap.osmosis.osmbinary.BinaryParser.parse(BinaryParser.java:124)
        at org.openstreetmap.osmosis.osmbinary.BinaryParser.handleBlock(BinaryParser.java:68)
        at org.openstreetmap.osmosis.osmbinary.file.FileBlock.process(FileBlock.java:135)
        at org.openstreetmap.osmosis.osmbinary.file.BlockInputStream.process(BlockInputStream.java:34)
        at crosby.binary.osmosis.OsmosisReader.run(OsmosisReader.java:45)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "users_display_name_idx"
  Detail: Key (display_name)=(Nodes&Roads) already exists.
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2103)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1836)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:512)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
        at org.openstreetmap.osmosis.apidb.v0_6.impl.UserManager.insertUser(UserManager.java:140)
        ... 9 more

Sep 19, 2016 5:16:18 PM org.openstreetmap.osmosis.core.Osmosis main
SEVERE: Execution aborted.
org.openstreetmap.osmosis.core.OsmosisRuntimeException: One or more tasks failed.
        at org.openstreetmap.osmosis.core.pipeline.common.Pipeline.waitForCompletion(Pipeline.java:146)
        at org.openstreetmap.osmosis.core.Osmosis.run(Osmosis.java:92)
        at org.openstreetmap.osmosis.core.Osmosis.main(Osmosis.java:37)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launchStandard(Launcher.java:330)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:238)
        at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
        at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
        at org.codehaus.classworlds.Launcher.main(Launcher.java:47)

mojodna commented 7 years ago

@hanchao this is the root cause: https://github.com/drolbr/Overpass-API/issues/257 (even if it didn't come from Overpass).

Here's a shell script that will remap user IDs after the extract has been converted to XML: https://github.com/AmericanRedCross/posm-replay-tool/blob/c25d8e1f62af44e0664190723eba51eef7b93adc/remap-userid.sh

My notes in https://github.com/AmericanRedCross/posm-replay-tool/blob/28deac4193859b71af34d845f013194a37871870/LOCAL.md#initialization explain what's going on and steps to fix it (locally).

hanchao commented 7 years ago

@mojodna Thanks. This helped greatly.

User IDs 3642735 and 1708958 have the same name (Nodes&Roads):

<node id="1546831475" version="2" timestamp="2016-08-31T17:30:44Z" uid="3642735" user="Nodes&amp;Roads" changeset="41831739" lat="34.6455964" lon="110.3129604"/>
<node id="1582851239" version="3" timestamp="2016-02-18T20:41:39Z" uid="1708958" user="Nodes&amp;Roads" changeset="37297109" lat="38.9261979" lon="113.8765729"/>

https://www.openstreetmap.org/api/0.6/node/1582851239

<node id="1582851239" visible="true" version="3" changeset="37297109" timestamp="2016-02-18T20:41:39Z" user="georhoko" uid="1708958" lat="38.9261979" lon="113.8765729"/>

The name in china-latest.osm.pbf has not been updated: http://download.geofabrik.de/asia/china-latest.osm.pbf
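A rough way to check an extract for this condition before importing, once it has been converted to XML. This is only a sketch: the helper name `conflicting_display_names` is hypothetical, and it assumes one element per line, as in the snippets above, rather than parsing the XML properly.

```ruby
# Hypothetical helper: scans OSM XML text and returns the display names that
# appear with more than one uid, the condition behind the
# "users_display_name_idx" unique-constraint failure shown earlier.
# Names are left XML-escaped, exactly as they appear in the file.
def conflicting_display_names(xml)
  uids_by_name = Hash.new { |hash, name| hash[name] = [] }
  xml.each_line do |line|
    uid = line[/\buid="(\d+)"/, 1]
    name = line[/\buser="([^"]*)"/, 1]
    next unless uid && name

    uids_by_name[name] << uid unless uids_by_name[name].include?(uid)
  end
  uids_by_name.select { |_name, uids| uids.size > 1 }
end
```

The two attributes are matched separately because their order differs between the extract and the API response quoted above.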

tomhughes commented 7 years ago

@maxdeepfield Your comment has nothing to do with this thread and in any case is not appropriate here - please ask on the dev or rails-dev mailing lists, or on #osm-dev on IRC if you need help using the code.

maxdeepfield commented 4 years ago

Reading "about 90 lines explaining how to populate the database" is not a very hard task if those lines exist and work. Any progress on this?

maxdeepfield commented 4 years ago

Anyway, osmosis --write-apidb works fine with Geofabrik extracts. After the import I can see and edit data, then get the result via osmosis --read-apidb-current, which together with --write-pbf gives a new dataset.

Also it works with xml/osm files exported directly from openstreetmap website via bounding box, so it is super easy to test on "real" data.

If the processes finish without errors, do I need to worry about anything? How would I actually notice not being "able to edit the data I have loaded"?

prusswan commented 4 years ago

Managed to import my regional extract from Geofabrik after resolving various data issues. These can be resolved with some database knowledge and a background understanding of the issues involved (e.g. data extracts need to satisfy some integrity conditions); see #1988, #2449, #2543.

Anyway, I feel the real issue now may be that users don't realize "populating the database from an OSM extract" requires the data being imported to meet certain conditions for the step to "just work" (and possibly for the proposed rake task to run without getting tripped up). Would it help to have another section covering the kind of data preparation which may be required?

victorovento commented 2 years ago

Well, I think this script will never be written.

rkoeze commented 2 months ago

@gravitystorm it seems like some combination of osmosis with the clipIncompleteEntities=true argument and then a bunch of ALTER SEQUENCE tablename_id_seq RESTART WITH n; SQL commands to update the sequences might work here. Are you still interested in this rake task being written?

I suppose one disadvantage is that it's just another thing to maintain (for example, every time a new sequence is introduced this command will have to be updated). An alternative might be updating the docs with general instructions for importing referentially complete entities but then letting developers do the work themselves.

If you do still think this would be worthwhile I'd be happy to put up a draft PR for feedback. Thanks!

rkoeze commented 1 month ago

@tomhughes wanted to ask if you have any feedback on my comment above as well. Meant to @ you originally. Thanks in advance for your time.