osm-search / Nominatim

Open Source search based on OpenStreetMap data
https://nominatim.org
GNU General Public License v3.0
3.22k stars 716 forks source link

Getting duplicate key violation... #468

Closed SuberFu closed 8 years ago

SuberFu commented 8 years ago

I'm having trouble with the updating a North America import. It's giving me this error when executing ./utils/update.php --import-osmosis --no-npi

COPY_END for place failed: ERROR: duplicate key value violates unique constraint "idx_place_osm_unique" DETAIL: Key (osm_id, osm_type, class, type)=(40385299, N, highway, motorway_junction) already exists.

There was a previous update that failed due to erroneous postgis version (needing a function that didn't exist). Could that have caused this?

These are my versions: Postgresql 9.5 Postgis 2.2.2 osmosis 0.45 Nominatim 2.5.0

P.S. If that's a case, is there a procedure to, say, create a diff (.osc) file by comparing the postgresql database and a planet (osm) file and then using that to import to prevent the above "duplicate key" issues?

lonvia commented 8 years ago

This should not happen even after a previous update errored out. Nominatim is normally able to roll back the changes correctly.

Some more info would be helpful. Please provide the output of psql -d nominatim -c "select * from place where osm_type = 'N' and osm_id = 40385299"and the entry for node 40385299 from the current osc (data/osmosischange.osc). For the latter, please also check that the node does not appear multiple times in your file.

SuberFu commented 8 years ago

I just retried reinitializing osmosis (delete configuration, state and the osc file before running osmosis init) and got the same error on different key. Below is the full error.

/home/centos/Nominatim-2.5.0/osm2pgsql/osm2pgsql -klas --number-processes 1 -C 2000 -O gazetteer -d nominatim -P 5432 /home/centos/Nominatim-2.5.0/data/osmosischange.osc
osm2pgsql SVN version 0.89.0-dev (64bit id space)

Using projection SRS 4326 (Latlong)
Allocating memory for dense node cache
Allocating dense node cache in one big chunk
Allocating memory for sparse node cache
Sharing dense sparse
Node-cache: cache=2000MB, maxblocks=256000*8192, allocation method=11
Mid: pgsql, scale=10000000 cache=2000
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels

Reading in file: /home/centos/Nominatim-2.5.0/data/osmosischange.osc
Using XML parser.
COPY_END for place failed: ERROR:  duplicate key value violates unique constraint "idx_place_osm_unique"
DETAIL:  Key (osm_id, osm_type, class, type)=(36395349, N, tourism, attraction) already exists.
CONTEXT:  COPY place, line 1

Error occurred, cleaning up
Error: 1

This is the select statement and result.

nominatim=# select * from place where osm_id=36395349 and osm_type='N' and class='tourism' and type='attraction';
 osm_type |  osm_id  |  class  |    type    |                        name                         | admin_level | housenumber | street | addr_place | isin | postcode | country_code | extratags |                      geometry
----------+----------+---------+------------+-----------------------------------------------------+-------------+-------------+--------+------------+------+----------+--------------+-----------+----------------------------------------------------
 N        | 36395349 | tourism | attraction | "name"=>"Headwaters (nominal source of L.A. River)" |         100 |             |        |            |      |          |              |           | 0101000020E6100000D4B6611484A65DC031EC3026FD184140
(1 row)

This is the osc entry (there're no duplicate entries). I included the root xml tag, the relevant tag is the second <node> (<node id="36395349" version="26" timestamp="2016-06-03T09:36:42Z" uid="703694" user="Clarke22" changeset="39774631" lat="34.1952225" lon="-118.6018548">

 <modify>
    <node id="36299274" version="5" timestamp="2016-06-03T18:40:00Z" uid="1213011" user="Werewombat" changeset="39785355" lat="45.2361492" lon="-120.1848151"/>
    <node id="36395349" version="26" timestamp="2016-06-03T09:36:42Z" uid="703694" user="Clarke22" changeset="39774631" lat="34.1952225" lon="-118.6018548">
      <tag k="name" v="Headwaters (nominal source of L.A. River)"/>
      <tag k="note" v="Confluence of the LA River headstreams (Arroyo Calabasas and Bell Creek).  This is the uppermost place where the name &apos;Los Angeles River&apos; is used."/>
      <tag k="tourism" v="attraction"/>
      <tag k="waterway" v="source"/>
    </node>
    <node id="37903135" version="3" timestamp="2016-06-03T03:42:28Z" uid="3125856" user="cowdog" changeset="39769221" lat="45.8807396" lon="-122.8894359"/>
    <node id="37903139" version="3" timestamp="2016-06-03T03:42:28Z" uid="3125856" user="cowdog" changeset="39769221" lat="45.8808558" lon="-122.8874615"/>
  </modify>

Could it be that osmosis now support a "modify" tag? As such that the osm2pgsql included in 2.5.0 is not handling correctly?

Thank you.

lonvia commented 8 years ago

Nothing unusual there. I wonder if you accidentally lost all triggers on the place table. This is what you should see:

me@machine:~$ psql nominatim
psql (9.4.8)
Type "help" for help.

nominatim=# \d place
                Table "public.place"
[...]
Triggers:
    place_before_delete BEFORE DELETE ON place FOR EACH ROW EXECUTE PROCEDURE place_delete()
    place_before_insert BEFORE INSERT ON place FOR EACH ROW EXECUTE PROCEDURE place_insert()

nominatim=# 

If there is no trigger section, reinstall all functions like this:

./utils/setup.php --create-functions --create-partition-functions --enable-diff-updates

Then check again for the triggers. If there is still nothing, check the error logs of postgresql.

SuberFu commented 8 years ago

I'm still missing trigger after running that function. There's no error.log in postgresql from running setup.php.

EDIT: Running select * from pg_trigger shows no triggers in the database (the trigger list is empty).

Btw, I got the Nominatim code from this link. http://www.nominatim.org/release/Nominatim-2.5.0.tar.bz2

SuberFu commented 8 years ago

Thank you for pointing to the potential missing triggers. Solved it by copying the follow block of code from sql/table.sql

-- insert creates the location tagbles, creates location indexes if indexed == true
CREATE TRIGGER placex_before_insert BEFORE INSERT ON placex
    FOR EACH ROW EXECUTE PROCEDURE placex_insert();

-- update insert creates the location tables
CREATE TRIGGER placex_before_update BEFORE UPDATE ON placex
    FOR EACH ROW EXECUTE PROCEDURE placex_update();

-- diff update triggers
CREATE TRIGGER placex_before_delete AFTER DELETE ON placex
    FOR EACH ROW EXECUTE PROCEDURE placex_delete();
CREATE TRIGGER place_before_delete BEFORE DELETE ON place
    FOR EACH ROW EXECUTE PROCEDURE place_delete();
CREATE TRIGGER place_before_insert BEFORE INSERT ON place
    FOR EACH ROW EXECUTE PROCEDURE place_insert();

And execute it directly on the database. I'm not sure if that truly solves it, but at least it's now at the Processing: Node(20k 0.3k/s) Way(0k 0.00k/s) Relation(0 0.00/s) stage.

lonvia commented 8 years ago

Right, sorry, the triggers are created only with the tables. Glad to see you found the right SQL commands. In theory, this should fix your issue. In practice, it's hard to say what else broke when the trigger got deleted (or rather I suspect that the trigger creation was simply skipped). So, you might have to reinstall your database at some point, if the data looks too odd.

If this was really due to a bad postgis version in your configuration, then that issue is already fixed on master because newer versions determine the postgres and postgis version automatically. So, closing here. Please open a new ticket should you find a reproducible way to get the database in this messed up state.