josephramsay / lds_replicate

Replication scripts for LDS (LINZ Data Service).
http://data.linz.govt.nz/

Create unique constraint for PostgreSQL primary key columns #15

Open palmerj opened 11 years ago

palmerj commented 11 years ago

See here http://github.com/josephramsay/LDS/blob/master/LDSReplicate/lds/PostgreSQLDataStore.py#L138

Should be creating a unique constraint, e.g. "ALTER TABLE xxx.xxx ADD CONSTRAINT xxx_key UNIQUE (id)", rather than a plain index. This will create a unique index that ensures duplicate key values cannot exist in the table.

The same logic should be applied to the MSSQL code as well.
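
For illustration, a minimal sketch of the suggested change (schema, table, and constraint names here are placeholders, not actual LDS layer names):

```sql
-- Current behaviour: a plain, non-unique index on the primary key column
CREATE INDEX my_layer_id_idx ON my_schema.my_layer (id);

-- Suggested behaviour: a unique constraint, which is backed by a unique
-- index and rejects duplicate key values
ALTER TABLE my_schema.my_layer ADD CONSTRAINT my_layer_id_key UNIQUE (id);
```

The same ALTER TABLE ... ADD CONSTRAINT ... UNIQUE (...) syntax is accepted by SQL Server, so the MSSQL code path could use the same statement.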

josephramsay commented 11 years ago

A few problems here:

  1. Not all the layers have primary keys, and we cannot create a unique constraint on the geometry column itself since indexing it fails with "RuntimeError: ERROR: index row requires 14504 bytes, maximum size is 8191".
  2. We can't use functions in a unique constraint statement, so ALTER TABLE X ADD CONSTRAINT X_pk UNIQUE(st_geohash(...)) returns an error.
  3. GiST indexes cannot be unique: "ERROR: access method "gist" does not support unique indexes". (Points 2 and 3 are sketched just after this list.)
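
For reference, minimal sketches of the failing statements from points 2 and 3 (table and column names are hypothetical):

```sql
-- Point 2: an expression is not accepted inside a UNIQUE constraint
-- column list, so this statement is rejected outright
ALTER TABLE nzp_temp ADD CONSTRAINT nzp_temp_pk UNIQUE (st_geohash(shape));

-- Point 3: a unique GiST index is not supported
CREATE UNIQUE INDEX nzp_temp_gix ON nzp_temp USING gist (shape);
-- ERROR: access method "gist" does not support unique indexes
```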

A solution that seems to work in testing is to create a unique index based on a geohash of the feature. (Geohashes don't always seem to be unique either, but concatenating with an id field fixes that.)

create unique index nzp_temp_i on nzp_temp (lower(id||st_geohash(st_transform(st_centroid(shape),4167))));

NB. concat(x,y) doesn't work as an index function (PostgreSQL requires index expressions to be IMMUTABLE, and concat() isn't), though || works fine.
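
To show what that index enforces, a hypothetical duplicate insert (assuming id is a text column and shape is a PostGIS geometry):

```sql
INSERT INTO nzp_temp (id, shape)
    VALUES ('1', ST_GeomFromText('POINT(174.77 -41.28)', 4167));

-- The same id and geometry hash to the same index key, so the second
-- insert fails with a duplicate key error against nzp_temp_i
INSERT INTO nzp_temp (id, shape)
    VALUES ('1', ST_GeomFromText('POINT(174.77 -41.28)', 4167));
```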

This should create a unique condition for all (or most) tables, though there is a small performance hit.