josephramsay / lds_replicate

Replication scripts for LDS (LINZ Data Service).
http://data.linz.govt.nz/

Avoid column deletions on destination #7

Closed josephramsay closed 11 years ago

josephramsay commented 11 years ago

Since DB column deletes leave columns intact but hidden (until vacuum), we should avoid performing these operations on client destination tables.

  1. Replicate using featureCopy() only. Speed is an issue here, since feature-by-feature (F-b-F) copying is an order of magnitude slower; 772 would take days to replicate.
  2. Employ an intermediate memory-resident DS. Depending on the size of the layer being replicated, memory overruns are possible. Page size modifications might help here.

josephramsay commented 11 years ago

Using a Memory datasource addresses this, but makes the conversion operation memory dependent and negates some of the benefits of pagination. File datasources have other issues, including limited editability and data truncation.

Results of testing:

  1. Driver copy using memory datasource; fastest, but gives out-of-memory exceptions on large datasets. This is probably only an issue on memory-constrained systems.
  2. Driver copy using GMT/GeoJSON/MapInfo temporary datasource; these interim types don't allow structural editing, so columns cannot be deleted.
  3. Driver copy using ESRI Shapefile temporary datasource; attribute names get truncated, which would be a problem for subsequent (i.e. incremental) updates.
  4. Driver copy using AutoCAD DXF; error returned, "DXF layer does not support arbitrary field creation"... ugh
  5. Feature copy; slow, with assorted errors on large datasets (estimate 772 would take up to 3 days to replicate, and it has never succeeded).
  6. Driver copy allowing column deletion; success. 1 hour to complete 772. Increases destination table size temporarily.

This selection is left to the user, who can modify the MISC/temptable parameter in the main/user config file.
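
As a rough illustration of how such a setting might be read, here is a minimal sketch using Python's standard configparser. Only the MISC section and the temptable option name come from the comment above; the sample file contents, the value names MEMORY and DIRECT, the default, and the helper read_temptable are all assumptions made for this example and may not match the project's actual config handling.

```python
import configparser

# Hypothetical config text; the real file layout and accepted values may differ.
SAMPLE_CONFIG = """
[MISC]
temptable = MEMORY
"""

# Assumed strategy names, for illustration only.
KNOWN_STRATEGIES = {"MEMORY", "DIRECT"}


def read_temptable(config_text, default="DIRECT"):
    """Return the MISC/temptable value, falling back to an assumed default."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # fallback covers both a missing section and a missing option.
    value = parser.get("MISC", "temptable", fallback=default).strip().upper()
    if value not in KNOWN_STRATEGIES:
        raise ValueError("unknown temptable setting: %r" % value)
    return value


if __name__ == "__main__":
    print(read_temptable(SAMPLE_CONFIG))
```

Rejecting unknown values early keeps a typo in the config from silently selecting a slower or memory-hungry copy path.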

NB. 772 = NZ Primary Parcels

josephramsay commented 11 years ago

Size/Speed test comparison

v:x783 (5648 rows) : Memory DS - 2328kB/5.0s, Direct DS - 2504kB/6.9s

v:x794 (1040424 rows) : Memory DS - 313MB/11m52s, Direct DS - 344MB/11m47s

select pg_size_pretty(pg_total_relation_size('nz_survey_plans'));
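
To put the figures above in relative terms, the following sketch computes the extra table size of the Direct DS run over the Memory DS run, and the throughput in rows per second. The numbers are copied from the comparison; the function names and the dictionary layout are only illustrative.

```python
# Figures copied from the size/speed comparison above.
# Sizes stay in the units reported (kB or MB); only ratios are computed.
RESULTS = {
    "v:x783": {"rows": 5648,
               "memory": {"size": 2328, "seconds": 5.0},
               "direct": {"size": 2504, "seconds": 6.9}},
    "v:x794": {"rows": 1040424,
               "memory": {"size": 313, "seconds": 11 * 60 + 52},
               "direct": {"size": 344, "seconds": 11 * 60 + 47}},
}


def size_overhead_pct(memory_size, direct_size):
    """Percentage by which the Direct DS table is larger than the Memory DS one."""
    return 100.0 * (direct_size - memory_size) / memory_size


def rows_per_second(rows, seconds):
    """Average replication throughput."""
    return rows / seconds


if __name__ == "__main__":
    for layer, r in RESULTS.items():
        overhead = size_overhead_pct(r["memory"]["size"], r["direct"]["size"])
        mem_rate = rows_per_second(r["rows"], r["memory"]["seconds"])
        dir_rate = rows_per_second(r["rows"], r["direct"]["seconds"])
        print("%s: direct is %.1f%% larger; %.0f rows/s (memory) vs %.0f rows/s (direct)"
              % (layer, overhead, mem_rate, dir_rate))
```

On these two layers, the Direct DS tables come out under 10% larger, and on the larger layer the run times are nearly identical.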