CMASCenter / Spatial-Allocator

Surrogate Tools, Raster Tools, and Vector Tools Open Source GIS
27 stars 16 forks source link

Notes on shapefile loading #3

Open cseppan opened 7 years ago

cseppan commented 7 years ago

I created shapefiles from the existing surrogate processing that contain just the columns needed and have the geometries already converted to the EPA's grid projection. These are in /proj/ie/proj/SA/pg_srgcreate/shp_export/.

The first step to loading these shapefiles is to define the grid projection in the spatial_ref_sys table in the database.

INSERT INTO spatial_ref_sys (srid, srtext, proj4text) VALUES
(900921, 'PROJCS["Lambert_Conformal_Conic",GEOGCS["GCS_Sphere_WRF",DATUM["Sphere_WRF",SPHEROID["Sphere_WRF",6370000.0,0.0]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Lambert_Conformal_Conic_2SP"],PARAMETER
 ["false_easting",0.0],PARAMETER["false_northing",0.0],PARAMETER["central_meridian",-97.0],PARAMETER["standard_parallel_1",33.0],PARAMETER["standard_parallel_2",45.0],PARAMETER["latitude_of_origin",40.0],UNIT["Meter",1.0"]]', 
'+proj=lcc +lat_1=33 +lat_2=45 +lat_0=40 +lon_0=-97 +x_0=0 +y_0=0 +a=6370000 +b=6370000 +units=m +no_defs');

Or from the command line:

psql -h myserver -d mydb -U myuser -f create_900921.sql

To load a shapefile into a Postgres table, use the shp2pgsql command:

shp2pgsql -s 900921 -g geom_900921 -D -I ./pil2016_16aug.shp public.pil2016_16aug | psql -h myserver -d mydb -U myuser

The flag -s 900921 indicates that the geometries inside the pil2016_16aug shapefile use the projection corresponding to SRID 900921 in the spatial_ref_sys table.

-g geom_900921 specifies the name of the geometry column in the created database table.

-D uses the Postgres data dump format.

-I creates an index on the geometry column after the data is imported.

./pil2016_16aug.shp is the shapefile to read.

public.pil2016_16aug is the schema and table name to create.

psql -h myserver -d mydb -U myuser is the connection information for the Postgres server.

Generic shapefile import

When working with other shapefiles, the ogr2ogr tool may be more useful than shp2pgsql. It has more options than shp2pgsql, including automatically determining the projection based on the shapefile's .prj file.

ogr2ogr -f "PostgreSQL" "PG:dbname=mydb" shapefile.shp -nln schema.table -nlt PROMOTE_TO_MULTI

Shapefiles can have geometries that PostGIS considers invalid. To fix these geometries, use the ST_MakeValid function.

UPDATE schema.table 
   SET wkb_geometry = ST_MakeValid(wkb_geometry) 
 WHERE NOT ST_IsValid(wkb_geometry);

After the shapefile has been loaded, a new geometry column corresponding to the grid's projection needs to be created. For example, assuming the grid SRID is 900921:

ALTER TABLE schema.table ADD COLUMN geom_900921 geometry(geomtype, 900921);
CREATE INDEX ON schema.table USING GIST (geom_900921);
UPDATE schema.table SET geom_900921 = ST_Transform(wkb_geometry, 900921);

where geomtype = MultiPolygon, Point, etc. matching the original geometry column.

CMASCenter commented 7 years ago

I've run through these steps, adding the exported data to the database and also some new data. I've scripted some of these steps here: /proj/ie/proj/EMAQ/Platform/Surrogates/2014/Spatial-Allocator/pg_srgcreate.

I'll get a hold of the java tools and give that a try for creating the surrogates.

cseppan commented 7 years ago

I've added newly exported shapefiles to /proj/ie/proj/SA/pg_srgcreate/shp_export/:

For the pop/housing shapefile, I wasn't sure which fields are needed, so I exported everything. I'm not sure if the multiple geometry columns will cause problems or if the extras just get imported as strings.

CMASCenter commented 7 years ago

For the fix geometry steps, how do you know if PostGIS considers the geometries to be invalid? and does this need to be done for all shapefiles?

Another question is when you say, "a new geometry column corresponding to the grid's projection needs to be created", you mean the modeling grid projection, not the shapefile projection, right?

cseppan commented 7 years ago

To check for invalid geometries, you can run a query like:

SELECT COUNT(*) FROM schema.table
 WHERE NOT ST_IsValid(wkb_geometry);

If the count comes back greater than zero, there are invalid geometries.

Regarding the new geometry column, that refers to transforming the geometry from the shapefile's original projection into the output modeling grid projection.

joellenb commented 7 years ago

A table can have multiple geometry columns, each of which is in a different projection. If the geometries in the first projection are all made valid, and then those are projected and saved to a new column, those geometries are also valid.

CMASCenter commented 7 years ago

Is there anyway to generalize the weight processing script to run on any shapefile:

/proj/ie/proj/EMAQ/Platform/Surrogates/2014/Spatial-Allocator/pg_srgcreate/PGgrid_12km_scripts/acs_2014_prcs_script.sh.

This script has all kinds of hard coded attributes. Can we just pass attributes along from the input "uncut" file to allow this script to be extensible?