dimitri / pgloader

Migrate to PostgreSQL in a single command!
http://pgloader.io
Other
5.42k stars 546 forks source link

Postgis column loader #812

Closed pcreso closed 6 years ago

pcreso commented 6 years ago

This is a howto request, rather than bug report or feature request.

Postgis is a widely used spatial data management extension for Postgresql. I would like to be able to use pgloader to load data into Postgis geometry columns in a Postgres table.

A simple use case would be to load a CSV with X (longitude) and Y (latitude) values into a Postgis POINT geometry column. Using SQL, the approach taken is to take the two ordinates, generate a point geometry value from them, then specify the SRID (coordinate system) used for the point. Generally the SRID is identified using the EPSG code, as Postgis provides a lookup capability based on these codes.

if I read data in (say from CSV):

1,tan0104,1,173.567,-43.678

to a Postgis table:

create table station
(id         int     primary key,
 trip       char(7)
 station_no int,
 lon_s      decimal(7,4),
 lat_s      decimal(6,4),
 startp     geometry(POINT,4326));

the startp column is populated by the SQL:

startp=ST_SetSRID(ST_MakePoint(lon_s,lat_s),4326)

This creates a point feature from the coordinates, and specifies the coordinate reference system as lat/lon degrees WGS84 (EPSG:4326) to match the column specification then it was created.

Thanks,

Brent Wood

dimitri commented 6 years ago

For an example that does something similar, have a look at the following test case: https://github.com/dimitri/pgloader/blob/master/test/csv-districts.load. The following syntax allows you to inject new data from what you read in the CSV file, combining two columns in a single one the way you like:

     TARGET COLUMNS
      (
         usps, geoid, aland, awater, aland_sqmi, awater_sqmi,
         location point using (format nil "(~a,~a)" intptlong intptlat)
      )

So in your case, it would look like:

     TARGET COLUMNS
      (
         id, trip, station_no, lon_s, lat_s,
         startp geometry using (format nil "ST_SetSRID(ST_MakePoint(~a,~a),4326)" lon_s lat_s)
      )
bsusensjackson commented 6 years ago

Hello,

I am currently trying to run something similar but am receiving:

CONTEXT: COPY provider, line 1, column point: "ST_SetSRID(ST_MakePoint(0, 0), 4326)"
2018-08-06T22:24:42.415000+01:00 ERROR PostgreSQL ["\"public\".\"locations\""] Database error XX000: parse error - invalid geometry
HINT: "ST" <-- parse error at position 2 within geometry

In my config file I have:

 point geometry using (format nil "ST_SetSRID(ST_MakePoint(~a, ~a), 4326)" lat lng),

Using:

pgloader version "3.5.5ca3ee8"

Thanks.

dimitri commented 6 years ago

Sorry about my earlier comment, it was wrong. The transformation function is executed from within pgloader, and the PostGIS function can only be executed once the data is in PostgreSQL. The solution looks like loading the data as PostgreSQL point and then using PostGIS specific functions to turn that data into a geometry column, with an SQL UPDATE statement.

Uto0op commented 6 months ago

In my case, I'm trying to load a csv column which is a list of coordinates, e.g. [[-8.639847,41.159826],[-8.640351,41.159871],[-8.642196,41.160114],[-8.644455,41.160492],...], into Postgres.

I've tried something like create a target geometry column in postgres: polyline GEOMETRY(LINESTRING, 4326)

and in the .load file, target this geometry column by TARGET TABLE geolite.location ( polyline geometry(LineString, 4326) -- a list of coordinates of WGS84 format ), which failed with error message: `KABOOM! ESRAP-PARSE-ERROR: At

              timestamp,day_type,missing_value,
              polyline geometry(LineString, 43
                      ^ (Line 29, Column 24, Position 906)

In context CSV-TARGET-COLUMN:

While parsing KW-USING. Expected:

 the character Tab

or the character Newline or the character Return or the character Space or the string "--" or the string "/*" or the string "using"

While parsing CSV-BARE-FIELD-NAME. Problem:

The production

#\(

does not satisfy the predicate ALPHA-CHAR-P.

Expected:

 the character $ (DOLLAR_SIGN)

or the character . (FULLSTOP) or the character (LOW_LINE) or any character satisfying DIGIT-CHAR-P or any character satisfying ALPHA-CHAR-P An unhandled error condition has been signalled: At

              timestamp,day_type,missing_value,
              polyline geometry(LineString, 43
                      ^ (Line 29, Column 24, Position 906)

In context CSV-TARGET-COLUMN:

While parsing KW-USING. Expected:

 the character Tab

or the character Newline or the character Return or the character Space or the string "--" or the string "/*" or the string "using"

While parsing CSV-BARE-FIELD-NAME. Problem:

The production

#\(

does not satisfy the predicate ALPHA-CHAR-P.

Expected:

 the character $ (DOLLAR_SIGN)

or the character . (FULLSTOP) or the character (LOW_LINE) or any character satisfying DIGIT-CHAR-P or any character satisfying ALPHA-CHAR-P

What I am doing here?

At

              timestamp,day_type,missing_value,
              polyline geometry(LineString, 43
                      ^ (Line 29, Column 24, Position 906)

In context CSV-TARGET-COLUMN:

While parsing KW-USING. Expected:

 the character Tab

or the character Newline or the character Return or the character Space or the string "--" or the string "/*" or the string "using"

While parsing CSV-BARE-FIELD-NAME. Problem:

The production

#\(

does not satisfy the predicate ALPHA-CHAR-P.

Expected:

 the character $ (DOLLAR_SIGN)

or the character . (FULLSTOP) or the character (LOW_LINE) or any character satisfying DIGIT-CHAR-P or any character satisfying ALPHA-CHAR-P`

Can I get some help?