duckdb / duckdb_spatial

MIT License
492 stars 41 forks source link

ST_Collect and/or ST_LineMerge not working as expected #334

Closed jjcfrancisco closed 5 months ago

jjcfrancisco commented 5 months ago

I am looking to join a column of linestrings but I am unable. For comparison (and clarity) this is what I do in PostGIS:

CREATE TABLE st_collect_test (
  id SERIAL PRIMARY KEY,
  geom geometry(LINESTRING,4326)
);

INSERT INTO st_collect_test (geom)
VALUES (ST_GeomFromText('LINESTRING (-2.664082 51.5694421, -2.6652521 51.5694737, -2.6661214 51.569470100000004)')
),
(ST_GeomFromText('LINESTRING (-2.6656155 51.5695713, -2.6648094 51.569612500000005, -2.6640812 51.5696317)')
);

SELECT ST_AsText(ST_LineMerge(ST_Collect(geom)))
FROM st_collect_test;

In DuckDB, if I have a similar table and use this SELECT statement

-- skipping table creation for brevity
SELECT ST_AsText(ST_LineMerge(ST_Collect(geom)))
FROM st_collect_test;

...I get this error:

Binder Error: No function matches the given name and argument types 'ST_Collect(GEOMETRY)'. You might need to add explicit type casts.
        Candidate functions:
        ST_Collect(GEOMETRY[]) -> GEOMETRY

LINE 1: SELECT ST_AsText(ST_LineMerge(ST_Collect(geom))) AS all

I tried to wrap the geom in square brackets but I get all the geometries back (no merging happens)

Interestingly, this works:

SELECT ST_AsText(ST_LineMerge(ST_Collect([
  ST_GeomFromText('LINESTRING (-2.664082 51.5694421, -2.6652521 51.5694737, -2.6661214 51.569470100000004)'),
  ST_GeomFromText('LINESTRING (-2.6656155 51.5695713, -2.6648094 51.569612500000005, -2.6640812 51.5696317)')])))

  -- output:
  --- MULTILINESTRING ((-2.664082 51.5694421, -2.6652521 51.5694737, -2.6661214 51.569470100000004), (-2.6656155 51.5695713, -2.6648094 51.569612500000005, -2.6640812 51.5696317))

But I explicitly want to do it from a column in an existing table.

Maxxen commented 5 months ago

Hi! Thanks for raising this issue!

ST_Collect will collect a single list value of geometries into one geometry. Think of it as collecting horizontally. What you want to do is collect vertically over multiple rows. This doesn't work if you just wrap it in square brackets because essentially you are just creating a single element list for each row. What you want to do is first aggregate multiple rows into a single list and then call st_collect on that. You can use the list() aggregate function to do so:

E.g.

SELECT ST_AsText(ST_LineMerge(ST_Collect(list(geom)))) AS all
jjcfrancisco commented 5 months ago

Hi @Maxxen, this is great.

This is a question for another place but I am intrigued to know why DuckDB (as a choice?) differs slightly to Postgres in the way some queries are formed.

Maxxen commented 5 months ago

In this case it's just because writing an aggregate function to collect an unknown amount of geometries is more difficult than if you just get all of them up front in a list. But we could of course hide this from the user by adding a macro that does the st_collect(list(geom)) internally.

jjcfrancisco commented 5 months ago

I would be intrigued to know what the community at large thinks of this. Subjectively, hiding it by adding a macro it'd remove ambiguity for the user because realistically speaking, a lot of people consult PostGIS' documentation or are familiar with their workflow and expect similar behaviour in DuckDB.