trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.35k stars 2.98k forks source link

Support PostgreSQL insert into ON CONFLICT #7897

Open daniele-arosio opened 3 years ago

daniele-arosio commented 3 years ago

Why

To run UPSERT from any catalog to a postgres database. I think it could be a disruptive feature: it would be possible to use Trino to run ETL from data lake or any catalog to postgres, without needs of any ORM.

hashhar commented 3 years ago

If I understand correctly this should be possible to implement in connectors after engine adds supports for MERGE. @djsstarburst ?

See https://github.com/trinodb/trino/pull/7386

djsstarburst commented 3 years ago

If I understand correctly this should be possible to implement in connectors after engine adds supports for MERGE. @djsstarburst ?

I haven't heard of any connection between SQL MERGE and support for INSERT ON CONFLICT in Postgresql.

BTW, #7386 is will be closed, soon to be superseded by an PR with the proper structure of statement analysis and query planning.

hashhar commented 3 years ago

@djsstarburst Thanks for the information. Reading more about MERGE it seems like it's a very broad and more powerful superset of the very limited UPSERT facility (INSERT ... ON CONFLICT) in PostgreSQL.

I was thinking more about pushdown of MERGE into connectors - which is impossible with PostgreSQL because Postgres can't perform everything that MERGE allows. So we can ignore "implement in connectors" part.

However once MERGE gets implemented in engine can't it be used for insert a row if not found, otherwise update?

MERGE INTO dest t USING staging s ON (t.id = s.id)
  WHEN MATCHED THEN
    UPDATE SET value = s.value
  WHEN NOT MATCHED THEN
    INSERT (id, value) VALUES (s.id, s.value);
findepi commented 3 years ago

@hashhar you're right. If MERGE gets somehow supported in PostgreSQL connector, it can be used to the same effect as "PostgreSQL insert into ON CONFLICT ...".