duckdb / pg_duckdb

DuckDB-powered Postgres for high performance apps & analytics.

Copy ... FROM remote S3 storage #30

Open mkaruza opened 4 months ago

mkaruza commented 4 months ago

We should enable the COPY command to write to remote S3 storage.

Copying TO remote storage should be possible by passing the query directly to DuckDB for execution, while the other direction (FROM) will need more complex logic.
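
A minimal sketch of the TO direction, assuming pg_duckdb hands a COPY statement with an `s3://` target to DuckDB unchanged. The table `orders`, its columns, and the bucket name are hypothetical, and S3 credentials are assumed to be configured already.

```sql
-- Hypothetical table and bucket; assumes S3 credentials are already configured.
-- COPY (query) TO 'filename' is standard PostgreSQL syntax; the assumption here
-- is that the s3:// target causes the statement to be executed by DuckDB.
COPY (SELECT id, total FROM orders WHERE total > 100)
TO 's3://my-bucket/exports/orders.csv';
```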

mkaruza commented 4 months ago

COPY .. TO functionality done with PR #32.

Mytherin commented 4 months ago

COPY .. FROM is more challenging - we will pick this up in the future

wuputah commented 3 months ago

There are a number of related ideas, all of which are a form of "read from DuckDB, write to Postgres", e.g.:

  1. INSERT INTO table SELECT /* read executed by duckdb */
  2. CREATE TABLE AS SELECT /* read executed by duckdb */
  3. CREATE MATERIALIZED VIEW matview AS SELECT /* executed by duckdb, even on refresh */

COPY ... FROM (query) does not exist in the PostgreSQL syntax. However, we could probably support a filename of 's3://...' to implicitly do what the user wants, but the functionality here is somewhat limited.
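
As a rough sketch of the first two ideas, assuming a DuckDB-backed table function such as `read_parquet` is available for the read side. Table names, column definitions, and the bucket path are hypothetical, and the exact call syntax for `read_parquet` may differ between pg_duckdb versions. The last statement shows the implicit `s3://` filename form, which is only a proposal in this thread and not existing behaviour.

```sql
-- Idea 1: INSERT INTO ... SELECT, where DuckDB performs the read
-- and Postgres performs the write. Table and columns are hypothetical.
INSERT INTO events (id, payload)
SELECT id, payload
FROM read_parquet('s3://my-bucket/events.parquet') AS (id int, payload text);

-- Idea 2: CREATE TABLE AS SELECT with the same DuckDB-executed read.
CREATE TABLE events_copy AS
SELECT id, payload
FROM read_parquet('s3://my-bucket/events.parquet') AS (id int, payload text);

-- Proposed only: a COPY ... FROM whose filename starts with s3://
-- would be routed to DuckDB implicitly. This is not confirmed to work.
COPY events FROM 's3://my-bucket/events.csv';
```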

Is there a particular version of this that might be easiest or best to tackle first? I suspect this issue should probably be replaced by item 1 above.