Open anayrat opened 7 months ago
Thanks for the report! Parallel imports are not supported right now. It is also rather hard (and perhaps not even possible?) to support parallel imports in a transaction-safe manner. We could perhaps offer a bulk-import option that does not respect transactions in the future.
Thanks for your quick reply! Indeed, a bulk-import option that does not respect transactions would be a good idea. FYI, it seems possible to perform a transaction-consistent parallel COPY thanks to snapshot export:
| Session 1 | Session 2 |
|---|---|
| `BEGIN;` | |
| `SELECT pg_export_snapshot();`<br>→ `00000003-00000016-1` | |
| | `BEGIN ISOLATION LEVEL REPEATABLE READ;` |
| | `SET TRANSACTION SNAPSHOT '00000003-00000016-1';` |
| `copy t1 from 'file.copy';`<br>→ `COPY 10` | `copy t1 from 'file.copy';`<br>→ `COPY 10` |
| `COMMIT;` | `COMMIT;` |
| `select count(*) from votes2;`<br>→ `count = 20` | |
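The two-session recipe above could be driven from client code. The following is only a sketch: the helper names, the table name `t1`, and the file path are assumptions taken from the example, and a real driver would execute these statements over two separate Postgres connections (e.g. with psycopg2) rather than just building them.

```python
# Sketch: build the per-session SQL for snapshot-synchronized parallel COPY,
# following the pg_export_snapshot() recipe shown in the table above.
# Helper names are hypothetical, not part of any library.

def exporter_session_sql(table: str, path: str) -> list[str]:
    """Statements for the session that exports the snapshot (Session 1)."""
    return [
        "BEGIN;",
        "SELECT pg_export_snapshot();",  # returns a snapshot id, e.g. '00000003-00000016-1'
        f"COPY {table} FROM '{path}';",
        "COMMIT;",
    ]

def importer_session_sql(snapshot_id: str, table: str, path: str) -> list[str]:
    """Statements for each additional session that imports the snapshot (Session 2)."""
    return [
        "BEGIN ISOLATION LEVEL REPEATABLE READ;",
        f"SET TRANSACTION SNAPSHOT '{snapshot_id}';",
        f"COPY {table} FROM '{path}';",
        "COMMIT;",
    ]

if __name__ == "__main__":
    for stmt in importer_session_sql("00000003-00000016-1", "t1", "file.copy"):
        print(stmt)
```

Note that `SET TRANSACTION SNAPSHOT` must be the first statement after `BEGIN` in a `REPEATABLE READ` transaction, and the exporting transaction has to stay open until every importer has attached to the snapshot.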
With such a feature, DBT could be a very good tool to load massive amounts of data into Postgres. It could become my new favorite ELT/ETL tool :)
What happens?
Hello, I tried to do a parallel import into Postgres. I wonder if postgres_scanner allows this?
To Reproduce
I tried:

I tried changing the file format (CSV, Parquet), the compression, and splitting the data across multiple files... I also tried first importing into DuckDB and then copying to Postgres. In every case I only see a single COPY process on the Postgres side.
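The kind of client-side parallelism being attempted here might be orchestrated roughly as follows. This is a sketch under stated assumptions: the file names and the `run_copy` worker are hypothetical, and a real worker would open its own Postgres session and issue one COPY per file instead of just counting them.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(files: list[str], workers: int) -> list[list[str]]:
    """Round-robin the input files across worker sessions."""
    chunks: list[list[str]] = [[] for _ in range(workers)]
    for i, f in enumerate(files):
        chunks[i % workers].append(f)
    return chunks

def run_copy(files: list[str]) -> int:
    # Placeholder: a real worker would connect to Postgres and run
    # one COPY per file in its own session; here we just count files.
    return len(files)

if __name__ == "__main__":
    files = [f"part-{i}.csv" for i in range(8)]  # hypothetical input shards
    with ThreadPoolExecutor(max_workers=4) as pool:
        copied = sum(pool.map(run_copy, partition(files, 4)))
    print(copied)  # 8 files handed out across 4 workers
```

Without snapshot coordination (or a non-transactional bulk mode), each worker session commits independently, which is exactly the transaction-safety concern raised above.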
OS:
Linux
PostgreSQL Version:
16
DuckDB Version:
v0.10.0 20b1486d11
DuckDB Client:
CLI
Full Name:
Adrien Nayrat
Affiliation:
Self Employed
Have you tried this on the latest `main` branch?

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?