daoleno closed this issue 5 months ago
Thanks for the report! It seems like your connection is very slow or the rows are very large, given that fetching just 10 rows takes 650 ms in Postgres. DuckDB does not operate on 10 rows at a time; instead it fetches a lot more data up-front (generally on the order of tens of thousands of rows). Even when executing a `LIMIT 10`, the limit is not pushed into the Postgres scan, and a few thousand rows are loaded from Postgres. Similarly, running `ORDER BY`/`LIMIT` in DuckDB will not use the index that exists in Postgres; instead the data is loaded into DuckDB and the `ORDER BY`/`LIMIT` runs there.
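To illustrate the behavior described above (a sketch: the connection name `db`, the table `blocks`, and the column `block_number` are placeholders, assuming a Postgres database is already attached):

```sql
-- The LIMIT below is applied by DuckDB after rows have been fetched from
-- Postgres in large batches; it is not pushed into the Postgres scan.
select * from db.blocks limit 10;

-- Likewise, this sort runs inside DuckDB and cannot use a Postgres index,
-- so the full table is transferred and sorted locally.
select * from db.blocks order by block_number limit 10;
```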
If you want to execute a query with a `LIMIT` or an `ORDER BY`/`LIMIT` in Postgres, you can use the `postgres_query` function to run a query directly in Postgres and fetch the result, e.g.:

```sql
select * from postgres_query('db', 'select * from blocks limit 10');
```
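The same approach should also help with the indexed `ORDER BY` case, since the inner query runs entirely in Postgres where its indexes are available (a sketch: the column name `block_number` is a placeholder for whatever indexed field applies):

```sql
-- Both the ORDER BY and the LIMIT execute inside Postgres, so its index
-- on block_number can be used; only the 10 result rows are transferred.
select * from postgres_query('db',
    'select * from blocks order by block_number desc limit 10');
```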
What happens?
I have a table that has 2M+ rows. It takes ~600 ms to execute the following SQL on a remote machine.
Query Plan
But it takes ~1 min to run the same query in DuckDB attach mode. And if I query with `ORDER BY` on a field that has an index in PostgreSQL, DuckDB almost cannot retrieve results at all (10 min+).

To Reproduce

Following the doc at https://duckdb.org/docs/archive/0.9.2/extensions/postgres, execute the same SQL query on a large table using both duckdb and psql.
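A minimal reproduction sketch along the lines of that doc (the connection string, table, and column names are placeholders, not taken from the report):

```sql
-- In the DuckDB CLI, attach the remote Postgres database:
ATTACH 'dbname=mydb host=remote-host user=me' AS db (TYPE postgres);

-- Enable timing in the DuckDB shell:
.timer on

-- Compare against running the same statement in psql (with \timing on):
select * from db.blocks order by block_number limit 10;
```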
OS:
MacBook Pro (M2)
PostgreSQL Version:
14.2
DuckDB Version:
0.9.2
DuckDB Client:
duckdb executable binary
Full Name:
daoleno
Affiliation:
daoleno
Have you tried this on the latest `main` branch? Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?