duckdb / duckdb_delta

DuckDB extension for Delta Lake
MIT License

TPCH Query 9 wrong Query Plan #1

Closed djouallah closed 2 weeks ago

djouallah commented 4 months ago

It seems duckdb delta (0.10.3.dev1012, as I could not install the latest nightly build) is generating a less optimized plan for TPCH Query 9.

See this example: https://colab.research.google.com/drive/1_azbqz7v7_VVfpk4qPhcIRtvxNTquXfL#scrollTo=mAsEG6aOqTuD

samansmink commented 4 months ago

Hi @djouallah, thanks for reporting! Neither column statistics nor cardinality estimation is implemented for delta yet, so implementing those should probably resolve this issue. If not, I will take a deeper look at this.

djouallah commented 4 months ago

Is this the right way to install delta?

duckdb.sql(" SET custom_extension_repository = 'http://nightly-extensions.duckdb.org'; ")
duckdb.sql(" install delta")
duckdb.sql(" load delta")
samansmink commented 4 months ago

More docs are on the way, but this will work now:

INSTALL delta

To overwrite an already installed delta extension:

FORCE INSTALL delta

To install the nightly build, there's an alias now:

INSTALL delta FROM core_nightly

However, note that the delta extension is now autoloadable, so just typing:

from delta_scan("....")

should also work
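
For anyone scripting this from Python, as in the question above, here is a minimal sketch that assembles the statements mentioned in this comment. The helper names `delta_install_statements` and `run_statements` are hypothetical and not part of DuckDB; the only assumption about the connection is that it exposes an `execute(sql)` method, as a connection from `duckdb.connect()` does.

```python
def delta_install_statements(nightly=False, force=False):
    """Build the INSTALL/LOAD statements for the delta extension.

    Hypothetical helper: it only assembles the SQL strings quoted in this
    thread and does not talk to DuckDB itself.
    """
    install = "INSTALL delta"
    if force:
        # FORCE overwrites an already installed copy of the extension.
        install = "FORCE " + install
    if nightly:
        # core_nightly is the repository alias mentioned above.
        install += " FROM core_nightly"
    # LOAD is optional when autoloading applies; it is kept here so the
    # sequence is explicit.
    return [install, "LOAD delta"]


def run_statements(con, statements):
    """Execute each statement on any object that has an execute(sql) method."""
    for stmt in statements:
        con.execute(stmt)


# Print the statements that would be sent for a forced nightly install.
for stmt in delta_install_statements(nightly=True, force=True):
    print(stmt)
```

With the `duckdb` package installed, something like `run_statements(duckdb.connect(), delta_install_statements(nightly=True))` should then apply them, after which `delta_scan` can be queried as shown above.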

samansmink commented 4 months ago

Also fyi: https://github.com/duckdb/duckdb_delta/issues/8

djouallah commented 2 weeks ago

It is way better now, thanks.