Closed lyogev closed 2 months ago
I would love to help fix this. Any help on where to start would be highly appreciated.
Hey @lyogev thanks for the report! I've done some initial benchmarking but we haven't done any thorough optimizations yet. There might be some cases where delta could be a little slower than straight up parquet.
Could you share some details on your dataset and the precise query you are running on it?
Hi @samansmink thanks for your help! The query took 41 seconds on delta vs 11 seconds on plain parquet, which is quite a difference. My guess is that it's related to the number of threads used to read the data from remote storage (S3). I got bad performance on parquet as well, but once I increased my duckdb threads to 60 I got great performance (it substantially increased network throughput). So I suspect it's related to that somehow. I looked at the code and it seems to use the same multi-file reader that parquet uses, so it's interesting that the reads are slower. Or perhaps it's something else. Note that even with the delta-rs lib I got better performance, so it must be related to the reader somehow.
I'm basically running query 2 from TPC-DS on top of TPC-DS SF=100 data; one copy is stored in Delta Lake and the other in plain parquet.
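For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a timing harness. It is not the script used above: the helper names `build_queries` and `time_queries` are made up for illustration, the paths are placeholders, and `count(*)` is only a stand-in for the actual TPC-DS query. It assumes you pass in a callable that runs a SQL string on a DuckDB connection with the delta and httpfs extensions loaded.

```python
import time
from typing import Callable, Dict


def build_queries(delta_path: str, parquet_glob: str) -> Dict[str, str]:
    """Build one comparable query per reader (hypothetical helper).

    count(*) is a placeholder; substitute the real query (e.g. TPC-DS Q2).
    """
    return {
        "delta": f"SELECT count(*) FROM delta_scan('{delta_path}')",
        "parquet": f"SELECT count(*) FROM read_parquet('{parquet_glob}')",
    }


def time_queries(
    execute: Callable[[str], object],
    queries: Dict[str, str],
    threads: int = 60,
) -> Dict[str, float]:
    """Set the DuckDB thread count, then time each query in wall-clock seconds."""
    # Same setting for both readers, so only the scan path differs.
    execute(f"SET threads TO {threads}")
    timings: Dict[str, float] = {}
    for name, sql in queries.items():
        start = time.perf_counter()
        execute(sql)
        timings[name] = time.perf_counter() - start
    return timings
```

With the `duckdb` Python package this could be called as `time_queries(lambda sql: con.execute(sql).fetchall(), build_queries(delta_path, parquet_glob))`, where `con = duckdb.connect()`. Fetching the result inside the callable makes sure the whole query is included in the measurement.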
Hi @samansmink, is this related to hive_partitioning? read_parquet has hive_partitioning = true to skip some partitions, while delta_scan has it fixed to false: https://github.com/duckdb/duckdb_delta/blob/e0af7f6a7ba8bc121c00ec61ed442ce44f46a242/src/functions/delta_scan.cpp#L517
@mervynzhang no, that should not be related: the delta extension should not use the hive partitioning mechanism at all. I think this issue may be related to cardinality estimation, which should be fixed in the next delta extension release.
Closing this issue; it should be fixed in nightlies now and with the upcoming DuckDB v1.1.0 release.
Hi, I just started using delta_scan (using the nightly-built extension) and I'm getting bad performance on top of remote files. I believe this is due to the delta kernel not using all my threads. I'm setting my duckdb threads to 60. I'm getting slightly better performance (still not parquet level) when using the delta-rs lib:
Attaching profiles. Delta:
Parquet:
Using delta-rs: