apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Blog post: How DataFusion became the fastest engine for querying parquet (according to Clickbench) #13436

Closed: alamb closed this issue 17 hours ago

alamb commented 6 days ago

Is your feature request related to a problem or challenge?

@pmcgleenon ran ClickBench, and the results show that DataFusion is super fast at querying Parquet files (faster than DuckDB and clickbench/chDB on the same hardware). See

This was possible due to the hard work of many, many people, as partly discussed on

Describe the solution you'd like

I want to tell the world about this -- not only to highlight how great DataFusion is (which it is!) but also to highlight all the work and people required for such a thing.

Describe alternatives you've considered

No response

Additional context

No response

alamb commented 1 day ago

Draft is up: https://github.com/apache/datafusion-site/pull/33