cmu-db / optd

CMU-DB's Cascades optimizer framework
https://cmu-db.github.io/optd/
MIT License
327 stars 19 forks

Decouple datafusion's logical optimization/conversion #45

Open jurplel opened 5 months ago

jurplel commented 5 months ago

Remove:

        let batches = df.collect().await?;

from datafusion-optd-cli/src/exec.rs

because `df.collect()` internally runs datafusion's logical optimizer. Instead, we should run our own optimizer, convert the optd plan into a datafusion physical plan (`impl ExecutionPlan`), and then call the other collect function on that plan. This will let us run our end-to-end optimized query on datafusion.
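To make the proposed control flow concrete, here is a toy model in plain Rust with no datafusion or optd dependency. Every type and function in it (`Session`, `ValuesExec`, `collect_dataframe`, `collect_plan`, `optimize_with_external_optimizer`) is a hypothetical stand-in, not a real API. It assumes "the other collect function" means a plan-level collect such as `datafusion::physical_plan::collect`, which takes an `Arc<dyn ExecutionPlan>` and a task context and only executes the plan it is given.

```rust
// Toy model only: none of these types are datafusion's or optd's real APIs.
use std::cell::Cell;

/// Stand-in for a physical plan (datafusion's `ExecutionPlan` in the real code).
trait ExecutionPlan {
    fn execute(&self) -> Vec<i64>;
}

/// A trivial physical operator that returns a fixed set of rows.
struct ValuesExec {
    rows: Vec<i64>,
}

impl ExecutionPlan for ValuesExec {
    fn execute(&self) -> Vec<i64> {
        self.rows.clone()
    }
}

/// Stand-in for a session that owns a built-in logical optimizer.
struct Session {
    builtin_optimizer_runs: Cell<u32>,
}

impl Session {
    fn new() -> Self {
        Session { builtin_optimizer_runs: Cell::new(0) }
    }

    /// Mirrors the shape of `df.collect()`: optimize, plan, then execute.
    fn collect_dataframe(&self, rows: Vec<i64>) -> Vec<i64> {
        // The built-in optimizer always runs on this path.
        self.builtin_optimizer_runs.set(self.builtin_optimizer_runs.get() + 1);
        let plan = ValuesExec { rows };
        plan.execute()
    }

    /// Mirrors the shape of a plan-level collect: execute only, no optimizer.
    fn collect_plan(&self, plan: &dyn ExecutionPlan) -> Vec<i64> {
        plan.execute()
    }
}

/// Hypothetical stand-in for "run optd, then convert its plan to a physical plan".
fn optimize_with_external_optimizer(rows: Vec<i64>) -> Box<dyn ExecutionPlan> {
    Box::new(ValuesExec { rows })
}

fn main() {
    let session = Session::new();
    let plan = optimize_with_external_optimizer(vec![1, 2, 3]);
    // Executing the already-built plan never touches the built-in optimizer.
    let rows = session.collect_plan(&*plan);
    println!(
        "rows = {:?}, built-in optimizer runs = {}",
        rows,
        session.builtin_optimizer_runs.get()
    );
}
```

The point of the sketch is that the plan-level path never touches the built-in optimizer, so whatever plan the external optimizer produced is executed as-is.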

yliang412 commented 5 months ago

I'm thinking we should wait on this PR: https://github.com/apache/arrow-datafusion/pull/9063. With it, we can place our optimizer in the session state instead.

yliang412 commented 5 months ago