apache / datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine
https://datafusion.apache.org/ballista
Apache License 2.0

Only decode plan in `LaunchMultiTaskParams` once #743

Closed. Dandandan closed this 1 year ago

Dandandan commented 1 year ago

Which issue does this PR close?

Closes #742

Rationale for this change

When many tasks are started on a single executor, the execution plan is decoded once for every task/partition. For larger plans this can take a significant amount of time, which delays execution. It also consumes a significant amount of memory on executors that start many tasks for the same query stage at the same time.

What changes are included in this PR?

Decode the plan only once per `LaunchMultiTaskParams` and share the resulting `Arc<dyn ExecutionPlan>` across the tasks.
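
For illustration only, the idea can be sketched roughly as below. This is not the code in this PR: `ExecutionPlan` here is a local stand-in trait rather than DataFusion's, and `PlanDecoder`, `DummyPlan`, `Task`, and `launch_multi_task` are hypothetical names made up for the sketch. The decoder counts its calls so the "decode once" behaviour is visible.

```rust
use std::sync::Arc;

/// Stand-in for DataFusion's `ExecutionPlan` trait (assumption: the real
/// trait has many more methods; only a name is needed for this sketch).
pub trait ExecutionPlan: Send + Sync {
    fn name(&self) -> String;
}

/// Hypothetical decoded plan that just remembers the encoded size.
struct DummyPlan {
    encoded_len: usize,
}

impl ExecutionPlan for DummyPlan {
    fn name(&self) -> String {
        format!("DummyPlan({} bytes)", self.encoded_len)
    }
}

/// Hypothetical decoder that records how often it was invoked.
#[derive(Default)]
pub struct PlanDecoder {
    pub calls: usize,
}

impl PlanDecoder {
    /// Pretend to decode a serialized plan. In a real executor this is the
    /// expensive step that should not be repeated per task.
    pub fn decode(&mut self, bytes: &[u8]) -> Arc<dyn ExecutionPlan> {
        self.calls += 1;
        Arc::new(DummyPlan { encoded_len: bytes.len() })
    }
}

/// Hypothetical per-partition task holding a shared handle to the plan.
pub struct Task {
    pub partition: usize,
    pub plan: Arc<dyn ExecutionPlan>,
}

/// Decode the plan once, then hand every task a cheap `Arc` clone
/// (a reference-count bump) instead of decoding again per partition.
pub fn launch_multi_task(
    decoder: &mut PlanDecoder,
    encoded_plan: &[u8],
    partitions: &[usize],
) -> Vec<Task> {
    let plan = decoder.decode(encoded_plan);
    partitions
        .iter()
        .map(|&partition| Task {
            partition,
            plan: Arc::clone(&plan),
        })
        .collect()
}
```

With this shape, launching N tasks for the same stage costs one decode plus N reference-count increments, and all tasks point at the same plan allocation.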

Are there any user-facing changes?