apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Graphviz PhysicalPlan #219

Open tustvold opened 3 years ago

tustvold commented 3 years ago

The ability to get a Graphviz render of the LogicalPlans is awesome; it would be even more awesome to have a similar facility for PhysicalPlans.

The way I'd envisage this working is with a new method added to the ExecutionPlan trait:

fn fmt_for_explain(&self, f: &mut fmt::Formatter) -> fmt::Result;

The signature is copied wholesale from UserDefinedLogicalNode and would function in much the same way. The method would have a default placeholder implementation, so adding it would not be a breaking change.

A public free function would then be added that visits each node of a provided ExecutionPlan tree in turn and uses fmt_for_explain, along with the existing schema function, to produce a graphviz representation. (This could also be a trait function if preferred.)
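To make the proposal concrete, here is a minimal sketch of the two pieces described above. It is not DataFusion code: the `ExecutionPlan` trait below is a simplified stand-in with only a `children` method, and `physical_plan_to_dot`, `visit`, `Explain`, `LimitNode`, `LegacyNode` and `example_plan` are hypothetical names invented for illustration. The schema output mentioned above is left out to keep the example short.

```rust
use std::fmt::{self, Write as _};

/// Simplified stand-in for DataFusion's `ExecutionPlan` trait (assumption:
/// the real trait has many more methods; only what the sketch needs is here).
trait ExecutionPlan {
    /// Child operators of this node.
    fn children(&self) -> Vec<&dyn ExecutionPlan>;

    /// Proposed method: a short, one-line description of this node.
    /// The default body is a placeholder, so existing implementors
    /// keep compiling without changes.
    fn fmt_for_explain(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "ExecutionPlan")
    }
}

/// Adapter so that `fmt_for_explain` can be driven through `Display`.
struct Explain<'a>(&'a dyn ExecutionPlan);

impl fmt::Display for Explain<'_> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        self.0.fmt_for_explain(f)
    }
}

/// Hypothetical free function: walk the plan tree and emit Graphviz DOT.
fn physical_plan_to_dot(plan: &dyn ExecutionPlan) -> String {
    let mut out = String::from("digraph physical_plan {\n");
    let mut next_id = 0;
    visit(plan, None, &mut next_id, &mut out);
    out.push_str("}\n");
    out
}

/// Pre-order traversal: emit the node, the edge from its parent, then recurse.
fn visit(node: &dyn ExecutionPlan, parent: Option<usize>, next_id: &mut usize, out: &mut String) {
    let id = *next_id;
    *next_id += 1;
    // Escape double quotes so the label stays a valid DOT string.
    let label = Explain(node).to_string().replace('"', "\\\"");
    writeln!(out, "  {} [shape=box, label=\"{}\"];", id, label).unwrap();
    if let Some(p) = parent {
        writeln!(out, "  {} -> {};", p, id).unwrap();
    }
    for child in node.children() {
        visit(child, Some(id), next_id, out);
    }
}

/// Example node that overrides the new method.
struct LimitNode {
    n: usize,
    input: Box<dyn ExecutionPlan>,
}

impl ExecutionPlan for LimitNode {
    fn children(&self) -> Vec<&dyn ExecutionPlan> {
        vec![&*self.input]
    }
    fn fmt_for_explain(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "Limit: {}", self.n)
    }
}

/// Example leaf node that relies on the default placeholder.
struct LegacyNode;

impl ExecutionPlan for LegacyNode {
    fn children(&self) -> Vec<&dyn ExecutionPlan> {
        vec![]
    }
}

/// Small two-node plan used to exercise the sketch.
fn example_plan() -> Box<dyn ExecutionPlan> {
    Box::new(LimitNode { n: 10, input: Box::new(LegacyNode) })
}
```

Calling `physical_plan_to_dot(example_plan().as_ref())` yields a `digraph` with one box per operator and one edge per parent/child pair. A node that has not been updated shows up with the placeholder label, which is the point of the default implementation.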

FYI @alamb

alamb commented 3 years ago

This might also relate to https://github.com/apache/arrow-datafusion/issues/96 (adding graphviz output to explain plan somehow).

It would probably also be nice to improve the default physical plan explain output so that it doesn't try to print out all the low-level details.

For example, the output of explain verbose today is both verbose and hard to read:

Query execution complete in 700.985811ms
844910ece80be8bc_apps> explain verbose select redis_version, used_memory_peak from redis limit 10;
+-----------------------------------------+----------------------------------------------------------------------------------------------------------------------------+
| plan_type                               | plan                                                                                                                       |
+-----------------------------------------+----------------------------------------------------------------------------------------------------------------------------+
| logical_plan                            | Limit: 10                                                                                                                  |
|                                         |   Projection: #redis_version, #used_memory_peak                                                                            |
|                                         |     TableScan: redis projection=None                                                                                       |
| logical_plan after projection_push_down | Limit: 10                                                                                                                  |
|                                         |   Projection: #redis_version, #used_memory_peak                                                                            |
|                                         |     TableScan: redis projection=Some([67, 99])                                                                             |
| logical_plan after projection_push_down | Limit: 10                                                                                                                  |
|                                         |   Projection: #redis_version, #used_memory_peak                                                                            |
|                                         |     TableScan: redis projection=Some([67, 99])                                                                             |
| physical_plan                           | GlobalLimitExec {                                                                                                          |
|                                         |     input: LocalLimitExec {                                                                                                |
|                                         |         input: ProjectionExec {                                                                                            |
|                                         |             expr: [                                                                                                        |
|                                         |                 (                                                                                                          |
|                                         |                     Column {                                                                                               |

andygrove commented 1 year ago

Ballista produces DOT from physical plans already, and much of this code could be repurposed in DataFusion.

https://github.com/apache/arrow-ballista/blob/master/ballista/rust/scheduler/src/state/execution_graph_dot.rs