kuzudb / kuzu

Embeddable property graph database management system built for query speed and scalability. Implements Cypher.
https://kuzudb.com/
MIT License

Feature: Operator Printing #3664

Open andyfengHKU opened 3 months ago

andyfengHKU commented 3 months ago

Description

As we approach the benchmark stage, plan printing is being used much more frequently. We need to make the following changes to plan printing.

Stage 1: Print static information

The first step is to print each operator's static information correctly and thoroughly, e.g. which table is being scanned and which columns are being read ...
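To make "static information" concrete, here is a minimal sketch of what a per-operator print info struct for a scan could look like. The names `OpPrintInfo` and `ScanTablePrintInfo`, and the exact output format, are assumptions for illustration and are not taken from the Kuzu code base.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base type: each operator exposes its static details as text.
struct OpPrintInfo {
    virtual ~OpPrintInfo() = default;
    virtual std::string toString() const = 0;
};

// Hypothetical print info for a table scan: which table, which columns.
struct ScanTablePrintInfo final : OpPrintInfo {
    std::string tableName;
    std::vector<std::string> columns;

    ScanTablePrintInfo(std::string tableName, std::vector<std::string> columns)
        : tableName(std::move(tableName)), columns(std::move(columns)) {
        assert(!this->tableName.empty()); // a scan always targets some table
    }

    // Renders e.g. "Table: person, Columns: [id, name]".
    std::string toString() const override {
        std::string out = "Table: " + tableName + ", Columns: [";
        for (std::size_t i = 0; i < columns.size(); ++i) {
            if (i > 0) {
                out += ", ";
            }
            out += columns[i];
        }
        return out + "]";
    }
};
```

Under these assumptions, `ScanTablePrintInfo("person", {"id", "name"}).toString()` yields `Table: person, Columns: [id, name]`. Keeping the data in typed fields rather than a pre-built string leaves room for other renderers to reuse it later.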

A complete list is as follows:

Stage 2: Rendering

Rendering a plan in the shell is tricky once the plan becomes big, and rendering only in the shell is not sufficient anyway. We need a mechanism to render big plans on the web and in the explorer too. I don't have a concrete roadmap for stage 2, so @mewim should edit this part. One thing I'm fairly certain of is that we first need to print the plan in JSON format.
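As a rough illustration of the JSON step, the sketch below serializes a plan tree into a nested JSON object that a web front end could consume. The `PlanNode` type and the field names `name`, `info`, and `children` are assumptions made for this example only; the actual schema is still open.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical plan node: an operator name, one line of static info, children.
struct PlanNode {
    std::string name;
    std::string info;
    std::vector<std::unique_ptr<PlanNode>> children;
};

// Escape the characters that must be escaped inside a JSON string literal.
std::string escapeJson(const std::string& s) {
    std::string out;
    for (unsigned char c : s) {
        switch (c) {
        case '"':  out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\t': out += "\\t"; break;
        default:
            if (c < 0x20) { // remaining control characters use \u00XX
                char buf[7];
                std::snprintf(buf, sizeof(buf), "\\u%04x", c);
                out += buf;
            } else {
                out += static_cast<char>(c);
            }
        }
    }
    return out;
}

// Recursively emit {"name":...,"info":...,"children":[...]} for the subtree.
std::string toJson(const PlanNode& node) {
    std::string out = "{\"name\":\"" + escapeJson(node.name) + "\",\"info\":\"" +
                      escapeJson(node.info) + "\",\"children\":[";
    for (std::size_t i = 0; i < node.children.size(); ++i) {
        assert(node.children[i] != nullptr); // a plan tree has no null children
        if (i > 0) {
            out += ",";
        }
        out += toJson(*node.children[i]);
    }
    return out + "]}";
}
```

A hand-written emitter keeps the sketch dependency-free; in practice a JSON library would likely be the better choice, since it also handles pretty-printing and non-ASCII text.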

Stage 3: Print logical plan

Since we already print the physical plan, nothing prevents us from printing the logical plan as well.

We want the logical plan to be printed with a Cypher command, Explain Logical. It will be processed and printed much like the physical plan in the existing plan printer, so we can adapt that code to print logical plans as well. The logical operators will need print info structs to hold the information listed below.

Also, since both physical and logical operators will have their own print info structs, we no longer need getExpressionsForPrinting().
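One way to picture the shared printing path is a single printer that only relies on an operator's name, its print info struct, and its children, so the same code serves both plan kinds. Everything below (`OpPrintInfo`, `LogicalFilterPrintInfo`, the `Operator` template, and `printPlan`) is a hypothetical sketch under that assumption, not the existing Kuzu plan printer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface shared by logical and physical print info structs.
struct OpPrintInfo {
    virtual ~OpPrintInfo() = default;
    virtual std::string toString() const = 0;
};

// Hypothetical print info for a logical filter: just the predicate text.
struct LogicalFilterPrintInfo final : OpPrintInfo {
    std::string predicate;
    explicit LogicalFilterPrintInfo(std::string predicate)
        : predicate(std::move(predicate)) {}
    std::string toString() const override { return "Predicate: " + predicate; }
};

// Minimal stand-in for either operator hierarchy; Tag keeps the types distinct.
template <typename Tag>
struct Operator {
    std::string name;
    std::unique_ptr<OpPrintInfo> printInfo;
    std::vector<std::unique_ptr<Operator>> children;
};
struct LogicalTag {};
struct PhysicalTag {};
using LogicalOperator = Operator<LogicalTag>;
using PhysicalOperator = Operator<PhysicalTag>;

// One printer for both plans: it reads only name, printInfo, and children,
// so no operator-specific hook for collecting expressions is required.
template <typename Op>
std::string printPlan(const Op& op, std::size_t depth = 0) {
    assert(op.printInfo != nullptr); // every operator must supply print info
    std::string out(depth * 2, ' ');
    out += op.name + " [" + op.printInfo->toString() + "]\n";
    for (const auto& child : op.children) {
        out += printPlan(*child, depth + 1);
    }
    return out;
}
```

Because the printer asks each operator for its own print info, the text describing an operator lives next to the operator rather than in the printer.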

A complete list for logical plan printing is as follows:

Stage 4: Print statistics and cardinality

Stage 5: Advanced statistics printing

I haven't decided whether we should go this far, but printing disk IO for scan operators makes sense to me.

ray6080 commented 1 week ago

Not sure when we can get to stage 2 and visualize the plan in a web page. As an alternative, we can provide a more succinct way of printing the plan. One example is what Postgres does here; it should work better in more cases, though readability decreases a lot.
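For reference, a succinct text format in that spirit can be sketched as one line per operator, with children indented under their parent and marked by an arrow. The code below is only loosely modeled on the look of Postgres EXPLAIN text output; `PlanNode`, `renderCompact`, and the exact spacing are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical plan node: a one-line label plus child operators.
struct PlanNode {
    std::string label;
    std::vector<PlanNode> children;
};

// Render one line per operator. The root is printed flush left; every other
// operator is prefixed with "->" and indented by six more spaces per level.
std::string renderCompact(const PlanNode& node, std::size_t depth = 0) {
    assert(!node.label.empty()); // every operator needs something to print
    std::string out;
    if (depth > 0) {
        out.append((depth - 1) * 6 + 2, ' ');
        out += "->  ";
    }
    out += node.label + "\n";
    for (const PlanNode& child : node.children) {
        out += renderCompact(child, depth + 1);
    }
    return out;
}
```

For a join over two scans, with the second scan feeding a hash build, this produces output of the following shape:

```
HASH_JOIN
  ->  SCAN a
  ->  HASH
        ->  SCAN b
```

The output grows by one line per operator regardless of plan width, which is why it scales to large plans better than a boxed tree, at the cost of being harder to read at a glance.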