Closed scbizu closed 2 months ago
Hey @scbizu! I'm here to help you with any bugs, questions, or contributions you have for databend. Let's work together to solve this issue!
I found a similar closed issue: Support Output MySQL format, which was last updated on May 30, 2024. This issue was referenced by another issue: tracing issues about OutputFormat & result compare of tests, also last updated on May 30, 2024. The suggested solution was to refactor DFQueryResultWriter
into output MySQL format like Tsv/Csv/Parquet
[1].
Regarding the specific behavior of Databend when using LIMIT 0
, Databend performs a full table data scan instead of only extracting metadata. This is because its query execution plan includes a TableScan
operation that reads rows from the table, even with a LIMIT
clause. This behavior is different from MySQL, which can optimize such queries to only extract metadata without scanning the table data [2][3][4].
Additionally, the source code for the Limit
physical plan in Databend shows that the Limit
plan is built on top of the input plan, which includes the TableScan
operation. The Limit
plan does not inherently skip the data scan but rather applies the limit after the data has been read. This design choice ensures that the query execution adheres to the specified limits but does not avoid the initial data scan [4].
This behavior might impact performance, especially for client-drivers and SQL frameworks that use this feature to check SQL validation and extract metadata.
To continue talking to Dosu, mention @dosu.
Maybe you need ignore result?
SELECT number FROM numbers(2) IGNORE_RESULT;
https://docs.databend.com/sql/sql-commands/query-syntax/query-select#ignore_result
This is a optimizer issue? cc @xudong963
Maybe you need ignore result?
SELECT number FROM numbers(2) IGNORE_RESULT;
https://docs.databend.com/sql/sql-commands/query-syntax/query-select#ignore_result
No, this still scan full tables but ignore the result.
Join is blocking operator, so limit return early will not be reached.
limit 0
should be optimized it's input to be empty result scan.
Join is blocking operator, so limit return early will not be reached.
limit 0
should be optimized it's input to be empty result scan.
Yes.
Search before asking
Version
v1.2.629-nightly
What's Wrong?
As MySQL documented:
In databend , when set LIMIT 0 (or LIMIT 0,0) , the EXPLAIN (I paste it in Slack) said databend still performs full table data scan instead of only extracting the metadata . Some of client-drivers and SQL frameworks use this feature to check SQL validation and do extract the metadata to the ORM-based structure . With too much data scanning , the performance will step down in this case.
How to Reproduce?
Are you willing to submit PR?