datafuselabs / databend

Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
https://docs.databend.com

feat: add query_hash and query_parameterized_hash to query log #15524

Closed BohuTANG closed 2 weeks ago

BohuTANG commented 2 weeks ago

Summary

To group and analyze similar queries in the query history, record a hash of the query text in the query log.

query_hash

The query_hash is the same for queries that differ only in whitespace or comments. For example:

SELECT * FROM t1 WHERE name = 'jim'
SELECT *  FROM t1 WHERE name  = 'jim'

Both have the same query_hash.
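
A minimal sketch of how such a hash could be computed at the text level, assuming only single-quoted string literals and `--` line comments need special handling. The names `normalize` and `query_hash` (as a function) are hypothetical, and `DefaultHasher` is used only to keep the example dependency-free; it is not stable across Rust versions, so a real query log would need a stable hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: collapse whitespace runs outside single-quoted
/// literals into one space and drop `--` line comments.
fn normalize(sql: &str) -> String {
    let mut out = String::new();
    let mut chars = sql.chars().peekable();
    let mut in_string = false;
    let mut pending_space = false;
    while let Some(c) = chars.next() {
        if in_string {
            // Inside a literal: copy everything verbatim until the closing quote.
            out.push(c);
            if c == '\'' {
                in_string = false;
            }
            continue;
        }
        if c == '-' && chars.peek() == Some(&'-') {
            // Line comment: skip up to (not including) the newline.
            while let Some(&n) = chars.peek() {
                if n == '\n' {
                    break;
                }
                chars.next();
            }
            pending_space = true;
            continue;
        }
        if c.is_whitespace() {
            pending_space = true;
            continue;
        }
        // Emit a single space for any skipped whitespace, except at the start.
        if pending_space && !out.is_empty() {
            out.push(' ');
        }
        pending_space = false;
        if c == '\'' {
            in_string = true;
        }
        out.push(c);
    }
    out
}

/// Hypothetical helper: hash of the normalized query text.
fn query_hash(sql: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    normalize(sql).hash(&mut hasher);
    hasher.finish()
}
```

With this sketch, `query_hash` returns the same value for the two queries above, while whitespace inside a string literal such as `'a  b'` is preserved and still affects the hash.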

query_parameterized_hash

The query_parameterized_hash additionally ignores literals that appear in a comparison predicate using operators such as =, !=, >=, or <=. Queries that differ only in those literals share the same hash:

SELECT * FROM t1 WHERE name = 'data'
SELECT * FROM t1 WHERE name = 'bend'

Ref: https://docs.snowflake.com/en/user-guide/query-hash
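
A rough text-level sketch of the parameterized variant, assuming that literals are either single-quoted strings or unsigned numbers and that only the operators named above (all of which end in `=`) are parameterized. The names `parameterize` and `query_parameterized_hash` (as a function) are hypothetical. A real implementation would more likely rewrite literals on the parsed AST than scan raw text, and would also apply the whitespace and comment normalization used for query_hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: replace a literal that directly follows a
/// comparison operator ending in '=' with the placeholder `?`.
fn parameterize(sql: &str) -> String {
    let chars: Vec<char> = sql.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        out.push(c);
        i += 1;
        if c == '\'' {
            // String literal outside a comparison: copy it verbatim.
            while i < chars.len() {
                let d = chars[i];
                out.push(d);
                i += 1;
                if d == '\'' {
                    break;
                }
            }
            continue;
        }
        if c != '=' {
            continue;
        }
        // Keep the whitespace between the operator and its operand.
        while i < chars.len() && chars[i].is_whitespace() {
            out.push(chars[i]);
            i += 1;
        }
        if i >= chars.len() {
            break;
        }
        if chars[i] == '\'' {
            // Skip the string literal, treating '' as an escaped quote.
            i += 1;
            while i < chars.len() {
                if chars[i] == '\'' {
                    if i + 1 < chars.len() && chars[i + 1] == '\'' {
                        i += 2;
                        continue;
                    }
                    i += 1;
                    break;
                }
                i += 1;
            }
            out.push('?');
        } else if chars[i].is_ascii_digit() {
            // Skip an unsigned integer or decimal literal.
            while i < chars.len() && (chars[i].is_ascii_digit() || chars[i] == '.') {
                i += 1;
            }
            out.push('?');
        }
    }
    out
}

/// Hypothetical helper: hash of the query text with literals parameterized.
fn query_parameterized_hash(sql: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    parameterize(sql).hash(&mut hasher);
    hasher.finish()
}
```

Under these assumptions, both example queries are rewritten to `SELECT * FROM t1 WHERE name = ?` before hashing, so they hash identically, while a query comparing a different column does not.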

BohuTANG commented 2 weeks ago

Can we get the two hashes during the SQL parsing phase? @sundy-li

sundy-li commented 2 weeks ago

see:

let key = gen_result_cache_key(self.formatted_ast.as_ref().unwrap());