Open msirek opened 2 years ago
We should check how common it is for real world workloads to not use prepared statements, if our telemetry can tell us that. AFAIK many ORMs perform prepared statements automatically, as it's the best way to prevent against SQL injection.
Is your feature request related to a problem? Please describe. Significant optimization time is taken to run simple non-parameterized primary key SELECT statements. A simple experiment running 10,000 PK lookups with PREPARE and EXECUTE vs. equivalent non-parameterized SELECTs shows approximately 25% less CPU time and 45% less wall-clock time (as measured via the time command):
There is a lot of variability in the results from run to run, but parameterized queries are consistently faster.
Describe the solution you'd like When a non-parameterized primary key lookup query is planned, create a parameterized version of it and place it in the statement cache. On future submissions of PK lookup queries, check if the AST is in the form of a simple PK SELECT, and if so, check if an equivalent parameterized query has been cached. If found in cache, run the statement as an EXECUTE of the cached prepared statement.
Describe alternatives you've considered Build a new memo caching mechanism. If the AST is a simple PK SELECT, build a fingerprint of the query involving the table ID, columns selected, and predicate columns. Cache a pruned memo for the query which only includes a PK-constrained
ScanExpr
using a new statement caching mechanism that matches on query signature ignoring constant values in the query predicates. Provide an efficient mechanism to share most of the memo between requests, except for the scanConstraint
which is efficiently rebuilt for each matched request.Additional context In the follow profile of non-parameterized primary key lookups, the goal is to eliminate a lot of the overhead in
execbuilder.(*Builder).Build
,optbuilder.(*Builder).BuildWhere
,xform.(*explorer).exploreSelect
, andxform.(*Optimizer).optimizeGroupMember
, plus a few other paths. We may not be able to avoid the overhead ofoptbuilder.(*Builder).resolveDataSource
because we need to know if the table has changed since the query was optimized.Jira issue: CRDB-21514