Readyset is a MySQL and Postgres wire-compatible caching layer that sits in front of existing databases to speed up queries and horizontally scale read throughput. Under the hood, ReadySet caches the results of cached select statements and incrementally updates these results over time as the underlying data changes.
When testing readyset with benchbase's tpc-h benchmark, several queries contain a join condition, expressed in the where clause that causes a peculiar error when at least one of the columns is wrapped with a function.
Here is the query
create cache from
SELECT c_id,
c_last,
sum(ol_amount) AS revenue,
c_city,
c_phone,
n_name
FROM customer,
oorder,
order_line,
nation
WHERE c_id = o_c_id
AND c_w_id = o_w_id
AND c_d_id = o_d_id
AND ol_w_id = o_w_id
AND ol_d_id = o_d_id
AND ol_o_id = o_id
AND o_entry_d >= '2007-01-02 00:00:00.000000'
AND o_entry_d <= ol_delivery_d
AND n_nationkey = ascii(substring(c_state from 1 for 1))
GROUP BY c_id,
c_last,
c_city,
c_phone,
n_name
ORDER BY revenue DESC;
Here is the error, as seen in the mysql client (carriage returns added by me):
ERROR 1105 (HY000): Error during RPC (extend_recipe (in readyset-client/src/controller/rpc.rs:49:26)):
SQL SELECT query 'q_6565248d6c10a8b' couldn't be added: MIR node '18' couldn't be lowered to dataflow:
Internal error: in readyset-server/src/controller/mir_to_flow.rs:45:5:
invariant failed: columns.len() == names.len();
The problem occurs because of this where predicate (which is a join condition):
AND n_nationkey = ascii(substring(c_state from 1 for 1))
Additional data points:
I tried several different functions (ascii, length, …) and they all exhibited the same error.
I changed the join clause to a naive literal (AND ascii(substring(c_state from 1 for 1)) ='f'), and it successfully migrated.
I created a variant where the two functions are over the same table's columns (where ascii(first_name) = ascii(last_name)), and this successfully migrated as well.
I moved the join condition, with the function, into the from clause, and it failed to migrate.
Thus, i think it is because at least one of the join condition's columns is wrapped in a function.
Steps to reproduce
create table user (id int, dept_name varchar(32));
create table dept (dept_id int, name varchar(32));
create cache from select user.id, dept.dept_id
from user, dept
where ascii(user.dept_name) = dept.name;
ReadySet version
git sha b647d97e3f1a0dfbe47168d0814a67f003066bb6
cli command
cargo --locked run --bin readyset -- \
--upstream-db-url=mysql://root:noria@127.0.0.1:3306/benchbase --feature-post-lookup
Description
When testing readyset with benchbase's tpc-h benchmark, several queries contain a join condition, expressed in the
where
clause that causes a peculiar error when at least one of the columns is wrapped with a function.Here is the query
Here is the error, as seen in the mysql client (carriage returns added by me):
The problem occurs because of this
where
predicate (which is a join condition):Additional data points:
ascii
,length
, …) and they all exhibited the same error.AND ascii(substring(c_state from 1 for 1)) ='f'
), and it successfully migrated.where ascii(first_name) = ascii(last_name)
), and this successfully migrated as well.from
clause, and it failed to migrate.Thus, i think it is because at least one of the join condition's columns is wrapped in a function.
Steps to reproduce
ReadySet version
git sha
b647d97e3f1a0dfbe47168d0814a67f003066bb6
cli command