Readyset is a MySQL and Postgres wire-compatible caching layer that sits in front of existing databases to speed up queries and horizontally scale read throughput. Under the hood, ReadySet caches the results of cached select statements and incrementally updates these results over time as the underlying data changes.
Sport this while testing perf of hash joins for straddle joins. If we have a cache and then restart readyset, when the migration is happening and we keep running the same query, it returns wrong results.
for f in $(seq 1 20000)
do
mysql -e "INSERT INTO tb1 VALUES (${f}, '10', now())" test
mysql -e "INSERT INTO tb2 VALUES (10, ${f}, '${f}', now())" test
mysql -e "INSERT INTO tb2 VALUES (10, ${f}, '${f}', now())" test
done
Readyset (requires you to start with --enable-experimental-straddled-joins):
CREATE CACHE FROM SELECT COUNT(tb1.ID) FROM tb1 JOIN tb2 ON tb2.id1 = tb1.ID WHERE tb1.name = '10' AND tb2.ID = 10;
SELECT COUNT(tb1.ID) FROM tb1 JOIN tb2 ON tb2.id1 = tb1.ID WHERE tb1.name = '10' AND tb2.ID = 10;
mysql> SELECT COUNT(tb1.ID) FROM tb1 JOIN tb2 ON tb2.id1 = tb1.ID WHERE tb1.name = '10' AND tb2.ID = 10;
+--------------------------+
| count(`test`.`tb1`.`ID`) |
+--------------------------+
| 20000 |
+--------------------------+
1 row in set (0,05 sec)
Restart RS and execute the query multiple times:
Note: Result with column name COUNT(tb1.ID) are from upstream, results with `count(`test`.`tb1`.`ID`)` are from readyset
mysql> SELECT COUNT(tb1.ID) FROM tb1 JOIN tb2 ON tb2.id1 = tb1.ID WHERE tb1.name = '10' AND tb2.ID = 10;
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id: 8
Current database: *** NONE ***
+---------------+
| COUNT(tb1.ID) |
+---------------+
| 20000 |
+---------------+
1 row in set (0,03 sec)
mysql>
mysql>
mysql> SELECT COUNT(tb1.ID) FROM tb1 JOIN tb2 ON tb2.id1 = tb1.ID WHERE tb1.name = '10' AND tb2.ID = 10;
+---------------+
| COUNT(tb1.ID) |
+---------------+
| 20000 |
+---------------+
1 row in set (0,02 sec)
mysql> SELECT COUNT(tb1.ID) FROM tb1 JOIN tb2 ON tb2.id1 = tb1.ID WHERE tb1.name = '10' AND tb2.ID = 10;
+---------------+
| COUNT(tb1.ID) |
+---------------+
| 20000 |
+---------------+
1 row in set (0,02 sec)
mysql> SELECT COUNT(tb1.ID) FROM tb1 JOIN tb2 ON tb2.id1 = tb1.ID WHERE tb1.name = '10' AND tb2.ID = 10;
+---------------+
| COUNT(tb1.ID) |
+---------------+
| 20000 |
+---------------+
1 row in set (0,02 sec)
mysql> SELECT COUNT(tb1.ID) FROM tb1 JOIN tb2 ON tb2.id1 = tb1.ID WHERE tb1.name = '10' AND tb2.ID = 10;
+--------------------------+
| count(`test`.`tb1`.`ID`) |
+--------------------------+
| 0 |
+--------------------------+
1 row in set (3,88 sec)
Description
Sport this while testing perf of hash joins for straddle joins. If we have a cache and then restart readyset, when the migration is happening and we keep running the same query, it returns wrong results.
Readyset (requires you to start with
--enable-experimental-straddled-joins
):Restart RS and execute the query multiple times:
Note: Result with column name
COUNT(tb1.ID)
are from upstream, results with `count(`test`.`tb1`.`ID`)` are from readysetChange in user-visible behavior
Requires documentation change