readysettech / readyset

ReadySet is a MySQL and Postgres wire-compatible caching layer that sits in front of existing databases to speed up queries and horizontally scale read throughput. Under the hood, ReadySet caches the results of the SELECT statements you choose to cache and incrementally updates these results as the underlying data changes.
https://readyset.io

What to do if the query is out of the cache range? #21

Open xnge opened 2 years ago

xnge commented 2 years ago

What to do if the query is out of the cache range? ReadySet is a very interesting project. As I understand it, ReadySet caches data from the backend database (MySQL, Postgres), but I have two questions:

  1. Does the cache hold all of the data in the database? (I understand that only some data should be cached.)
  2. If a user's query goes beyond the range of cached data, how is it handled? Is the user's SQL forwarded to the backend database (MySQL, Postgres) for processing?

jseldess commented 2 years ago

Thanks for your interest in ReadySet, @xnge, and for posting this question!

Have you seen the SQL Support page in our docs? If not, take a look at the summary at the top. It should help address your questions. But real quick:

Does the cache hold all of the data in the database? (I understand that only some data should be cached.)

ReadySet creates a copy of the database it is connected to, but by default it doesn't automatically cache any data. When you have a query that you want to cache, you use the custom CREATE CACHE command to do that.

If a user's query goes beyond the range of cached data, how is it handled? Is the user's SQL forwarded to the backend database (MySQL, Postgres) for processing?

Yes, that's right. For any queries that aren't cached, ReadySet proxies the query to the upstream database.

alanamarzoev commented 2 years ago

Hey! Adding a bit more color to Jesse's response:

No, ReadySet doesn't cache all of the data in the database by default. When configuring ReadySet during the initial setup process, you can specify which base tables in your primary DB are replicated to ReadySet. ReadySet can then be used to cache queries that rely on those tables. By default, ReadySet replicates all tables. If you run a query against ReadySet that refers to a table that wasn't replicated, it will be proxied to the upstream database.

Once the tables are replicated, you can use the CREATE CACHE command to specify which queries should be cached. You're specifying queries at the query structure level, so you would create a cache for SELECT col FROM table WHERE id = ? rather than for SELECT col FROM table WHERE id = 5. By default, the cache for any new query will be empty (you can think of this as the cache being cold). As you start to issue queries over time with specific parameters filled in (e.g. SELECT col FROM table WHERE id = 5), ReadySet will dynamically populate the results in the cache for the query with that specific parameter. It has its own internal query engine, so it's able to do so without proxying the query to your upstream database.
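
To make the "cold cache filled per parameter" idea concrete, here is a rough Python sketch. It is only a toy model, not ReadySet's implementation: the names to_shape and ToyCache are hypothetical, the regex stands in for real SQL parsing, and a plain list of dicts stands in for the replicated base table.

```python
import re


def to_shape(sql):
    """Split a query into (shape, params) by replacing literals with '?'.

    Toy regex only: handles integer and single-quoted string literals.
    A real system would parse the SQL instead.
    """
    params = []

    def repl(match):
        token = match.group(0)
        # Strip the quotes from strings; convert bare digits to int.
        params.append(token[1:-1] if token.startswith("'") else int(token))
        return "?"

    shape = re.sub(r"'[^']*'|\b\d+\b", repl, sql)
    return shape, tuple(params)


class ToyCache:
    """Hypothetical cache for the shape SELECT col FROM table WHERE id = ?"""

    def __init__(self, replicated_rows):
        self.rows = replicated_rows  # stand-in for the replicated base table
        self.entries = {}            # params -> result; empty means cold
        self.misses = 0

    def lookup(self, params):
        if params not in self.entries:
            # First request for this parameter: compute the result from the
            # local replicated rows, without touching the upstream database.
            self.misses += 1
            (wanted_id,) = params
            self.entries[params] = [
                row["col"] for row in self.rows if row["id"] == wanted_id
            ]
        return self.entries[params]
```

With rows such as [{"id": 5, "col": "a"}], the first lookup((5,)) counts as a miss and fills the entry; repeating it is served from the filled entry. The sketch leaves out the incremental updates that keep cached results fresh as the underlying data changes.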

If, on the other hand, you send ReadySet a query that you haven't yet created a cache for by running the CREATE CACHE command, then ReadySet will proxy it to your database.
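
The routing rules described in this thread can be sketched roughly as follows. Again, this is a toy model under stated assumptions, not ReadySet's code: ToyRouter and its methods are hypothetical names, and the regexes for finding tables and literals are simplistic stand-ins for a SQL parser.

```python
import re


class ToyRouter:
    """Hypothetical model of where a query ends up: 'readyset' or 'upstream'."""

    def __init__(self, replicated_tables):
        self.replicated_tables = set(replicated_tables)
        self.cached_shapes = set()

    @staticmethod
    def shape_of(sql):
        # Replace integer and single-quoted string literals with '?'.
        return re.sub(r"'[^']*'|\b\d+\b", "?", sql.strip().rstrip(";"))

    @staticmethod
    def tables_in(sql):
        # Toy extraction: only looks at the identifier right after FROM.
        return set(re.findall(r"\bFROM\s+(\w+)", sql, flags=re.IGNORECASE))

    def create_cache(self, sql):
        # Stand-in for running CREATE CACHE on a query structure.
        self.cached_shapes.add(self.shape_of(sql))

    def route(self, sql):
        # A query touching a table that was not replicated is proxied.
        if not self.tables_in(sql) <= self.replicated_tables:
            return "upstream"
        # A query whose structure has a cache is answered by ReadySet.
        if self.shape_of(sql) in self.cached_shapes:
            return "readyset"
        # Everything else is proxied to the upstream database.
        return "upstream"
```

For example, after create_cache("SELECT col FROM table WHERE id = ?") on a router whose replicated tables include table, route("SELECT col FROM table WHERE id = 5") returns "readyset", while a query with a different structure, or one against a table that was not replicated, returns "upstream".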

Hope that makes sense! Let me know if you have any questions :)

xnge commented 2 years ago

Hi, thank you for your reply. :-)