Closed altmannmarcelo closed 2 days ago
I don't agree. We should give users the ability to select which tables they want to snapshot and to add new tables later, while also allowing them to snapshot everything. With your idea, if I know that I won't use a table that has TBs of data, why should we snapshot it first only to discard it later?
I agree we should avoid snapshotting a table if we'll only then discard it later. I suppose I'd want an easy way to specify the blacklist in advance.
The whole problem seems very similar to the --replication-tables and --replication-tables-ignore arguments, which are all about framing this as either a whitelist or a blacklist.
Ignoring the current particulars of how ReadySet does snapshotting, the blacklisting mentality makes a lot of sense to me (typically presume everything is replicated with a handful of exceptions, which are blacklisted).
Explicitly whitelisting the tables you want replicated could make sense in some situations too, but I don't expect that direction to be as common or desirable. It seems like another operational step customers must do whenever they add a new table to their application.
I'm not sure what I'm proposing yet, so I'm thinking out loud here, but maybe it would make sense to have a way to start ReadySet for the first time (without implicitly also starting replication), examine the upstream tables, define a blacklist that makes sense, and then tell ReadySet to start snapshotting/replicating.
After the initial setup, I'd expect we'd want ReadySet to continue snapshotting and replicating by default on subsequent process launches.
We need to allow for both use cases:
After Readyset has started, I want to add a new table to either one of those lists. We should have a command to accomplish this; that is what this ticket is about.
Those 2 use cases make sense, and for the first one where we're only replicating a handpicked 10, I think running a command to add the table makes sense.
For the second use case, where we're replicating everything but a handpicked 10, I think it would be unfortunate if the user had to manually add this new table (after the first 990 were implicitly chosen for replication).
The new table will be added to Readyset automatically when the replicator sees the DDL for it, as long as it does not match --replication-tables-ignore. That is how the filtering works currently.
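To make the filtering behavior discussed in this thread concrete, here is a minimal Python sketch of the decision made when a new table's DDL is seen. It is an illustration only, not Readyset's implementation: the function name `should_replicate`, the plain-string table names, and the assumption that only one of the two lists is set at a time are all hypothetical.

```python
from typing import Optional, Set


def should_replicate(
    table: str,
    replication_tables: Optional[Set[str]] = None,
    replication_tables_ignore: Optional[Set[str]] = None,
) -> bool:
    """Decide whether a newly seen table should be replicated.

    Hypothetical sketch; assumes at most one of the two lists is configured.
    """
    if replication_tables is not None and replication_tables_ignore is not None:
        raise ValueError("set either the allow list or the ignore list, not both")

    if replication_tables is not None:
        # Whitelist mode: only explicitly listed tables are replicated,
        # so a brand-new table is skipped until someone adds it.
        return table in replication_tables

    if replication_tables_ignore is not None:
        # Blacklist mode: everything is replicated unless it is listed,
        # so a brand-new table is picked up automatically.
        return table not in replication_tables_ignore

    # No filter configured: replicate everything.
    return True
```

In blacklist mode a new table passes the filter without any user action, while in whitelist mode it is skipped, which is the case this ticket is concerned with.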
Description
If someone is using replication filters to select which tables to snapshot, the instance currently has to be bounced (restarted) in order to add a new replicated table.
We should create a new command to allow for adding a new table to the replicated tables.
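One possible shape for the state such a command would update is sketched below in Python. Everything here is hypothetical (the `ReplicationFilter` class, the `add_table` method, and the `pending_snapshot` queue are invented for illustration and are not an existing Readyset API); the point is only that the filter is changed in memory and the table is queued for snapshotting, with no restart.

```python
class ReplicationFilter:
    """Hypothetical in-memory table filter; not Readyset's real API."""

    def __init__(self, allow=None, deny=None):
        # Assumption: at most one of the two lists is configured.
        if allow is not None and deny is not None:
            raise ValueError("configure either an allow list or a deny list, not both")
        self.allow = set(allow) if allow is not None else None
        self.deny = set(deny) if deny is not None else set()
        # Tables that were added at runtime and still need a snapshot.
        self.pending_snapshot = []

    def is_replicated(self, table):
        if self.allow is not None:
            return table in self.allow
        return table not in self.deny

    def add_table(self, table):
        """Start replicating `table` at runtime; return True if the filter changed."""
        if self.is_replicated(table):
            return False  # already replicated, nothing to do
        if self.allow is not None:
            self.allow.add(table)       # whitelist mode: add to the list
        else:
            self.deny.discard(table)    # blacklist mode: stop ignoring it
        self.pending_snapshot.append(table)
        return True


# Example: a whitelist deployment adds a table without a restart.
flt = ReplicationFilter(allow={"users"})
flt.add_table("orders")
```

A real implementation would also need to snapshot the queued table and resume replication from a consistent position, which this sketch does not attempt.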
Change in user-visible behavior
Requires documentation change