Open mattlord opened 2 years ago
I was thinking about how to determine what shards to actually use in these cases and it's not so clear how we could correctly do this in various scenarios.
BUT, you can now specify what shards to operate on in various workflow commands. So on main/v21, building on the same basic test case:
```shell
git checkout main && make build
cd examples/local
alias vtctldclient='command vtctldclient --server=localhost:15999'
./101_initial_cluster.sh; mysql < ../common/insert_commerce_data.sql; ./201_customer_tablets.sh; ./202_move_tables.sh
CELL=zone1 TABLET_UID=300 ../common/scripts/mysqlctl-up.sh
SHARD=-80 CELL=zone1 KEYSPACE=customer TABLET_UID=300 ../common/scripts/vttablet-up.sh
vtctldclient MoveTables --workflow commerce2customer --target-keyspace customer show
vtctldclient MoveTables --workflow commerce2customer --target-keyspace customer --shards 0 show
```
The first `show` command returns nothing. The second one returns the expected output.
I'm thinking that this is a good solution here, as the user can specify which of the serving shards they care about. What do you think @timvaillancourt and @arthurschreiber?
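As a sketch of where this goes, the `--shards` flag also accepts a comma-separated list, so commands can be limited to a subset of shards. This invocation is hypothetical: it assumes the customer keyspace has since been resharded into -80 and 80-, which is not part of the test case above:

```shell
# Hypothetical: assumes the customer keyspace now has shards -80 and 80-
vtctldclient --server=localhost:15999 MoveTables --workflow commerce2customer \
  --target-keyspace customer --shards '-80,80-' show
```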
> The first `show` command returns nothing. The second one returns the expected output.

@mattlord I think the ability to specify shards is useful, but if I understand everything correctly, the first `show` returning nothing feels potentially confusing to the user.
Yeah, that's a general issue today in `vtctldclient GetWorkflows`, `vtctldclient <wf_type> show`, etc. It has nothing specifically to do with this discussion -- and it's something that I'd like to improve (how we handle cases where there are no matching workflow(s) returned).
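For reference, these are the kinds of commands in question; today both can return empty output when nothing matches, with no indication of why. Illustrative invocations against the local example cluster from above:

```shell
# Assumes the examples/local cluster set up earlier is running.
# When no workflow matches, both currently return nothing rather than
# an explicit "no matching workflows" message.
vtctldclient --server=localhost:15999 GetWorkflows customer
vtctldclient --server=localhost:15999 MoveTables --workflow commerce2customer \
  --target-keyspace customer show
```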
Overview of the Issue
VReplication workflows are orchestrated or driven by the primary tablets in the target keyspace. When you create a new VReplication workflow, a record is inserted into the `_vt.vreplication` table on the primary tablets in the target keyspace, and each target tablet then orchestrates things from there by first selecting a source tablet for its vstream. These records are queried as you monitor the state, and they are updated as the workflow progresses and its state changes.

A side effect of these implementation details is that when you e.g. issue a `vtctlclient -server=<server> MyTargetKeyspace.MyWorkflowName Show` command, `vtctl` first finds the PRIMARY tablets for each shard in `TargetKeyspace` and executes a SQL query against them to get the status of any relevant vreplication streams (`vt_<keyspace>` just being the default DB name, which can be overridden with `-init_db_name_override`). You can see this code here: https://github.com/vitessio/vitess/blob/release-12.0/go/vt/vtctl/workflow/traffic_switcher.go#L192-L216
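To make the lookup concrete, here is a rough sketch of the kind of query each target primary is asked to run. This is an illustrative approximation, not the exact statement from the linked traffic_switcher.go code, and the socket path is a placeholder:

```shell
# Illustrative only: connect directly to a target primary tablet's mysqld
# (not through vtgate; _vt is an internal sidecar database on the tablet).
# The column list and filter approximate what vtctl actually runs.
mysql --socket=/path/to/tablet/mysql.sock -e \
  "select id, workflow, source, pos, state, message
     from _vt.vreplication
    where db_name = 'vt_customer'"
```

If no row exists on any shard's primary (for example, because the shard's primary was never initialized), the workflow simply appears to not exist.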
You can run into problems then when, e.g., you have an active `MoveTables` workflow running, but during that process you realize you need to `Reshard` the target keyspace, so you begin preparing the new shards ahead of time. When you get into this state you are forced to run `InitShardPrimary` on these new shards in the target keyspace, even though you may not generally want them serving or otherwise available yet, because without doing this you cannot execute any further `vtctl` vreplication workflow commands to monitor the state, complete, revert, or delete the existing workflow(s) in the keyspace.

Reproduction Steps
Using the docker_local container:
You will see that the final command produces an error:
Binary version
Example: