vitessio / vitess

Vitess is a database clustering system for horizontal scaling of MySQL.
http://vitess.io
Apache License 2.0

change default for binlog_player_tablet_type to REPLICA,RDONLY and let it prioritize REPLICA over RDONLY #3252

Closed: michael-berlin closed this issue 1 year ago

michael-berlin commented 7 years ago

In the past, we always assumed that filtered replication would only pick REPLICA tablets as source.

The idea here is that REPLICAs are somewhat guaranteed to lag less behind and therefore filtered replication also has less delay in applying all changes.

But recently we added the flag --binlog_player_tablet_type, which allows picking the source tablet type. The default value is REPLICA. That's confusing because vtworker SplitClone requires RDONLY tablets. Therefore, people with an existing MySQL setup spin up a vttablet RDONLY instance and no REPLICA ones.

As the title says, we should:

  1. change the default for --binlog_player_tablet_type to REPLICA,RDONLY
  2. let it prioritize REPLICA over RDONLY

@alainjobart What do you think about this?

@sjmudd as FYI

michael-berlin commented 7 years ago

More feedback from @sjmudd:

Also making it clearer that the binlog streamer is waiting for a tablet of type XXX (replica in my case) would be good as it would indicate why things are stalled.

Sounds like we should a) expose that on the /debug/status page and b) also log an error if we don't find a tablet.
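
For illustration, here's a minimal sketch of what a) and b) could look like, using plain net/http and the standard logger. The state variable and wiring are hypothetical, not the actual Vitess status-page plumbing:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"sync/atomic"
)

// currentState holds a human-readable binlog player state string.
var currentState atomic.Value

// setState records the new state and logs it, so an operator can see
// why things are stalled (b) without visiting the status page.
func setState(s string) {
	currentState.Store(s)
	log.Printf("binlog player: %s", s)
}

func main() {
	setState("waiting for one of the [replica,rdonly] tablets to become available")
	// (a) expose the state on the /debug/status page.
	http.HandleFunc("/debug/status", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "Binlog player state: %s\n", currentState.Load())
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```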

sjmudd commented 7 years ago

The status page showed: "Binlog player state: Running (picking source tablet)", which I didn't really understand. Perhaps it should say something like "... (waiting for one of the [rdonly,replica] tablets to become available to read from)"?

alainjobart commented 7 years ago

I agree with the proposed fixes, with minor tweaks:

  1. change the default to replica,rdonly
  2. use the order of the parameter to choose: replica,rdonly would favor replica, rdonly,replica would favor rdonly (see the sketch after this list).
  3. change the status message to @sjmudd's proposed version.
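
A minimal sketch of that ordered-preference behavior (point 2); pickSource and its inputs are hypothetical names, not the real Vitess tablet picker:

```go
package main

import (
	"fmt"
	"strings"
)

// pickSource tries each tablet type in the order given by the flag value,
// e.g. "replica,rdonly" favors replica and "rdonly,replica" favors rdonly.
func pickSource(flagValue string, available map[string][]string) (string, error) {
	for _, tt := range strings.Split(flagValue, ",") {
		tt = strings.TrimSpace(strings.ToLower(tt))
		if tablets := available[tt]; len(tablets) > 0 {
			return tablets[0], nil // first healthy tablet of the preferred type
		}
	}
	return "", fmt.Errorf("no tablet of types [%s] available", flagValue)
}

func main() {
	// Only a rdonly tablet is up, so "replica,rdonly" falls back to it.
	available := map[string][]string{"rdonly": {"cell-0000000102"}}
	src, err := pickSource("replica,rdonly", available)
	fmt.Println(src, err)
}
```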

I can look at this in the next few days, but feel free to post here and work on it if you get there first.

tirsen commented 6 years ago

I've tried this but it's not going to work with split diffs.

The reason is that during a split diff the vtworker will pick a rdonly tablet and stop replication there, then wait for the binlog player in the destination master to catch up to the same GTID. But if the master in the destination shard uses the same rdonly tablet as the source of its filtered replication, it will never catch up! This actually causes the master to hang in an infinite loop. I've experienced this myself many times before I figured it out! :-)

The way out of this, I think, is that the vtworker should never pick a rdonly tablet that the destination master uses as a source for the binlog player. And vice versa, the binlog player should never start on a tablet used by the vtworker, but I think that is fixed by virtue of vtworker marking it as drained.
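
To illustrate the exclusion, here's a hedged sketch; pickRdonly and the binlogSources set are made-up names, not actual vtworker code:

```go
package main

import "fmt"

// pickRdonly returns the first candidate that is not currently serving as a
// filtered-replication source; picking one that is would deadlock the diff,
// since the destination master could never catch up to the stopped GTID.
func pickRdonly(candidates []string, binlogSources map[string]bool) (string, error) {
	for _, t := range candidates {
		if !binlogSources[t] {
			return t, nil
		}
	}
	return "", fmt.Errorf("all %d rdonly tablets are serving binlogs", len(candidates))
}

func main() {
	// cell-0000000200 feeds the destination master's binlog player, so skip it.
	sources := map[string]bool{"cell-0000000200": true}
	t, err := pickRdonly([]string{"cell-0000000200", "cell-0000000201"}, sources)
	fmt.Println(t, err) // picks cell-0000000201
}
```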

alainjobart commented 6 years ago

Haha good catch @tirsen, we didn't think about this enough. :)

The size of the binlog streamer map can be used as a check at the right place to figure out if a tablet can be used by vtworker.

As a side note that might be of interest here, we've been thinking about the following change for a long time to make the system more robust. Right now, vtworker first changes the state of the tablets it takes out of service to 'worker' in an RPC, then works with them, then changes them back to their original state with another RPC. If vtworker gets stuck, dies, or anything similar, the tablets are stuck in 'worker' mode. Instead, we wanted to start a (new) streaming RPC from vtworker to vttablet, and have vttablet change state when the streaming RPC is received, and if the RPC is interrupted, go back to its original state.

I'm mentioning this because if we do that, we can have vttablet check for the size of the binlog player map and refuse the state change if it's serving binlogs.
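
A rough sketch of that guard, with an illustrative tabletServer type standing in for the real vttablet state machine:

```go
package main

import (
	"fmt"
	"sync"
)

type tabletServer struct {
	mu            sync.Mutex
	state         string
	binlogStreams map[string]struct{} // one entry per active binlog streaming client
}

// changeState refuses a transition to DRAINED while the binlog player map is
// non-empty, i.e. while this tablet is still serving binlogs to a destination.
func (ts *tabletServer) changeState(newState string) error {
	ts.mu.Lock()
	defer ts.mu.Unlock()
	if newState == "DRAINED" && len(ts.binlogStreams) > 0 {
		return fmt.Errorf("cannot drain: serving binlogs to %d destinations", len(ts.binlogStreams))
	}
	ts.state = newState
	return nil
}

func main() {
	ts := &tabletServer{state: "RDONLY", binlogStreams: map[string]struct{}{"dest/-80": {}}}
	fmt.Println(ts.changeState("DRAINED")) // refused while streaming binlogs
}
```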

sjmudd commented 6 years ago

One comment here. My original confusion when this came up was that I didn't understand the error message properly on the destination, or was looking in the wrong place. (I didn't record the error message unfortunately.)

So if you do not or cannot make changes, then please ensure the reporting of the "issue" is as clear as possible, both in the logging on the destination master vttablet and in the vttablet web interface. It may also be wise to add some comments to the documentation of the process about why you need X replica/rdonly tablets (by default) and how to work around having a smaller number. That context might have helped me understand why things weren't working as expected.

While I'm used to managing quite a large number of MySQL servers, they tend to be managed individually and there are few dependencies between one and another. I see from Vitess that some processes are quite tightly coupled to working a certain way, which I'm sure YouTube feels is completely normal, but it may be different from the expectations of newcomers in this area. So pointing out, as clearly as possible, that the user is doing it wrong and why helps them make progress.

tirsen commented 6 years ago

@alainjobart Refusing the switch to DRAINED (which is what I think vtworker changes tablets to when it works with them) while a tablet is serving binlogs sounds like a good idea. It's not drained after all; it's in use and can't safely be used by the vtworker.

@sjmudd I found that a centralized logging service helps a lot with Vitess. I love Papertrail, but at Square we have an internal one developed in-house.

mattlord commented 1 year ago

I'm closing this as the default tablet type at the tablet level was changed to in_order:REPLICA,PRIMARY some time ago and is the default now in all supported versions: https://vitess.io/docs/reference/vreplication/flags/#vreplication_tablet_type

And it's the default for common workflow types as well: https://vitess.io/docs/reference/vreplication/movetables/#--tablet_types

If I'm missing or misunderstanding something, please let me know and we can reopen this.