vitessio / vitess

Vitess is a database clustering system for horizontal scaling of MySQL.
http://vitess.io
Apache License 2.0

Bug Report: VReplication unable to complete/terminate on MySQL error `1180` (`ER_ERROR_DURING_COMMIT`) #17248

Closed shlomi-noach closed 2 days ago

shlomi-noach commented 3 days ago

Overview of the Issue

If MySQL returns error code 1180 (ER_ERROR_DURING_COMMIT), the binlog connection auto-closes:

https://github.com/vitessio/vitess/blob/216fd70be49fa14ddd22ea97d26a9434770c0ca2/go/vt/binlog/binlogplayer/dbclient.go#L108-L111

VReplication is then unable to call `setMessage` in:

https://github.com/vitessio/vitess/blob/216fd70be49fa14ddd22ea97d26a9434770c0ca2/go/vt/vttablet/tabletmanager/vreplication/vreplicator.go#L189-L192

because it uses that same (now closed) connection.

As a result, VReplication never persists the failure and keeps retrying the same step indefinitely (or until MySQL returns something other than 1180).

Reproduction Steps

Have MySQL return error code 1180 during VReplication. This is a rare error, which I've only encountered during internal testing.

Binary Version

all

Operating System and Environment details

any

Log Fragments

No response