sysown / proxysql

High-performance MySQL proxy with a GPL license.
http://www.proxysql.com
GNU General Public License v3.0
6.01k stars 978 forks source link

UTC support for pt-heartbeat with replication monitoring in ProxySQL #1691

Open olafz opened 6 years ago

olafz commented 6 years ago

I've got several MySQL servers in different timezones; in this example I'll use a slave in CEST (Europe). To monitor replication lag we recently switched from "normal" replication monitoring to Percona Toolkit heartbeat (pt-heartbeat). To make it work across timezones reliably, we use pt-heartbeat with the --utc flag: timestamps are now in UTC. The local clocks of all servers are set to the local time(s).

Now, proxysql uses the following query to determine the replication lag:

SELECT MIN(ROUND(TIMESTAMPDIFF(MICROSECOND, ts, SYSDATE(6))/1000000)) \
  AS Seconds_Behind_Master FROM percona.heartbeat;

This result in the above example in

+-----------------------+
| Seconds_Behind_Master |
+-----------------------+
|                  7200 |
+-----------------------+

Even when there is no replication lag. This is actually caused by the timezone UTC vs CEST; the real replication lag is in the millisecond range:

mysql>  SELECT ts, SYSDATE(6) from percona.heartbeat;
+----------------------------+----------------------------+
| ts                         | SYSDATE(6)                 |
+----------------------------+----------------------------+
| 2018-09-15T08:10:20.350660 | 2018-09-15 10:10:20.402185 |
+----------------------------+----------------------------+

Of course, this is entirely caused by the setup of different timezones on all servers. Unfortunately, this is not something that can be changed easily. I'm not entirely sure if this is something that needs to be fixed in ProxySQL, but it can be:

I now add the difference of the NOW() and the current UTC_TIMESTAMP to the outcome above:

mysql> SELECT TIMESTAMPDIFF(SECOND, NOW(), UTC_TIMESTAMP);
+---------------------------------------------+
| TIMESTAMPDIFF(SECOND, NOW(), UTC_TIMESTAMP) |
+---------------------------------------------+
|                                       -7200 |
+---------------------------------------------+

This can also be easily added to ProxySQL. But the problem then is that you can't detect if pt-heartbeat was started with the --utc option. That would need to be another option you can set in ProxySQL; so not only a mysql-monitor_replication_lag_use_percona_heartbeat, but an additional mysql-monitor_replication_lag_percona_heartbeat_use_utc or something like that.

renecannao commented 6 years ago

Hi @olafz .

Of course, this is entirely caused by the setup of different timezones on all servers.

Yes, you are asking for trouble here :smile: I have seen several cases of data drift between master and slaves as a result of different timezones.

With regards to the specific replication lag check, I think the easiest workarond is to tune mysql_servers.max_replication_lag accordingly. For example, if you want a max replication lag of 120 seconds and your slave is running in CEST, configure mysql_servers.max_replication_lag=7320. Of course, this workaround is far from being perfect, for example you will have issues when TZ is switching from CEST to CET.

That said, I would really recommend to not mix timezones.