Open sougou opened 6 years ago
I personally like option 3. We are already in a situation where vttablet needs a high level of privilege to be able to do anything. Perhaps it can degrade gracefully for those folks who want to run it with fewer permissions.
One thing to note here is that both of these can also be set as session variables. I think it's important to support that to some degree, if possible, because at HubSpot the people setting these timeouts have no access to the underlying processes. They would set them from their app, possibly on a per-app basis.
It seems to me that for vttablet pooled connections there's no good reason for MySQL to ever close the connection, so option 3 makes the most sense to me, assuming that what we end up doing is setting the MySQL idle timeout to something very high so that it effectively never kicks in.
As Simon suggested we could do this asynchronously from the idle connection sweeper if that makes more sense.
I get the impression that something like doing a mysql ping on an idle connection would keep it active (I think that works), which would avoid connections dropping unexpectedly. It doesn't look like the underlying issue of the client and server not agreeing on timeout values will be fixed in the upstream protocol, so we just need to make sure that we avoid this happening. If the connection is idle, sending out a periodic "idle ping" should be harmless.
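To make the "idle ping" idea concrete, here is a minimal Go sketch of one sweeper pass. It is not vttablet code: `Pinger`, `pooledConn`, `PingIdle`, `fakeConn` and `errServerGone` are hypothetical names made up for illustration, and the sketch assumes that a successful ping counts as activity on the MySQL side.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Pinger is a hypothetical stand-in for a pooled MySQL connection.
// It is not a Vitess type.
type Pinger interface {
	Ping() error
}

// pooledConn pairs a connection with the time it was last used.
type pooledConn struct {
	conn     Pinger
	lastUsed time.Time
}

// errServerGone simulates a connection that the server has already closed.
var errServerGone = errors.New("server closed the connection")

// PingIdle pings every connection that has been idle for at least pingAfter.
// A successful ping is treated as activity (an assumption of this sketch),
// so lastUsed is reset to now. Connections whose ping fails are left out of
// the returned slice so the caller can close and replace them.
func PingIdle(conns []*pooledConn, now time.Time, pingAfter time.Duration) []*pooledConn {
	healthy := make([]*pooledConn, 0, len(conns))
	for _, c := range conns {
		if now.Sub(c.lastUsed) >= pingAfter {
			if err := c.conn.Ping(); err != nil {
				continue // the server side is already gone; drop it
			}
			c.lastUsed = now
		}
		healthy = append(healthy, c)
	}
	return healthy
}

// fakeConn lets the sketch run without a MySQL server.
type fakeConn struct{ err error }

func (f fakeConn) Ping() error { return f.err }

func main() {
	now := time.Now()
	conns := []*pooledConn{
		{conn: fakeConn{}, lastUsed: now.Add(-45 * time.Second)},          // idle: pinged
		{conn: fakeConn{}, lastUsed: now.Add(-5 * time.Second)},           // recently used: untouched
		{conn: fakeConn{err: errServerGone}, lastUsed: now.Add(-time.Hour)}, // dead: dropped
	}
	healthy := PingIdle(conns, now, 30*time.Second)
	fmt.Println(len(healthy), "of", len(conns), "connections kept")
}
```

For this to help, the ping period would have to be comfortably shorter than the MySQL wait_timeout. Whether to run it from the existing idle connection sweeper or from a separate loop is a design choice the sketch leaves open.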
A further thought as this came up again today. If I do this on my mysql prompt I get:
root@127.0.0.1 [(none)]> show global variables like '%timeout%';
+-----------------------------------+----------+
| Variable_name                     | Value    |
+-----------------------------------+----------+
| connect_timeout                   | 10       |
...
| interactive_timeout               | 28800    |
...
| net_read_timeout                  | 30       |
| net_write_timeout                 | 60       |
...
| wait_timeout                      | 28800    |
+-----------------------------------+----------+
20 rows in set (0.00 sec)
You'll get something back from Vitess if you send a similar query. I wonder whether extra Vitess variables should be shown, or whether the values shown should be those used by the vtgate process. In theory, knowing which backend mysqld you may end up talking to is hard, especially if the query gets routed to a replica or rdonly tablet, as there may be many of them.
I can't remember now, but I think there's some sort of "interactive" flag passed in the connection. Does Vitess support/recognise this? (Maybe not relevant.)
For the other settings, it might be interesting to show the session/global values as those used by vtgate. (Also, does vtgate allow me to change these, and does it respect my session-level settings?)
Just a thought.
Ran across this issue today. We adjusted our wait_timeout up from 6 to 60 to match the Vitess idle timeout and saw an immediate reduction in connections/sec. I would also support option 3 in this scenario: set the session wait_timeout to a configured value (maybe >= the Vitess idle timeout?).
I should note that a reading of the code (and some experimentation) reveals that simply having the vttablet idle timeouts < MySQL wait_timeout is not sufficient. The issue is that the current implementation scans the connection pools every idle_timeout/10 seconds for idle connections to evict, so a connection can stay idle for up to roughly 1.1 × idle_timeout before it is closed. As a result, if the vttablet idle timeouts are near or above 90% of the MySQL wait_timeout, you are still likely to see (some) errors.
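The arithmetic behind that 90% figure can be sketched in a few lines of Go. This is a model of the behaviour described in the comment above, not vttablet code: `sweepDivisor`, `WorstCaseIdle` and `SafeAgainst` are hypothetical names, and the model assumes a connection that crosses the idle timeout just after a scan has to wait one full scan interval before it is evicted.

```go
package main

import (
	"fmt"
	"time"
)

// sweepDivisor mirrors the "idle_timeout/10" scan interval described in this
// thread. It has not been checked against any particular Vitess release.
const sweepDivisor = 10

// WorstCaseIdle returns the longest a pooled connection can sit idle before
// the sweeper closes it: the idle timeout plus one full scan interval.
func WorstCaseIdle(idleTimeout time.Duration) time.Duration {
	return idleTimeout + idleTimeout/sweepDivisor
}

// SafeAgainst reports whether every idle connection is evicted strictly
// before MySQL's wait_timeout can fire.
func SafeAgainst(idleTimeout, waitTimeout time.Duration) bool {
	return WorstCaseIdle(idleTimeout) < waitTimeout
}

func main() {
	waitTimeout := 60 * time.Second
	for _, idle := range []time.Duration{50 * time.Second, 55 * time.Second, 60 * time.Second} {
		fmt.Printf("idle=%v worst-case=%v safe=%v\n",
			idle, WorstCaseIdle(idle), SafeAgainst(idle, waitTimeout))
	}
}
```

With a wait_timeout of 60 seconds, an idle timeout of 50 seconds is safe under this model (worst case 55 seconds), while 55 and 60 seconds are not (worst case 60.5 and 66 seconds). In general the idle timeout needs to stay below about 91% of wait_timeout, which is consistent with the errors seen near the 90% mark.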
MySQL has two timeouts that are relevant to vttablet: wait_timeout and net_write_timeout. On the vttablet side, the idle timeout corresponds to wait_timeout, and there is no counterpart for net_write_timeout.
There are a few options: