h2database / h2database

H2 is an embeddable RDBMS written in Java.
https://h2database.com
Other
4.23k stars 1.2k forks source link

Add support for tuning TCP settings via jdbc url #2032

Open mattrpav opened 5 years ago

mattrpav commented 5 years ago

Currently, H2 uses a global connection timeout parameter and an infinite socket timeout for sending and receiving data over TCP. In general, distributed applications benefit from being able to have those settings configurable in order to avert a deadlock condition where a connecting thread never returns. Having H2 handle connect and receive timeouts eliminates the need for applications to have to wrap that handling in order to detect deadlock conditions.

Nearly every RDBMS has support for being able to configure these settings, and H2 users would benefit from being able to have these available.

Use cases for adjusting TCP Clients settings may include:

UC-1. Applications connecting to H2 servers in different cloud zones/data centers UC-2. H2 replication UC-3. Users using the H2 web console UC-4. Users using general-purpose SQL tools for connecting and performing ad-hoc queries UC-5. Monitoring and management tools collecting metric information

Proposed feature enhancement:

F-1. Add support to connection url strings to specify connect and socket timeout values (either receiveTimeout, sendTimeout or a soTimeout that covers both) F-2. Add support to enable/disable TCP keepAlive F-3. Add support to tune the TCP send/recv buffer size

Behavior:

B-1. H2 would throw an exception if the connect timeout is exceeded when AUTO_RECONNECT is not enabled B-2. H2 would go through a re-connect cycle if connect and/or soTimeout is reached when AUTO_RECONNECT is enabled B-3. Server-side H2 would rollback any transactions that have TCP connections abandoned.

Proposed changes:

Config Param Type Units Default Scope Status
-Dh2.socketConnectTimeout SysProperty ms 2000 Global Current
-Dh2.socketConnectRetry SysProperty count 16 Global Current
TCP_CONNECT_TIMEOUT URL param ms 10000 Per-connection Proposed
TCP_CONNECT_RETRY URL param ms -1 (infinite) Per-connection Proposed
TCP_SOCKET_TIMEOUT URL param ms infinite Per-connection Proposed
TCP_KEEP_ALIVE SysProperty boolean true Global Proposed

Proposed Tasks:

katzyn commented 5 years ago

Please, take a look on Connection.abort(), Connection.setNetworkTimeout(), and Connection.getNetworkTimeout() instead. These methods are currently not implemented by H2.

mattrpav commented 5 years ago

*Connection => JdbcConnection.java

Note: jdbc.Connection networkTimeout =~ TCP connect timeout

Note: If I'm following the class tree correctly, it appears the default connection timeout is set to 2s (2000 ms), not infinite via SysProperties.java and util/NetUtils.java.

katzyn commented 5 years ago

H2 does not touch socket options and this behavior should be preserved unless application uses these methods.

Implementation of Connection.setNetworkTimeout() may call Socket.setSoTimeout() (when a network connection is used) like some other JDBC drivers do.

mattrpav commented 5 years ago

@katzyn agreed, these settings should only apply for tcp:// connections

I agree, the JavaDocs indicate that jdbc.Connection.setNetworkTimeout should be for request information (aka Socket.setSoTimeout) and not connect timeout.

How do you feel about these as url parameters?

TCP_KEEP_ALIVE -> NetUtils.. Socket.setKeepAlive(...) TCP_CONNECT_TIMEOUT (ms) => NetUtils.. Socket .connect(... int timeout)

For network timeouts: TCP_SO_TIMEOUT (ms) => NetUtils.. Socket.setSoTimeout( .. ) -or- Users may specify jdbc.Connection.setNetworkTimeout(..) -> NetUtils.. Socket.setSoTimeout(..)

TODO: Make a note that soTimeout should always be higher than the query timeout, or the socket may disconnect before the query returns.

mattrpav commented 5 years ago

@katzyn re KEEP_ALIVE. What do you think about instead defaulting it to 'true'? This would lesson the need for a url paramter.. perhaps move it to a SysProperty where it could be overwritten system-wide.

katzyn commented 5 years ago

H2 does not touch options of the socket, but H2 has some settings that affect initialization of connection. You can change the connection timeout with -Dh2.socketConnectTimeout=10000 and you can change number of retries too. I don't think that they are needed for significant amount of users, so I don't think that we need an additional way to set them.

You may create an additional setting for keep alive, but in other common JDBC drivers this setting is false by default or does not exist at all, perhaps there is a reason for that value? Also you need to determine how socket timeout and keep alive flag will interact with other settings, such as AUTO_RECONNECT, how they will work in cluster mode when client has TCP connections to multiple servers, etc. I think we don't want to have a way that will create an infinite loop when server performs some slow operation, client disconnects and reconnects again, starts the same slow operation again, and so on.

mattrpav commented 5 years ago

re: connectTimeout/socketTimeout I think the difference is being able to set those on a per-connection basis. I agree for the most part, using the global is the way to go. Based on my experience, having the ability to specify the one-off is really valuable.

re: keepAlive my sense is that keepAlive is often set to off by default for legacy/backwards-compatibility reasons. In the current world of lots of layers-- VM's, Containers, Clouds, multi-data center, etc.. this is a good practice to have enabled by default. I think its fine as a global setting vs per-connection. Note: keepAlive modes on socket pairs can be all combinations: client-side only, server-side only, both or neither.

re: AUTO_RECONNECT agreed.. a socketTimeout should be used as a last-resort type deal. keepAlive helps prevent the looping problem by keeping the socket active. socketTimeout should always be queryTimeout plus some extra (2000 ms or so) to prevent the 'long query' problem. socketTimeout should be for cleaning up dead-locked clients or dead-locked servers only. I think both connectTimeout and socketTimeout help the AUTO_RECONNECT use case, so if the first server is dead, the client gets a faster connect to the second server. Having the socketTimeout will also allow a AUTO_RECONNECT client to auto-heal for dead-locked server by timing out and then failing over to the other. For sure, there is a scenario where the client cannot connect to any server and it will spin infinitely. That is a trade-off for auto-healing and that may be viewed as a feature vs a problem.

katzyn commented 5 years ago

I think that existing h2.socketConnectRetry and h2.socketConnectTimeout are more than enough for their purposes. They don't look like frequently needed.

Situation with keep-alive is more complicated. Did you read RFC 1122, especially section 4.2.3.6? https://tools.ietf.org/html/rfc1122#page-101 What if other side does not implement it properly? Also there is no guarantee that both sides will be configured in the same way. Note that there is no per-connection configuration except for enabling or disabling it, all other settings are system-wide and Windows and Linux typically use a 2h interval. For faulty networks it is too large I guess. You may add a setting for it, but I'm not sure that it should be true by default.

Of course, you may implement Connection.setNetworkTimeout() on client side using Socket.setSoTimeout(), possibly with sanity checks of related timeouts (it should be significantly larger than query timeout) and with unit tests. I'm sure that some existing applications use these settings incorrectly and I think that such applications should continue to work somehow. It looks like the server part can't use this TCP setting, because it will lead to connection failure when client application is idle.

nitincchauhan commented 5 years ago

it would be good to have https://github.com/h2database/h2database/issues/243 fixed along this