Open greenwoodct opened 3 years ago
@greenwoodct That still doesn't make sense to me. When looking at the 3.4.2 branch (but the same would be true in other branches), the exception thrown in isConnectionAlive()
will be caught, and false
returned, which should result in the connection being closed.
EDIT: Just to be clear, the code in HikariPool is:
if ( /*evicted*/ || (/*not recently used*/ && !isConnectionAlive(poolEntry.connection)) )
The logical breakdown of the code is:
The key here being !isConnectionAlive()
-- i.e. NOT isConnectionAlive()
. In the case of an exception in isConnectionAlive()
a value of false
is returned. Therefore !isConnectionAlive()
evaluates to true
, and the connection will be closed.
The purpose of the "if it as not been used in the last 500ms" test is a performance optimization. Testing aliveness is expensive. If the connection has been used within the past 500ms we assume it is still alive. Removing this check, and always checking aliveness is possible, but will definitely have a large impact on performance.
To completely disable the last-used bypass, you can set the system property com.zaxxer.hikari.aliveBypassWindowMs=0
.
However, if you are seeing this exception, then clearly the last-used time is greater than 500ms, and thus the isConnectionAlive()
method is being called (and returning false
), so I still don't see how the connection fails to close.
In spring cloud environment, consul health check leads to frequent connection usage: Consul Health Check --> Spring-Boot-Actuator : Aggregation of /health endpoints --> One of the metrics is DB connection
Due to the Cloud network maintenance needs, network glitches up to five minutes are expected.
Once the isConnectionAlive() returns false, and when elapsedMillis(poolEntry.lastAccessed, now) < aliveBypassWindowMs, next call is: return poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), now);
Now connection returns to Spring-Boot-Actuator and leads to exception due to the connection is bad.
Next, Spring-Boot-Actuator will re-try in a tight loop. the lastAccessed gets updated and staled connections are trapped in the pool forever.
Possible solution: For HikariCP 3.4.2 and HikariCP 3.4.5 (Did not try other versions): if (poolEntry.isMarkedEvicted() || !isConnectionAlive(poolEntry.connection))
For DEV branch: if (poolEntry.isMarkedEvicted() || isConnectionDead(poolEntry.connection))
Or: Adding -Dcom.zaxxer.hikari.aliveBypassWindowMs=0
Thanks
Ok, let's run through two scenarios.
This is the code (v3.4.2) in getConnection()
:
if (poolEntry.isMarkedEvicted() || (elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs && !isConnectionAlive(poolEntry.connection))) {
closeConnection(poolEntry, poolEntry.isMarkedEvicted() ? EVICTED_CONNECTION_MESSAGE : DEAD_CONNECTION_MESSAGE);
timeout = hardTimeout - elapsedMillis(startTime);
}
else {
metricsTracker.recordBorrowStats(poolEntry, startTime);
return poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), now);
}
None of the connections have been used in the past 500ms (they're basically idle).
Flow:
HikariDataSource.getConnection()
poolEntry.isMarkedEvicted()
is false
, ||
condition continueselapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs
is true
, &&
condition continuesisConnectionAlive(poolEntry.connection)
is false
because java.sql.SQLRecoverableException: IO Error: Invalid Operation, NOT Connected is thrown and caughtcloseConnection(poolEntry, poolEntry.isMarkedEvicted() ...)
is calledAll of the connections have been used in the past 500ms (very busy pool).
Flow:
HikariDataSource.getConnection()
poolEntry.isMarkedEvicted()
is false
, ||
condition continueselapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs
is false
, &&
condition is never testedelse
branch taken, poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), now)
is returnedQuestions:
Connection
is returned from the pool, why does the stacktrace you included indicate that it occurred in isConnectionAlive()
? As far as I can tell, there is no way for that exception to occur in isConnectionAlive()
and for that Connection
to be returned.Connection
?This is an important point: Any interaction with a Connection
API (or Statement
, ResultSet
, etc.) that results in a SQLException
will result in the Connection
being evicted immediately from the pool, if the SQLState starts with "08" or matches one of these codes. You can see an example of such an eviction here.
So still, I cannot see how a dead connection persists in the pool. I'm looking for the loophole, if there is one, that I seem to be missing.
The issues have been lasted for at least a month. May be the issue is not that simple. Error log: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms. at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:82) at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:324) at org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator.getProduct(DataSourceHealthIndicator.java:122) at org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator.doDataSourceHealthCheck(DataSourceHealthIndicator.java:107) at org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator.doHealthCheck(DataSourceHealthIndicator.java:102) at org.springframework.boot.actuate.health.AbstractHealthIndicator.health(AbstractHealthIndicator.java:82) at org.springframework.boot.actuate.health.HealthIndicator.getHealth(HealthIndicator.java:37) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:95) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:43) at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:108) at org.springframework.boot.actuate.health.HealthEndpointSupport.getAggregateHealth(HealthEndpointSupport.java:119) at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:105) at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:83) at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:70) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.health(HealthEndpointWebExtension.java:81) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.health(HealthEndpointWebExtension.java:70) at jdk.internal.reflect.GeneratedMethodAccessor86.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:282) at org.springframework.boot.actuate.endpoint.invoke.reflect.ReflectiveOperationInvoker.invoke(ReflectiveOperationInvoker.java:77) at org.springframework.boot.actuate.endpoint.annotation.AbstractDiscoveredOperation.invoke(AbstractDiscoveredOperation.java:60) at org.springframework.boot.actuate.endpoint.web.servlet.AbstractWebMvcEndpointHandlerMapping$ServletWebOperationAdapter.handle(AbstractWebMvcEndpointHandlerMapping.java:305) at org.springframework.boot.actuate.endpoint.web.servlet.AbstractWebMvcEndpointHandlerMapping$OperationHandler.handle(AbstractWebMvcEndpointHandlerMapping.java:388) at jdk.internal.reflect.GeneratedMethodAccessor85.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:105) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:879) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:793) at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040) at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943) at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006) at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898) at javax.servlet.http.HttpServlet.service(HttpServlet.java:634) at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883) at javax.servlet.http.HttpServlet.service(HttpServlet.java:741) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:109) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343) at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:373) at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65) at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1594) at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.base/java.lang.Thread.run(Thread.java:834) Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms. at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:689) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:196) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:161) at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128) at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:158) at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:116) at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:79) ... 76 common frames omitted Caused by: java.sql.SQLRecoverableException: IO Error: Invalid Operation, NOT Connected at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:7676) at oracle.jdbc.driver.PhysicalConnection.setNetworkTimeout(PhysicalConnection.java:9408) at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:556) at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:185) ... 81 common frames omitted Caused by: oracle.net.ns.NetException: Invalid Operation, NOT Connected at oracle.net.nt.TcpNTAdapter.setOption(TcpNTAdapter.java:344) at oracle.net.ns.NSProtocol.setOption(NSProtocol.java:501) at oracle.net.ns.NSProtocol.setSocketReadTimeout(NSProtocol.java:748) at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:7673) ... 85 common frames omitted
2021-08-02 14:26:03,590 WARN DataSource health check failed ...... 2021-08-02 14:26:03,353 WARN DataSource health check failed ...... 2021-08-02 14:26:01,301 WARN DataSource health check failed ...
You are right, it does not look like a HikariCP issue
Can you enable debug logging for com.zaxxer.hikari
? It is not noisy.
Project owner provided a good explanation.
Enabled the following for a few months: spring.datasource.hikari.leakDetectionThreshold=29000 logging.level.com.zaxxer.hikari.HikariConfig=DEBUG
Thanks
We are trying to use: -Dcom.zaxxer.hikari.aliveBypassWindowMs=0
The error logs were shorter: 2021-08-02 14:00:45,395 WARN HikariPool-1 - Failed to validate connection oracle.jdbc.driver.T4CConnection@1d407818 (IO Error: Invalid Operation, NOT Connected). Possibly consider using a shorter maxLifetime value.
Thanks
When -Dcom.zaxxer.hikari.aliveBypassWindowMs=0 Created a lot of log entries 996 events (8/2/21 11:00:00.000 AM to 8/2/21 11:01:00.000 AM)
After reviewing your getConnection function, I prefer to implement it as follow:
public Connection getConnection(final long hardTimeout) throws SQLException { suspendResumeLock.acquire(); final long startTime = currentTime();
try {
long timeout = hardTimeout;
do {
PoolEntry poolEntry = connectionBag.borrow(timeout, MILLISECONDS);
if (poolEntry == null) {
break; // We timed out... break and throw exception
}
final long current_step_start_time = currentTime();
if (poolEntry.isMarkedEvicted()) {
closeConnection(poolEntry, EVICTED_CONNECTION_MESSAGE);
}
else if (!isConnectionAlive(poolEntry.connection)) {
closeConnection(poolEntry, DEAD_CONNECTION_MESSAGE);
long time_left = aliveBypassWindowMs - elapsedMillis(current_step_start_time, currentTime());
if (time_left > 0 ) {
Thread.sleep(time_left);
}
}
else {
metricsTracker.recordBorrowStats(poolEntry, startTime);
return poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), current_step_start_time);
}
timeout = hardTimeout - elapsedMillis(startTime);
} while (timeout > 0L);
metricsTracker.recordBorrowTimeoutStats(startTime);
throw createTimeoutException(startTime);
}
catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new SQLException(poolName + " - Interrupted during connection acquisition", e);
}
finally {
suspendResumeLock.release();
}
}
@greenwoodct Thanks for the feedback. However, you have not demonstrated where and how dead connections are not being removed from the pool.
The log message 2021-08-02 14:00:45,395 WARN HikariPool-1 - Failed to validate connection
indicates removal from the pool.
And this...
When -Dcom.zaxxer.hikari.aliveBypassWindowMs=0 Created a lot of log entries 996 events (8/2/21 11:00:00.000 AM to 8/2/21 11:01:00.000 AM)
What is going on? In normal operation there should be tens of logs per hour (when DEBUG is enabled). Can you provide an example of these logs?
The reason for the 996 events in a minute when aliveBypassWindowMs=0 is: getConnection --> isConnectionAlive --> setNetworkTimeout See log: at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:556) at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:185)
setNetworkTimeout throws exception immediately as network has a glitch and the connection is dead.
Now it continue in the do {..} while (timeout > 0L) loop as aliveBypassWindowMs is zero.
Sample log: 2021-08-02 11:00:55,983 WARN HikariPool-1 - Failed to validate connection oracle.jdbc.driver.T4CConnection@6f96b69b (IO Error: Invalid Operation, NOT Connected). Possibly consider using a shorter maxLifetime value. 2021-08-02 11:00:55,622 WARN HikariPool-1 - ... 2021-08-02 11:00:55,602 WARN DataSource health check failed... 2021-08-02 11:00:55,601 WARN HikariPool-1 - ... 2021-08-02 11:00:55,578 WARN HikariPool-1 - ... 2021-08-02 11:00:55,576 WARN HikariPool-1 - ...
Just to echo greenwoodct, we seems to be having similar issue as well on our application on Azure connecting to MariaDB. I will see if I can grab some additional debug logs and show here as well, thanks.
Our issues are also related to Azure environment. Connections were from Azure to on-premise DB servers. Look like it is not a Hikari issue, as Hikari removed all the staled connections. Java TCP Stack failed to recover from the outage.
@greenwoodct Can you expand on what you found? My team is seeing the same behavior (HikariCP losing connections until it runs down to 0 connecting to an Azure DB). By chance, was Microsoft able to provide any resolution, or were they able to pinpoint the issue?
For our case, we mitigated it by adding a socket timeout on the JDBC connection string.
While the root caused is still unknown, at least this solve our issue, see if this applicable to your cases as well, cheers!
We seems to be the same behavior when we met a network incident(network break 15 mins).
It's interesting about we monitor result after network back: Our Jetty service have 5 connection pools by HikariCP to handle different DB connections of Oracle(x 4) and MySQL(x 1). It's work well on MySQL connection pools after network back, and auto reconnect MySQL smoothly. But it cannot auto reconnect on Oracle connection pools after network back, and always got error as below. Finally, we restart Jetty and then all 5 connection pools work well.
We guess Oracle JDBC driver maybe some connection behavior cannot work smoothly on HikariCP.
Caused by: org.apache.ibatis.exceptions.PersistenceException:
### Error querying database. Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
### The error may exist in db/payment2/PayDataMapper.xml
### The error may involve com.***.ecpayment.internal_webapp.dao.payment2.db.mybatis.PsiPayDataMapper.get
### The error occurred while executing a query
### Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:153)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:145)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:140)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectOne(DefaultSqlSession.java:76)
at jdk.internal.reflect.GeneratedMethodAccessor99.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:427)
... 105 common frames omitted
Caused by: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:83)
at org.mybatis.spring.transaction.SpringManagedTransaction.openConnection(SpringManagedTransaction.java:80)
at org.mybatis.spring.transaction.SpringManagedTransaction.getConnection(SpringManagedTransaction.java:67)
at org.apache.ibatis.executor.BaseExecutor.getConnection(BaseExecutor.java:337)
at org.apache.ibatis.executor.SimpleExecutor.prepareStatement(SimpleExecutor.java:86)
at org.apache.ibatis.executor.SimpleExecutor.doQuery(SimpleExecutor.java:62)
at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:325)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:136)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:151)
... 112 common frames omitted
Caused by: java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:696)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:197)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)
at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:159)
at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:117)
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80)
... 121 common frames omitted
Caused by: java.sql.SQLRecoverableException: IO Error: Invalid Operation, NOT Connected
at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:10041)
at oracle.jdbc.driver.PhysicalConnection.setNetworkTimeout(PhysicalConnection.java:10237)
at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:566)
at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:173)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:186)
... 126 common frames omitted
Caused by: oracle.net.ns.NetException: Invalid Operation, NOT Connected
at oracle.net.nt.TcpNTAdapter.setOption(TcpNTAdapter.java:747)
at oracle.net.ns.NSProtocol.setOption(NSProtocol.java:725)
at oracle.net.ns.NSProtocol.setSocketReadTimeout(NSProtocol.java:1040)
at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:10038)
... 130 common frames omitted
lib version:
BOOT-INF/lib/mysql-connector-java-8.0.28.jar
BOOT-INF/lib/ojdbc11-21.5.0.0.jar
BOOT-INF/lib/HikariCP-4.0.3.jar
BOOT-INF/lib/mybatis-3.5.11.jar
BOOT-INF/lib/mybatis-spring-2.0.7.jar
BOOT-INF/lib/mybatis-spring-boot-starter-2.2.2.jar
MySQL connection pool config:
spring.datasource.invoice.driver-class-name: com.mysql.cj.jdbc.Driver
spring.datasource.invoice.username: ***
spring.datasource.invoice.password: ***
spring.datasource.invoice.poolName: invoice
spring.datasource.invoice.minimumIdle: 10
spring.datasource.invoice.maximumPoolSize: 50
spring.datasource.invoice.idleTimeout: 20000
spring.datasource.invoice.maxLifetime: 30000
spring.datasource.invoice.connectionTimeout: 5000
spring.datasource.invoice.leakDetectionThreshold: 20000
one of Oracle connection pool config:
spring.datasource.payment2.driver-class-name: oracle.jdbc.OracleDriver
spring.datasource.payment2.username: ***
spring.datasource.payment2.password: ***
spring.datasource.payment2.poolName: payment2
spring.datasource.payment2.minimumIdle: 10
spring.datasource.payment2.maximumPoolSize: 50
spring.datasource.payment2.idleTimeout: 20000
spring.datasource.payment2.maxLifetime: 30000
spring.datasource.payment2.connectionTimeout: 5000
spring.datasource.payment2.leakDetectionThreshold: 20000
We faced similar situation when our database(Oracle) is down for a while and then going up again, we use kuberneties in our producation enviroment and it calls readiness endpoint of actuator every 10 seconds when db is down, does anybody find solution for this issue?
HikariCP Version: 3.4.2, 3.4.5, and DEV
Symptom: Log entries: Caused by: java.sql.SQLRecoverableException: IO Error: Invalid Operation, NOT Connected at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:7676) at oracle.jdbc.driver.PhysicalConnection.setNetworkTimeout(PhysicalConnection.java:9408) at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:556) at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:185) Once getting into the above, connection never recovers.
Root cause: HikariCP 3.4.2 HikariPool.java line 185 HikariCP 3.4.5 HikariPool.java line 186 if (poolEntry.isMarkedEvicted() || (elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs && !isConnectionAlive(poolEntry.connection)))
If the above condition is true, the connection will be removed. But it will never be removed because of the and condition (&&).
Did not test on dev branch. But we may face the same issue. See HikariPool.java line 170: if (poolEntry.isMarkedEvicted() || (elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs && isConnectionDead(poolEntry.connection)))
The above and condition (&&) can cause issues.
Possible solution: For HikariCP 3.4.2 and HikariCP 3.4.5 (Did not try other versions): if (poolEntry.isMarkedEvicted() || elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs || !isConnectionAlive(poolEntry.connection))
For DEV branch: if (poolEntry.isMarkedEvicted() || elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs || isConnectionDead(poolEntry.connection))
Thanks