brettwooldridge / HikariCP

光 HikariCP・A solid, high-performance, JDBC connection pool at last.
Apache License 2.0
19.87k stars 2.92k forks source link

Not removing a dead connection from the pool #1819

Open greenwoodct opened 3 years ago

greenwoodct commented 3 years ago

HikariCP Version: 3.4.2, 3.4.5, and DEV

Symptom: Log entries: Caused by: java.sql.SQLRecoverableException: IO Error: Invalid Operation, NOT Connected at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:7676) at oracle.jdbc.driver.PhysicalConnection.setNetworkTimeout(PhysicalConnection.java:9408) at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:556) at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:185) Once getting into the above, connection never recovers.

Root cause: HikariCP 3.4.2 HikariPool.java line 185 HikariCP 3.4.5 HikariPool.java line 186 if (poolEntry.isMarkedEvicted() || (elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs && !isConnectionAlive(poolEntry.connection)))

If the above condition is true, the connection will be removed. But it will never be removed because of the and condition (&&).

Did not test on dev branch. But we may face the same issue. See HikariPool.java line 170: if (poolEntry.isMarkedEvicted() || (elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs && isConnectionDead(poolEntry.connection)))

The above and condition (&&) can cause issues.

Possible solution: For HikariCP 3.4.2 and HikariCP 3.4.5 (Did not try other versions): if (poolEntry.isMarkedEvicted() || elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs || !isConnectionAlive(poolEntry.connection))

For DEV branch: if (poolEntry.isMarkedEvicted() || elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs || isConnectionDead(poolEntry.connection))

Thanks

brettwooldridge commented 3 years ago

@greenwoodct That still doesn't make sense to me. When looking at the 3.4.2 branch (but the same would be true in other branches), the exception thrown in isConnectionAlive() will be caught, and false returned, which should result in the connection being closed.

https://github.com/brettwooldridge/HikariCP/blob/ab2f7209cd6a28e9a3ceba5b86eaa69a09074870/src/main/java/com/zaxxer/hikari/pool/PoolBase.java#L178-L183

EDIT: Just to be clear, the code in HikariPool is:

if ( /*evicted*/ || (/*not recently used*/ && !isConnectionAlive(poolEntry.connection)) )

The logical breakdown of the code is:

The key here being !isConnectionAlive() -- i.e. NOT isConnectionAlive(). In the case of an exception in isConnectionAlive() a value of false is returned. Therefore !isConnectionAlive() evaluates to true, and the connection will be closed.

The purpose of the "if it as not been used in the last 500ms" test is a performance optimization. Testing aliveness is expensive. If the connection has been used within the past 500ms we assume it is still alive. Removing this check, and always checking aliveness is possible, but will definitely have a large impact on performance.

To completely disable the last-used bypass, you can set the system property com.zaxxer.hikari.aliveBypassWindowMs=0.

However, if you are seeing this exception, then clearly the last-used time is greater than 500ms, and thus the isConnectionAlive() method is being called (and returning false), so I still don't see how the connection fails to close.

greenwoodct commented 3 years ago

In spring cloud environment, consul health check leads to frequent connection usage: Consul Health Check --> Spring-Boot-Actuator : Aggregation of /health endpoints --> One of the metrics is DB connection

Due to the Cloud network maintenance needs, network glitches up to five minutes are expected.

Once the isConnectionAlive() returns false, and when elapsedMillis(poolEntry.lastAccessed, now) < aliveBypassWindowMs, next call is: return poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), now);

Now connection returns to Spring-Boot-Actuator and leads to exception due to the connection is bad.

Next, Spring-Boot-Actuator will re-try in a tight loop. the lastAccessed gets updated and staled connections are trapped in the pool forever.

Possible solution: For HikariCP 3.4.2 and HikariCP 3.4.5 (Did not try other versions): if (poolEntry.isMarkedEvicted() || !isConnectionAlive(poolEntry.connection))

For DEV branch: if (poolEntry.isMarkedEvicted() || isConnectionDead(poolEntry.connection))

Or: Adding -Dcom.zaxxer.hikari.aliveBypassWindowMs=0

Thanks

brettwooldridge commented 3 years ago

Ok, let's run through two scenarios.

This is the code (v3.4.2) in getConnection():

if (poolEntry.isMarkedEvicted() || (elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs && !isConnectionAlive(poolEntry.connection))) {
   closeConnection(poolEntry, poolEntry.isMarkedEvicted() ? EVICTED_CONNECTION_MESSAGE : DEAD_CONNECTION_MESSAGE);
   timeout = hardTimeout - elapsedMillis(startTime);
}
else {
   metricsTracker.recordBorrowStats(poolEntry, startTime);
   return poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), now);
}

Scenario 1

None of the connections have been used in the past 500ms (they're basically idle).

Flow:

Scenario 2

All of the connections have been used in the past 500ms (very busy pool).

Flow:

Questions:

This is an important point: Any interaction with a Connection API (or Statement, ResultSet, etc.) that results in a SQLException will result in the Connection being evicted immediately from the pool, if the SQLState starts with "08" or matches one of these codes. You can see an example of such an eviction here.

So still, I cannot see how a dead connection persists in the pool. I'm looking for the loophole, if there is one, that I seem to be missing.

greenwoodct commented 3 years ago

The issues have been lasted for at least a month. May be the issue is not that simple. Error log: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms. at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:82) at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:324) at org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator.getProduct(DataSourceHealthIndicator.java:122) at org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator.doDataSourceHealthCheck(DataSourceHealthIndicator.java:107) at org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator.doHealthCheck(DataSourceHealthIndicator.java:102) at org.springframework.boot.actuate.health.AbstractHealthIndicator.health(AbstractHealthIndicator.java:82) at org.springframework.boot.actuate.health.HealthIndicator.getHealth(HealthIndicator.java:37) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:95) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:43) at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:108) at org.springframework.boot.actuate.health.HealthEndpointSupport.getAggregateHealth(HealthEndpointSupport.java:119) at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:105) at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:83) at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:70) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.health(HealthEndpointWebExtension.java:81) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.health(HealthEndpointWebExtension.java:70) at jdk.internal.reflect.GeneratedMethodAccessor86.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:282) at org.springframework.boot.actuate.endpoint.invoke.reflect.ReflectiveOperationInvoker.invoke(ReflectiveOperationInvoker.java:77) at org.springframework.boot.actuate.endpoint.annotation.AbstractDiscoveredOperation.invoke(AbstractDiscoveredOperation.java:60) at org.springframework.boot.actuate.endpoint.web.servlet.AbstractWebMvcEndpointHandlerMapping$ServletWebOperationAdapter.handle(AbstractWebMvcEndpointHandlerMapping.java:305) at org.springframework.boot.actuate.endpoint.web.servlet.AbstractWebMvcEndpointHandlerMapping$OperationHandler.handle(AbstractWebMvcEndpointHandlerMapping.java:388) at jdk.internal.reflect.GeneratedMethodAccessor85.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:105) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:879) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:793) at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040) at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943) at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006) at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898) at javax.servlet.http.HttpServlet.service(HttpServlet.java:634) at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883) at javax.servlet.http.HttpServlet.service(HttpServlet.java:741) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:109) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343) at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:373) at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65) at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1594) at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.base/java.lang.Thread.run(Thread.java:834) Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms. at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:689) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:196) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:161) at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128) at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:158) at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:116) at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:79) ... 76 common frames omitted Caused by: java.sql.SQLRecoverableException: IO Error: Invalid Operation, NOT Connected at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:7676) at oracle.jdbc.driver.PhysicalConnection.setNetworkTimeout(PhysicalConnection.java:9408) at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:556) at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:185) ... 81 common frames omitted Caused by: oracle.net.ns.NetException: Invalid Operation, NOT Connected at oracle.net.nt.TcpNTAdapter.setOption(TcpNTAdapter.java:344) at oracle.net.ns.NSProtocol.setOption(NSProtocol.java:501) at oracle.net.ns.NSProtocol.setSocketReadTimeout(NSProtocol.java:748) at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:7673) ... 85 common frames omitted

2021-08-02 14:26:03,590 WARN DataSource health check failed ...... 2021-08-02 14:26:03,353 WARN DataSource health check failed ...... 2021-08-02 14:26:01,301 WARN DataSource health check failed ...

You are right, it does not look like a HikariCP issue

brettwooldridge commented 3 years ago

Can you enable debug logging for com.zaxxer.hikari? It is not noisy.

greenwoodct commented 3 years ago

Project owner provided a good explanation.

greenwoodct commented 3 years ago

Enabled the following for a few months: spring.datasource.hikari.leakDetectionThreshold=29000 logging.level.com.zaxxer.hikari.HikariConfig=DEBUG

Thanks

greenwoodct commented 3 years ago

We are trying to use: -Dcom.zaxxer.hikari.aliveBypassWindowMs=0

The error logs were shorter: 2021-08-02 14:00:45,395 WARN HikariPool-1 - Failed to validate connection oracle.jdbc.driver.T4CConnection@1d407818 (IO Error: Invalid Operation, NOT Connected). Possibly consider using a shorter maxLifetime value.

Thanks

greenwoodct commented 3 years ago

When -Dcom.zaxxer.hikari.aliveBypassWindowMs=0 Created a lot of log entries 996 events (8/2/21 11:00:00.000 AM to 8/2/21 11:01:00.000 AM)

After reviewing your getConnection function, I prefer to implement it as follow:

public Connection getConnection(final long hardTimeout) throws SQLException { suspendResumeLock.acquire(); final long startTime = currentTime();

  try {
     long timeout = hardTimeout;

     do {
        PoolEntry poolEntry = connectionBag.borrow(timeout, MILLISECONDS);
        if (poolEntry == null) {
           break; // We timed out... break and throw exception
        }
        final long current_step_start_time = currentTime();
        if (poolEntry.isMarkedEvicted()) {
            closeConnection(poolEntry, EVICTED_CONNECTION_MESSAGE);
        }
        else if (!isConnectionAlive(poolEntry.connection)) {
            closeConnection(poolEntry, DEAD_CONNECTION_MESSAGE);
            long time_left = aliveBypassWindowMs - elapsedMillis(current_step_start_time, currentTime());
            if (time_left > 0 ) {
                Thread.sleep(time_left);
            }
        }
        else {
           metricsTracker.recordBorrowStats(poolEntry, startTime);
           return poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), current_step_start_time);
        }
        timeout = hardTimeout - elapsedMillis(startTime);
     } while (timeout > 0L);

     metricsTracker.recordBorrowTimeoutStats(startTime);
     throw createTimeoutException(startTime);
  }
  catch (InterruptedException e) {
     Thread.currentThread().interrupt();
     throw new SQLException(poolName + " - Interrupted during connection acquisition", e);
  }
  finally {
     suspendResumeLock.release();
  }

}

brettwooldridge commented 3 years ago

@greenwoodct Thanks for the feedback. However, you have not demonstrated where and how dead connections are not being removed from the pool.

The log message 2021-08-02 14:00:45,395 WARN HikariPool-1 - Failed to validate connection indicates removal from the pool.

And this...

When -Dcom.zaxxer.hikari.aliveBypassWindowMs=0 Created a lot of log entries 996 events (8/2/21 11:00:00.000 AM to 8/2/21 11:01:00.000 AM)

What is going on? In normal operation there should be tens of logs per hour (when DEBUG is enabled). Can you provide an example of these logs?

greenwoodct commented 3 years ago

The reason for the 996 events in a minute when aliveBypassWindowMs=0 is: getConnection --> isConnectionAlive --> setNetworkTimeout See log: at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:556) at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:185)

setNetworkTimeout throws exception immediately as network has a glitch and the connection is dead.

Now it continue in the do {..} while (timeout > 0L) loop as aliveBypassWindowMs is zero.

Sample log: 2021-08-02 11:00:55,983 WARN HikariPool-1 - Failed to validate connection oracle.jdbc.driver.T4CConnection@6f96b69b (IO Error: Invalid Operation, NOT Connected). Possibly consider using a shorter maxLifetime value. 2021-08-02 11:00:55,622 WARN HikariPool-1 - ... 2021-08-02 11:00:55,602 WARN DataSource health check failed... 2021-08-02 11:00:55,601 WARN HikariPool-1 - ... 2021-08-02 11:00:55,578 WARN HikariPool-1 - ... 2021-08-02 11:00:55,576 WARN HikariPool-1 - ...

NgSekLong commented 3 years ago

Just to echo greenwoodct, we seems to be having similar issue as well on our application on Azure connecting to MariaDB. I will see if I can grab some additional debug logs and show here as well, thanks.

greenwoodct commented 3 years ago

Our issues are also related to Azure environment. Connections were from Azure to on-premise DB servers. Look like it is not a Hikari issue, as Hikari removed all the staled connections. Java TCP Stack failed to recover from the outage.

danielmustafa commented 2 years ago

@greenwoodct Can you expand on what you found? My team is seeing the same behavior (HikariCP losing connections until it runs down to 0 connecting to an Azure DB). By chance, was Microsoft able to provide any resolution, or were they able to pinpoint the issue?

NgSekLong commented 2 years ago

For our case, we mitigated it by adding a socket timeout on the JDBC connection string.

While the root caused is still unknown, at least this solve our issue, see if this applicable to your cases as well, cheers!

dickchung commented 9 months ago

We seems to be the same behavior when we met a network incident(network break 15 mins).

It's interesting about we monitor result after network back: Our Jetty service have 5 connection pools by HikariCP to handle different DB connections of Oracle(x 4) and MySQL(x 1). It's work well on MySQL connection pools after network back, and auto reconnect MySQL smoothly. But it cannot auto reconnect on Oracle connection pools after network back, and always got error as below. Finally, we restart Jetty and then all 5 connection pools work well.

We guess Oracle JDBC driver maybe some connection behavior cannot work smoothly on HikariCP.

Caused by: org.apache.ibatis.exceptions.PersistenceException: 
### Error querying database.  Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
### The error may exist in db/payment2/PayDataMapper.xml
### The error may involve com.***.ecpayment.internal_webapp.dao.payment2.db.mybatis.PsiPayDataMapper.get
### The error occurred while executing a query
### Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
    at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
    at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:153)
    at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:145)
    at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:140)
    at org.apache.ibatis.session.defaults.DefaultSqlSession.selectOne(DefaultSqlSession.java:76)
    at jdk.internal.reflect.GeneratedMethodAccessor99.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:427)
    ... 105 common frames omitted
Caused by: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:83)
    at org.mybatis.spring.transaction.SpringManagedTransaction.openConnection(SpringManagedTransaction.java:80)
    at org.mybatis.spring.transaction.SpringManagedTransaction.getConnection(SpringManagedTransaction.java:67)
    at org.apache.ibatis.executor.BaseExecutor.getConnection(BaseExecutor.java:337)
    at org.apache.ibatis.executor.SimpleExecutor.prepareStatement(SimpleExecutor.java:86)
    at org.apache.ibatis.executor.SimpleExecutor.doQuery(SimpleExecutor.java:62)
    at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:325)
    at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
    at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:136)
    at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:151)
    ... 112 common frames omitted
Caused by: java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
    at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:696)
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:197)
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162)
    at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)
    at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:159)
    at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:117)
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80)
    ... 121 common frames omitted
Caused by: java.sql.SQLRecoverableException: IO Error: Invalid Operation, NOT Connected
    at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:10041)
    at oracle.jdbc.driver.PhysicalConnection.setNetworkTimeout(PhysicalConnection.java:10237)
    at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:566)
    at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:173)
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:186)
    ... 126 common frames omitted
Caused by: oracle.net.ns.NetException: Invalid Operation, NOT Connected
    at oracle.net.nt.TcpNTAdapter.setOption(TcpNTAdapter.java:747)
    at oracle.net.ns.NSProtocol.setOption(NSProtocol.java:725)
    at oracle.net.ns.NSProtocol.setSocketReadTimeout(NSProtocol.java:1040)
    at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:10038)
    ... 130 common frames omitted

lib version:

BOOT-INF/lib/mysql-connector-java-8.0.28.jar
BOOT-INF/lib/ojdbc11-21.5.0.0.jar
BOOT-INF/lib/HikariCP-4.0.3.jar
BOOT-INF/lib/mybatis-3.5.11.jar
BOOT-INF/lib/mybatis-spring-2.0.7.jar
BOOT-INF/lib/mybatis-spring-boot-starter-2.2.2.jar

MySQL connection pool config:

spring.datasource.invoice.driver-class-name: com.mysql.cj.jdbc.Driver
spring.datasource.invoice.username: ***
spring.datasource.invoice.password: ***
spring.datasource.invoice.poolName: invoice
spring.datasource.invoice.minimumIdle: 10
spring.datasource.invoice.maximumPoolSize: 50
spring.datasource.invoice.idleTimeout: 20000
spring.datasource.invoice.maxLifetime: 30000
spring.datasource.invoice.connectionTimeout: 5000
spring.datasource.invoice.leakDetectionThreshold: 20000

one of Oracle connection pool config:

spring.datasource.payment2.driver-class-name: oracle.jdbc.OracleDriver
spring.datasource.payment2.username: ***
spring.datasource.payment2.password: ***
spring.datasource.payment2.poolName: payment2
spring.datasource.payment2.minimumIdle: 10
spring.datasource.payment2.maximumPoolSize: 50
spring.datasource.payment2.idleTimeout: 20000
spring.datasource.payment2.maxLifetime: 30000
spring.datasource.payment2.connectionTimeout: 5000
spring.datasource.payment2.leakDetectionThreshold: 20000
AmirhosseinZDev commented 7 months ago

We faced similar situation when our database(Oracle) is down for a while and then going up again, we use kuberneties in our producation enviroment and it calls readiness endpoint of actuator every 10 seconds when db is down, does anybody find solution for this issue?