alibaba / druid

阿里云计算平台DataWorks(https://help.aliyun.com/document_detail/137663.html) 团队出品,为监控而生的数据库连接池
https://github.com/alibaba/druid/wiki
Apache License 2.0
27.97k stars 8.58k forks source link

数据库连接假死,程序一直持该连接,实际上该连接已经超时并且不可用, 但是应用一直持用导致应用线程卡死 #1523

Open missvivia opened 7 years ago

missvivia commented 7 years ago

数据库为Oracle具体信息:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production PL/SQL Release 11.2.0.4.0 - Production "CORE 11.2.0.4.0 Production" TNS for Linux: Version 11.2.0.4.0 - Production NLSRTL Version 11.2.0.4.0 - Production

使用的Druid版本为:1.0.18

应用细节:

  1. 应用后台线程栈(线程长时间永久等待):

"DefaultMessageListenerContainer-1" prio=10 tid=0x00007f027bee6800 nid=0x434a runnable [0x00007f025e075000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at oracle.net.ns.Packet.receive(Packet.java:308) at oracle.net.ns.DataPacket.receive(DataPacket.java:106) at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:324) at oracle.net.ns.NetInputStream.read(NetInputStream.java:268) at oracle.net.ns.NetInputStream.read(NetInputStream.java:190) at oracle.net.ns.NetInputStream.read(NetInputStream.java:107) at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:124) at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:80) at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1137) at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:350) at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227) at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531) at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:195) at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:876) at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1175) at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296) at oracle.jdbc.driver.OracleStatement.executeQuery(OracleStatement.java:1498)

  1. 后台伴随有数据库连接超时异常:

java.sql.SQLRecoverableException: IO Error: Connection timed out at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:889) ~[ojdbc6-11.2.0.3.0.jar!/:11.2.0.4.0] at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1175) ~[ojdbc6-11.2.0.3.0.jar!/:11.2.0.4.0] at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296) ~[ojdbc6-11.2.0.3.0.jar!/:11.2.0.4.0] at oracle.jdbc.driver.OracleStatement.executeQuery(OracleStatement.java:1498) ~[ojdbc6-11.2.0.3.0.jar!/:11.2.0.4.0]

Caused by: java.net.SocketException: Connection timed out at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:152) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:122) ~[na:1.7.0_79] at oracle.net.ns.Packet.receive(Packet.java:308) ~[ojdbc6-11.2.0.3.0.jar!/:11.2.0.4.0

2016-11-30 19:16:19.452 ERROR 17198 --- [enerContainer-1] druid.sql.Statement : {conn-10002, stmt-20478} execute error. SELECT 1 FROM DUAL

  1. 从Druid后台监控观察到连接池中连接信息(目前应用只需要一个连接所以只有一个):

image

  1. 数据库连接池配置:

image

image

yuleizhuai commented 7 years ago

目前应用只需要一个连接所以只有一个

我也遇到过类似问题,当你需要一个连接时,其实连接池就是大材小用了。 但为了避免出错解决长时间不访问应用,再访问又连接不上了数据库了的问题可以使用以下方法:

1)<property name="testOnBorrow" value="true" /> 默认值是 true ,当从连接池取连接时,验证这个连接是否有效

2)配置minIdle=0能解决此问题

Danier-Evens commented 7 years ago

我也碰到了。我使用的使用的Druid版本为:1.0.25

异常如下: “locked <0x00000006ccc2e290> (a java.io.BufferedInputStream)”

"process-cron-job-thread-for-quota-id=342" #1401 prio=5 os_prio=0 tid=0x00007f3cc8202800 nid=0x5323 runnable [0x00007f3cbbaf9000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:170) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at java.io.BufferedInputStream.read(BufferedInputStream.java:345)

"process-cron-job-thread-for-quota-id=375" #1399 prio=5 os_prio=0 tid=0x00007f3cc896c800 nid=0x5321 runnable [0x00007f3cbb3f2000]

othorizon commented 5 years ago

我也碰到了。我使用的使用的Druid版本为:1.0.25

异常如下: “locked <0x00000006ccc2e290> (a java.io.BufferedInputStream)”

"process-cron-job-thread-for-quota-id=342" #1401 prio=5 os_prio=0 tid=0x00007f3cc8202800 nid=0x5323 runnable [0x00007f3cbbaf9000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:170) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at java.io.BufferedInputStream.read(BufferedInputStream.java:345)

  • locked <0x00000006ccc2e290> (a java.io.BufferedInputStream) at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:376) at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:453) at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:435) at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_GetOperationStatus(TCLIService.java:409) at org.apache.hive.service.cli.thrift.TCLIService$Client.GetOperationStatus(TCLIService.java:396) at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:280) at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:392) at com.alibaba.druid.pool.DruidPooledStatement.executeQuery(DruidPooledStatement.java:138) at org.springframework.jdbc.core.JdbcTemplate$1QueryStatementCallback.doInStatement(JdbcTemplate.java:455) at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:405) at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:471) at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:481) at org.springframework.jdbc.core.JdbcTemplate.queryForList(JdbcTemplate.java:521) at com.souche.datacenter.biz.common.dao.HiveOrSparkDao.querySqlFormSpark(HiveOrSparkDao.java:29) at com.souche.datacenter.biz.metadata.job.ExtractHiveCronService.processQueryFromSparkOrHive(ExtractHiveCronService.java:493) at com.souche.datacenter.biz.metadata.job.ExtractHiveCronService.executeHiveOrSpark(ExtractHiveCronService.java:808) at com.souche.datacenter.biz.metadata.job.ExtractHiveCronService.lambda$null$20(ExtractHiveCronService.java:626) at com.souche.datacenter.biz.metadata.job.ExtractHiveCronService$$Lambda$89/156598377.run(Unknown Source) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:

  • <0x00000006ccc27250> (a java.util.concurrent.ThreadPoolExecutor$Worker)
  • <0x00000006ccd9eae0> (a java.util.concurrent.locks.ReentrantLock$FairSync)

"process-cron-job-thread-for-quota-id=375" #1399 prio=5 os_prio=0 tid=0x00007f3cc896c800 nid=0x5321 runnable [0x00007f3cbb3f2000]

16年的问题。。。现在也遇到这个问题了,,,也是hive产生的,,,所以,这到底是怎么回事啊,,,研究了好久了没整明白。。。