Closed haimadrian closed 1 year ago
Hi @haimadrian
Thank you for including both the stack trace as well as sample code. This will greatly assist us in resolving the issue. We will be looking into it and get back to you as soon as we can with either a solution, or a request for more information.
The situation you are describing sounds similar to what other users have encountered. Are you able to try other versions of the driver, specifically the latest 11.1.1 version? Based on the results, this will help narrow down next steps.
Hi @Jeffery-Wasty and thanks for your reply.
I've just double checked it, using latest 11.1.1.0-preview
version, and the issue is still reproducible.
Attached JDBC trace logs: MSSQLJdbcDemo.log
Something interesting I noticed is that when the issue happens, the first SQLServerException's message describes that the error is: "connect timed out", which is the expected behavior. Though the second SQLServerException, caught by the second thread, describes that the error is: "null".
If I synchronize the code calling DriverManager.getConnection()
then both threads receive the exact same exception with reason: "connect timed out" (the expected behavior)
Hi @haimadrian,
An update on this issue: we're still working towards a resolution. Part of the problem being currently faced is that we're having trouble reproducing the issue. You did include sample code up above that should be able to work without connecting to an actual instance, but still we're not able to reproduce the issue seen even with this code. Can I ask you to confirm that the above demonstration truly does recreate the issue described, after a set amount of time. Thanks.
Hi @Jeffery-Wasty I've just played with it a little bit and I see that the issue happens when trying to connect to localhost, on Windows. I tried to set up a SQL Server on Azure, to share it with you, but the issue was not reproduced.
It seems that when SocketFinder
tries to find socket, and it uses InetAddress.getAllByName(hostName)
, where hostName="localhost", it then receives two results, and not one. This then yields to multiple executions later, in findSocketUsingThreading
method, which creates two threads (for the two addresses) and not one. And when this scenario happens in a loop, there are thousands of executions.
The results from InetAddress:
- localhost/127.0.0.1
- localhost/0:0:0:0:0:0:0:1
Here I recorded my screen while reproducing the issue: https://user-images.githubusercontent.com/20064967/168935141-2344004a-48ed-4190-ad6e-fc7ca43f3ece.mp4 In the demo there is a short time-out, so the issue is resolved quickly. But in production the time-out is set to 1-1.5 minutes, so it is more problematic there.
Folder containing files from the demo: (mssql-jdbc-test-1.0 and mssql-jdbc-11.1.1.jre8-preview) MSSQLIssue.zip
Sources of mssql-jdbc-test-1.0: mssql-jdbc-test-1.0-sources.zip
Hi @haimadrian,
An update on this issue. We're able to spend more time on this issue now, and hope to deliver a solution soon. In the meantime, please confirm that the problem still persists with the latest driver version (11.2.0).
Hi @Jeffery-Wasty Issue is still reproducible.
@haimadrian A note to the demo program: The ThreadMXBean thread should emit ManagementFactory.getThreadMXBean().getThreadCount() for the current number of running threads. The started threads all stops within a short time frame, but the huge number of threads consume a lot of OS resources.
@Jeffery-Wasty This issue has a big impact, as it consumes a lot of resources in Windows. We experience running software to consume 2GB and more just for the threads, where the JVM runs the application in 256MB heap. Furthermore it affects not only the running application in the JVM, but disturbs all kind of other network activity with IP connections of other applications running out of resources, as the dynamic ports for outbound sockets are exhausted. The destination is SQL Server 2019 HA cluster with MultiSubnetFailover, so there are for sure more IP adresses for the DB and the issue occurs very often.
With a 64 bit JVM, the application consumes a huge amount of CPU and network resources for about 15 seconds. With a 32 bit JVM, the application runs out of resources with the following stack trace:
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached at java.lang.Thread.start0(Native Method) ~[?:?] at java.lang.Thread.start(Unknown Source) ~[?:?] at java.util.concurrent.ThreadPoolExecutor.addWorker(Unknown Source) ~[?:?] at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source) ~[?:?] at com.microsoft.sqlserver.jdbc.SocketFinder.findSocketUsingThreading(IOBuffer.java:2721) ~[mssql-jdbc-9.4.0.jre11.jar:?] at com.microsoft.sqlserver.jdbc.SocketFinder.findSocket(IOBuffer.java:2415) ~[mssql-jdbc-9.4.0.jre11.jar:?] at com.microsoft.sqlserver.jdbc.TDSChannel.open(IOBuffer.java:676) ~[mssql-jdbc-9.4.0.jre11.jar:?] at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:2957) ~[mssql-jdbc-9.4.0.jre11.jar:?] at com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:2628) ~[mssql-jdbc-9.4.0.jre11.jar:?] at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:2471) ~[mssql-jdbc-9.4.0.jre11.jar:?] at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:1470) ~[mssql-jdbc-9.4.0.jre11.jar:?] at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:915) ~[mssql-jdbc-9.4.0.jre11.jar:?] at java.sql.DriverManager.getConnection(Unknown Source) ~[java.sql:?] at java.sql.DriverManager.getConnection(Unknown Source) ~[java.sql:?]
The problem still exists in latest driver version 12.2.0
I suspended the application in the debugger after the first connection timeout occured, and all the threads in the threadpool of SocketFinder where in the following state:
Thread [pool-1-thread-1] (Suspended) owns: SocksSocketImpl (id=564) PlainSocketImpl.waitForConnect(int, int) line: not available [native method] SocksSocketImpl(PlainSocketImpl).socketConnect(InetAddress, int, int) line: 107 SocksSocketImpl(AbstractPlainSocketImpl).doConnect(InetAddress, int, int) line: 399 SocksSocketImpl(AbstractPlainSocketImpl).connectToAddress(InetAddress, int, int) line: 242 SocksSocketImpl(AbstractPlainSocketImpl).connect(SocketAddress, int) line: 224 SocksSocketImpl.connect(SocketAddress, int) line: 392 Socket.connect(SocketAddress, int) line: 609 SocketConnector.run() line: 3240 ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) line: 1128 ThreadPoolExecutor$Worker.run() line: 628 Thread.run() line: 829
Find a thread dump attached. threaddump-1677071134843.zip
Hi @dl6lr,
Thank you for the update on this issue. We're revisiting it currently, so the additional information is very helpful. Can you let us know which versions you are using for SQL Server, Client OS, and Java version? Is it the same as the original user's?
Hi @Jeffery-Wasty ,
thank you for your response. Here some Informations:
SQL-Server:
Clients (about 60 systems):
The clients are OLTP systems. If you need further informations, I will try to provide them.
Hello, I'm having an issue with mssql-jdbc driver in multi-threaded environment.
Driver version
10.2.0.jre8
SQL Server version
Microsoft SQL Server 2019 (RTM-GDR) (KB4583458) - 15.0.2080.9 (X64) Nov 6 2020 16:50:01 Copyright (C) 2019 Microsoft Corporation Developer Edition (64-bit) on Windows 10 Enterprise 10.0 (Build 19044: )
Client Operating System
Windows 10 Enterprise 21H2 19044.1706 Processor: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz 2.59 GHz (6 cores, 12 logical) RAM: 32GB
JAVA/JVM version
openjdk version "1.8.0_222" OpenJDK Runtime Environment (Zulu 8.40.0.25-CA-win64) (build 1.8.0_222-b10) OpenJDK 64-Bit Server VM (Zulu 8.40.0.25-CA-win64) (build 25.222-b10, mixed mode)
Table schema
N/A
Problem description
The issue is that the thread pool of
com.microsoft.sqlserver.jdbc.SocketFinder
is not limited, and it creates thousands of threads, causing the JVM to consume all available CPU of a machine. The issue is reproducible when two threads are trying to connect to a stopped instance, and the first thread receives a TimeOut exception, and interrupt the second thread while it is waiting for its time-out to occur. I have prepared a short demonstration to reproduce the issue. You do not really have to connect to a real instance. I could overcome this by synchronizing the access toDriverManager.getConnection(...)
, and some sleep (250 milliseconds). Without the sleep the issue is still reproducible. I would like to avoid synchronizing the access toDriverManager
, to let multiple threads work concurrently. The issue is also reproducible when usingcommons-dbcp
, and this time synchronizing the access tobasicDataSource.getConnection()
does not help to overcome the issue.Expected behavior
SocketFinder
's thread pool to make sure it will not pass X threads. And if X is configurable, that will be perfect.ConnectException
), bubble it to the caller immediately, or after Y attempts, rather than looping in a busy-waiting implementation that causes thousands of calls toSocketFinder.findSocketUsingThreading
.Actual behavior
7K threads and ~80% CPU usage. Output from my demo: (Notice the ~7K threads)
Error message/stack trace
N/A
Any other details that can be helpful
You can run this code to reproduce the issue.
JDBC trace logs
Modifying log level affects the issue:
MSSQLJdbcDemo.log