Azure / azure-cosmosdb-java

Java Async SDK for SQL API of Azure Cosmos DB
MIT License

Connection leak with Cosmos DB client SDK #382

Closed: euchungmsft closed this issue 3 years ago

euchungmsft commented 3 years ago

A PoolExhaustedException is thrown when the client tries to get a Cosmos DB connection from the pool, and at the same time available memory keeps decreasing. It looks like a connection leak.

io.reactivex.netty.client.PoolExhaustedException:
   at io.reactivex.netty.client.ConnectionPoolImpl.performAquire (ConnectionPoolImpl.java:177)
   at io.reactivex.netty.client.ConnectionPoolImpl.access$300 (ConnectionPoolImpl.java:45)
   at io.reactivex.netty.client.ConnectionPoolImpl$1.call (ConnectionPoolImpl.java:139)
   at io.reactivex.netty.client.ConnectionPoolImpl$1.call (ConnectionPoolImpl.java:124)

Versions: Async SDK for SQL API of Azure Cosmos DB Service 2.6.13, Spring Boot 2.1.6, Java 11

The situation looks similar to this RxNetty issue: https://github.com/ReactiveX/RxNetty/issues/611

moderakh commented 3 years ago

On the v2 SDK, PoolExhaustedException does not indicate a leak. It indicates that the number of concurrent requests to the gateway (GW) exceeds the configured SDK connection pool size.
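To illustrate the difference between exhaustion and a leak, here is a minimal sketch of a bounded pool. It is a hypothetical stand-in built on java.util.concurrent.Semaphore, not RxNetty's ConnectionPoolImpl, and it assumes the pool rejects an acquire outright when every connection is in use, as the stack trace above suggests. The names BoundedPoolDemo, BoundedPool, and rejectedCount are made up for this example.

```java
import java.util.concurrent.Semaphore;

public class BoundedPoolDemo {

    /** Hypothetical fixed-size pool; models only the permit counting. */
    static final class BoundedPool {
        private final Semaphore permits;

        BoundedPool(int maxSize) {
            this.permits = new Semaphore(maxSize);
        }

        /** Fails fast instead of waiting when all connections are in use. */
        void acquire() {
            if (!permits.tryAcquire()) {
                throw new IllegalStateException("pool exhausted");
            }
        }

        void release() {
            permits.release();
        }

        int available() {
            return permits.availablePermits();
        }
    }

    /**
     * Tries to hold `concurrent` connections at the same time and returns
     * how many acquire attempts were rejected.
     */
    static int rejectedCount(int poolSize, int concurrent) {
        BoundedPool pool = new BoundedPool(poolSize);
        int held = 0;
        int rejected = 0;
        for (int i = 0; i < concurrent; i++) {
            try {
                pool.acquire();
                held++;
            } catch (IllegalStateException e) {
                rejected++;
            }
        }
        // Every held connection is returned, so nothing leaks.
        for (int i = 0; i < held; i++) {
            pool.release();
        }
        if (pool.available() != poolSize) {
            throw new AssertionError("connections were not all returned");
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 5 concurrent requests against a pool of 2: some are rejected.
        System.out.println(rejectedCount(2, 5));
        // Same load against a pool of 5: none are rejected.
        System.out.println(rejectedCount(5, 5));
    }
}
```

In this model every connection is returned to the pool, yet acquire attempts still fail whenever concurrency exceeds the pool size. Raising the pool size or lowering concurrency removes the rejections.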

You are on a very old generation of the SDK (v2). Our long-term recommendation is to move to the v4 SDK, where this limitation does not exist: https://docs.microsoft.com/en-us/azure/cosmos-db/sql-api-sdk-java-v4

As a remedy for this particular PoolExhaustedException issue, please note that on the v2 SDK, if the load from the SDK to the gateway is high and the connection pool size is not big enough, you may run into PoolExhaustedException.

See this: https://docs.microsoft.com/en-us/azure/cosmos-db/troubleshoot-java-async-sdk#connection-pool-exhausted-issue

The following should help to address the issue:

1) Increase the HTTP connection pool size (the default is 1000; you can increase it to 2000):

https://github.com/Azure/azure-cosmosdb-java/blob/master/commons/src/main/java/com/microsoft/azure/cosmosdb/ConnectionPolicy.java#L177

2) You are using GW mode. In direct mode, more traffic goes over TCP and less over HTTP, so please change to direct mode:

https://github.com/Azure/azure-cosmosdb-java/blob/master/commons/src/main/java/com/microsoft/azure/cosmosdb/ConnectionPolicy.java#L141

Please see the recommendations on direct mode configuration: https://docs.microsoft.com/en-us/azure/cosmos-db/performance-tips-async-java#networking

3) As mentioned, the v2 SDK is very old and generations behind. For the long run, consider moving to the v4 SDK, where many improvements were made and this limitation does not exist: https://docs.microsoft.com/en-us/azure/cosmos-db/sql-api-sdk-java-v4
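Suggestions 1) and 2) could be applied together roughly as follows. This is a sketch, assuming the v2 azure-cosmosdb dependency (2.6.x) is on the classpath. The class name CosmosClientFactory and the environment variable names COSMOS_ENDPOINT and COSMOS_KEY are placeholders for this example, and 2000 is just the value suggested above, not a tuned number.

```java
import com.microsoft.azure.cosmosdb.ConnectionMode;
import com.microsoft.azure.cosmosdb.ConnectionPolicy;
import com.microsoft.azure.cosmosdb.rx.AsyncDocumentClient;

public class CosmosClientFactory {

    public static AsyncDocumentClient create() {
        ConnectionPolicy policy = new ConnectionPolicy();
        // Suggestion 2: direct mode sends most traffic over TCP instead of HTTP.
        policy.setConnectionMode(ConnectionMode.Direct);
        // Suggestion 1: raise the HTTP connection pool size above the default of 1000.
        policy.setMaxPoolSize(2000);

        // Placeholder environment variable names; use your own configuration source.
        String endpoint = System.getenv("COSMOS_ENDPOINT");
        String key = System.getenv("COSMOS_KEY");

        return new AsyncDocumentClient.Builder()
                .withServiceEndpoint(endpoint)
                .withMasterKeyOrResourceToken(key)
                .withConnectionPolicy(policy)
                .build();
    }
}
```

Creating one client per application and reusing it, rather than building a client per request, keeps the number of connection pools bounded.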

euchungmsft commented 3 years ago

Thank you so much for the prompt response