spring-projects / spring-data-mongodb

Provides support to increase developer productivity in Java when using MongoDB. Uses familiar Spring concepts such as a template classes for core API usage and lightweight repository style data access.
https://spring.io/projects/spring-data-mongodb/
Apache License 2.0
1.62k stars 1.09k forks source link

DataAccessResourceFailureException #4202

Closed sunnycalsoft closed 1 year ago

sunnycalsoft commented 2 years ago
{"time":"2022-10-12T04:16:41,766Z","level":"ERROR","thread":"https-jsse-nio-8080-exec-2","logger":"org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/].[dispatcherServlet]","message":"Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.dao.DataAccessResourceFailureException: Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb:27017.; nested exception is com.mongodb.MongoTimeoutException: Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb:27017.] with root cause","context":"default","exception":"com.mongodb.MongoTimeoutException: Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb:27017.
\tat com.mongodb.internal.connection.DefaultConnectionPool.createTimeoutException(DefaultConnectionPool.java:330)
\tat com.mongodb.internal.connection.DefaultConnectionPool.access$2000(DefaultConnectionPool.java:106)
\tat com.mongodb.internal.connection.DefaultConnectionPool$OpenConcurrencyLimiter.acquirePermitOrGetAvailableOpenedConnection(DefaultConnectionPool.java:1003)
\tat com.mongodb.internal.connection.DefaultConnectionPool$OpenConcurrencyLimiter.openAsyncWithConcurrencyLimit(DefaultConnectionPool.java:941)
\tat com.mongodb.internal.connection.DefaultConnectionPool.lambda$getAsync$1(DefaultConnectionPool.java:231)
\tat com.mongodb.internal.connection.DefaultConnectionPool$Task.doComplete(DefaultConnectionPool.java:1385)
\tat com.mongodb.internal.connection.DefaultConnectionPool$Task.execute(DefaultConnectionPool.java:1371)
\tat com.mongodb.internal.connection.DefaultConnectionPool$AsyncWorkManager.workerRun(DefaultConnectionPool.java:1311)
\tat com.mongodb.internal.connection.DefaultConnectionPool$AsyncWorkManager.runAndLogUncaught(DefaultConnectionPool.java:1342)
\tat com.mongodb.internal.connection.DefaultConnectionPool$AsyncWorkManager.lambda$initUnlessClosed$0(DefaultConnectionPool.java:1275)
\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
\tat java.base/java.util.concurrent.FutureTask.run(Unknown Source)
\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
\tat java.base/java.lang.Thread.run(Unknown Source)
","correlation-id":"","source-ms":"","user-id":"","user-role":"","additional-info":""}
{"time":"2022-10-12T04:16:41,768Z","level":"INFO","thread":"https-jsse-nio-8080-exec-2","logger":"com.radisys.macs.cems.common.mdclogin.LoggerInterceptor","message":"thread name for MDC Task pre handle: https-jsse-nio-8080-exec-2","context":"default","correlation-id":"","source-ms":"","user-id":"","user-role":"","additional-info":""}
org.springframework.boot spring-boot-starter-data-mongodb-reactive 2.7.0

Getting this error after application run 30-40 minutes then it restarts and application starts working

christophstrobl commented 2 years ago

This looks like a driver issue. @jyemin what do you think?

jyemin commented 2 years ago

Yes, this is driver-related, but it's not clear what the effect is on the application. The stack trace in the description is from an exception thrown from a connection pool-managed background thread and logged by the framework . @sunnycalsoft are operations also failing in your application, and if so can you post any stack traces you have from those threads?

sunnycalsoft commented 2 years ago

@jyemin yes the problem is that even health check failed for mongodb in the application and application gets restarted. We have provided 200 max thread pool and 100 min thread pool as well should be goood enough for the application.

{"time":"2022-10-12T19:11:29,701Z","level":"ERROR","thread":"SNOW-SCHEDULER2","logger":"com.radisys.macs.cems.configmgmt.topology.statemachine.scheduler.TopologyNodeStateWatcher","message":"Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb.svc.cluster.local:27017.; nested exception is com.mongodb.MongoTimeoutException: Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb.svc.cluster.local:27017.","context":"default","exception":"com.google.common.flogger.LogSiteStackTrace: SMALL\n\tat com.radisys.macs.cems.configmgmt.topology.statemachine.scheduler.TopologyNodeStateWatcher.raiseDiscoveryEventIfLockedForMaxTime(TopologyNodeStateWatcher.java:146)\n\tat jdk.internal.reflect.GeneratedMethodAccessor334.invoke(Unknown Source)\n\tat java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)\n\tat java.base/java.lang.reflect.Method.invoke(Unknown Source)\n\tat org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)\n\tat org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)\n\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)\n\tat java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)\n\tat java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\nCaused by: org.springframework.dao.DataAccessResourceFailureException: Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb.svc.cluster.local:27017.; nested exception is com.mongodb.MongoTimeoutException: Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb.svc.cluster.local:27017.\n\tat org.springframework.data.mongodb.core.MongoExceptionTranslator.translateExceptionIfPossible(MongoExceptionTranslator.java:95)\n\tat org.springframework.data.mongodb.core.ReactiveMongoTemplate.potentiallyConvertRuntimeException(ReactiveMongoTemplate.java:2954)\n\tat org.springframework.data.mongodb.core.ReactiveMongoTemplate.lambda$translateException$89(ReactiveMongoTemplate.java:2937)\n\tat reactor.core.publisher.Mono.lambda$onErrorMap$31(Mono.java:3733)\n\tat reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:94)\n\tat reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:192)\n\tat reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:259)\n\tat reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:93)\n\tat reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:192)\n\tat reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:259)\n\tat reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onError(MonoPeekTerminal.java:258)\n\tat reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201)\n\tat com.mongodb.reactivestreams.client.internal.MongoOperationPublisher.lambda$sinkToCallback$30(MongoOperationPublisher.java:545)\n\tat com.mongodb.reactivestreams.client.internal.OperationExecutorImpl.lambda$execute$2(OperationExecutorImpl.java:92)\n\tat com.mongodb.internal.operation.CountDocumentsOperation.lambda$executeAsync$1(CountDocumentsOperation.java:139)\n\tat com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:48)\n\tat com.mongodb.internal.async.function.AsyncCallbackSupplier.lambda$whenComplete$1(AsyncCallbackSupplier.java:95)\n\tat com.mongodb.internal.async.function.RetryingAsyncCallbackSupplier$RetryingCallback.onResult(RetryingAsyncCallbackSupplier.java:109)\n\tat com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:48)\n\tat com.mongodb.internal.async.function.AsyncCallbackSupplier.lambda$whenComplete$1(AsyncCallbackSupplier.java:95)\n\tat com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:48)\n\tat com.mongodb.internal.operation.OperationHelper.lambda$withAsyncSuppliedResource$7(OperationHelper.java:649)\n\tat com.mongodb.internal.connection.DefaultServer$1.onResult(DefaultServer.java:130)\n\tat com.mongodb.internal.connection.DefaultServer$1.onResult(DefaultServer.java:122)\n\tat com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:48)\n\tat com.mongodb.internal.connection.DefaultConnectionPool.lambda$getAsync$0(DefaultConnectionPool.java:200)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$OpenConcurrencyLimiter.openAsyncWithConcurrencyLimit(DefaultConnectionPool.java:944)\n\tat com.mongodb.internal.connection.DefaultConnectionPool.lambda$getAsync$1(DefaultConnectionPool.java:231)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$Task.doComplete(DefaultConnectionPool.java:1385)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$Task.execute(DefaultConnectionPool.java:1371)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$AsyncWorkManager.workerRun(DefaultConnectionPool.java:1311)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$AsyncWorkManager.runAndLogUncaught(DefaultConnectionPool.java:1342)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$AsyncWorkManager.lambda$initUnlessClosed$0(DefaultConnectionPool.java:1275)\n\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)\n\tat java.base/java.util.concurrent.FutureTask.run(Unknown Source)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)\n\tat java.base/java.lang.Thread.run(Unknown Source)\n\tSuppressed: java.lang.Exception: #block terminated with an error\n\t\tat reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)\n\t\tat reactor.core.publisher.Mono.block(Mono.java:1707)\n\t\tat com.radisys.macs.cems.configmgmt.topology.statemachine.repository.TopologyDaoRepositoryImpl.getTopologyCount(TopologyDaoRepositoryImpl.java:346)\n\t\tat com.radisys.macs.cems.service.PersistenceServiceImpl.getTopologyCount(PersistenceServiceImpl.java:387)\n\t\tat com.radisys.macs.cems.configmgmt.topology.statemachine.scheduler.TopologyNodeStateWatcher.getAllTopologyNodeCountWithDiscoveringState(TopologyNodeStateWatcher.java:159)\n\t\tat com.radisys.macs.cems.configmgmt.topology.statemachine.scheduler.TopologyNodeStateWatcher.raiseDiscoveryEventIfLockedForMaxTime(TopologyNodeStateWatcher.java:134)\n\t\tat jdk.internal.reflect.GeneratedMethodAccessor334.invoke(Unknown Source)\n\t\tat java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)\n\t\tat java.base/java.lang.reflect.Method.invoke(Unknown Source)\n\t\tat org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)\n\t\tat org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)\n\t\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)\n\t\tat java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)\n\t\tat java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)\n\t\t... 3 common frames omitted\nCaused by: com.mongodb.MongoTimeoutException: Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb.svc.cluster.local:27017.\n\tat com.mongodb.internal.connection.DefaultConnectionPool.createTimeoutException(DefaultConnectionPool.java:330)\n\tat com.mongodb.internal.connection.DefaultConnectionPool.access$2000(DefaultConnectionPool.java:106)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$OpenConcurrencyLimiter.acquirePermitOrGetAvailableOpenedConnection(DefaultConnectionPool.java:1003)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$OpenConcurrencyLimiter.openAsyncWithConcurrencyLimit(DefaultConnectionPool.java:941)\n\t... 11 common frames omitted\n","correlation-id":"da10fd6c-4a57-11ed-9fbc-8376975cee80","source-ms":"","user-id":"","user-role":"","additional-info":""}

stIncMale commented 2 years ago

@sunnycalsoft The presented com.mongodb.MongoTimeoutExceptions originating from DefaultConnectionPool don't by themselves signify a driver problem. They happen when the pool is asked for a connection, but does not have enough available non-perished connections. When this happens, the pool tries to create and establish a new connection. The former is constrained by pool's [max size](https://mongodb.github.io/mongo-java-driver/4.7/apidocs/mongodb-driver-core/com/mongodb/connection/ConnectionPoolSettings.html#getMaxSize()), the latter is constrained by pool's [max connecting limit](https://mongodb.github.io/mongo-java-driver/4.7/apidocs/mongodb-driver-core/com/mongodb/connection/ConnectionPoolSettings.html#getMaxConnecting()). The stack traces have the acquirePermitOrGetAvailableOpenedConnection frame, which tells that the activity of checking out a connection has passed the max size barrier (though it is unknown how much of the checkout timeout it spent there, maybe most of it), but failed to pass the max connecting barrier within the rest of the timeout (passing this barrier happens either because the pool allows the checkout request to establish a connection, or because a different connection was checked in the pool and was not checked out by a different request that had also been waiting on this barrier).

I can think of two situations that may cause the exception:

  1. The pool is actually working at / close to its capacity (max size), and connections stay checked for long a long time because MongoDB commands they are used for are time-consuming. You may see pool's checkedOutCount, size, maxSize via JMX as described here. If the pool actually operates at / close to its capacity, you may try to increase the max size.
  2. The driver clears the pool at some point, and when the driver decides to use the pool again, not all checkout requests that flood the pool can be satisfied in a timely manner given the configured max connecting limit. You may try and increase this limit.

Hopefully the above will help. If it won't, and your company has support from MongoDB, then, of course, you may want to create a support case.

mp911de commented 1 year ago

This ticket relates to MongoDB driver internals and nothing that we can handle from the Spring Data side.