hbarcelos closed this issue 6 years ago.
Have you tried removing new relic? New Relic tends to do some very ugly monkey-patching and in my experience the first step to debugging issues like this is to remove NR.
We experienced a very similar situation after updating to Mongoose 4.12.3. We only see this issue in production as well. It is sporadic and not reproducible by any reliable method.
Versions where the error is happening:
mongodb: "2.2.33"
mongoose: "4.12.3"
Same error/stack trace as @hbarcelos
MongoError: no connection available
at nextFunction (/var/www/app/node_modules/mongodb-core/lib/cursor.js:540:25)
at Cursor.next [as _next] (/var/www/app/node_modules/mongodb-core/lib/cursor.js:699:3)
at nextObject (/var/www/app/node_modules/mongodb/lib/cursor.js:680:8)
at Cursor.next (/var/www/app/node_modules/mongodb/lib/cursor.js:270:12)
at QueryStream.__next (/var/www/app/node_modules/mongoose/lib/querystream.js:185:17)
at QueryStream._next (/var/www/app/node_modules/mongoose/lib/querystream.js:165:15)
at /var/www/app/node_modules/mongoose/lib/querystream.js:137:11
at handleCallback (/var/www/app/node_modules/mongodb/lib/utils.js:120:56)
at Collection.find (/var/www/app/node_modules/mongodb/lib/collection.js:384:44)
at NativeCollection.(anonymous function) [as find] (/var/www/app/node_modules/mongoose/lib/drivers/node-mongodb-native/collection.js:131:28)
at QueryStream._init (/var/www/app/node_modules/mongoose/lib/querystream.js:132:20)
at /var/www/app/node_modules/mongoose/lib/querystream.js:74:11
at _combinedTickCallback (internal/process/next_tick.js:73:7)
at process._tickDomainCallback (internal/process/next_tick.js:128:9)
Not using NewRelic here.
@vkarpov15 this happens in an application without NewRelic as well.
Also, we have a limit of 6k connections to our production Mongo deployment and, since we don't change the default pool size of 5, those 200 servers create up to ~1k connections (200 × 5 = 1000).
We receive alerts if the connection count increases beyond 3k, but we haven't received any of those within the past few weeks, so hitting Mongo limits doesn't seem to be the issue.
Is it possible that this message is raised when all 5 connections in the pool are busy?
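For context, the pool size in question is configurable. A minimal sketch of the arithmetic above, assuming mongoose 4.x's nested `server` options (the numbers are illustrative figures from this thread, not a recommendation, and the exact option nesting should be checked against your mongoose version):

```javascript
// Illustrative numbers only: ~200 app servers and the default pool size
// of 5 mentioned in the report above.
const serverCount = 200;
const defaultPoolSize = 5;

// mongoose 4.x passes driver-level pool settings under `server`;
// treat the exact nesting as an assumption for your version.
const connectionOptions = {
  server: {
    poolSize: defaultPoolSize, // raise this if all pooled sockets are often busy
  },
};

// Each app server keeps its own pool, so the fleet-wide connection count is:
const totalConnections = serverCount * connectionOptions.server.poolSize;
console.log(totalConnections); // 1000
```

This is why the thread's ~1k figure is well under the 6k limit: the count scales with servers × pool size, not with request volume.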
Not sure, I'll have to investigate. Just to be 100% clear, are you disabling the autoReconnect option? I don't see it in your connection options, but that might be one possible explanation.
I'm not setting any values for this, so it's using the default value.
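One way to rule this option out while debugging is to set it explicitly instead of relying on the default. A sketch, using key names in the mongoose 4.x / mongodb driver 2.x convention (treat the exact spellings and values as assumptions to verify against the driver docs):

```javascript
// Sketch: make reconnect behavior explicit instead of relying on defaults.
// Key names follow the mongoose 4.x / mongodb driver 2.x convention;
// treat exact spellings as assumptions.
const debugOptions = {
  server: {
    autoReconnect: true,     // the default; the error should only fire when false
    reconnectTries: 30,      // illustrative values, not the reporter's settings
    reconnectInterval: 1000, // ms between reconnect attempts
  },
};

// In a real app: mongoose.connect(uri, debugOptions). No connection is made here.
console.log(debugOptions.server.autoReconnect); // true
```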
We're sporadically seeing the same error / stack trace
MongoDB: 3.4.10, Mongoose: 4.12.3
Stack: MongoError: no connection available
at nextFunction (/var/task/node_modules/mongodb-core/lib/cursor.js:540:25)
at Cursor.next [as _next] (/var/task/node_modules/mongodb-core/lib/cursor.js:699:3)
at nextObject (/var/task/node_modules/mongodb/lib/cursor.js:680:8)
at Cursor.next (/var/task/node_modules/mongodb/lib/cursor.js:270:12)
at findOne (/var/task/node_modules/mongodb/lib/collection.js:1414:10)
at Collection.findOne (/var/task/node_modules/mongodb/lib/collection.js:1400:44)
at NativeCollection.(anonymous function) [as findOne] (/var/task/node_modules/mongoose/lib/drivers/node-mongodb-native/collection.js:131:28)
at NodeCollection.findOne (/var/task/node_modules/mquery/lib/collection/node.js:38:19)
at Query.findOne (/var/task/node_modules/mquery/lib/mquery.js:1816:20)
at Query._findOne (/var/task/node_modules/mongoose/lib/query.js:1425:22)
at /var/task/node_modules/kareem/index.js:250:8
at /var/task/node_modules/kareem/index.js:23:7
at _combinedTickCallback (internal/process/next_tick.js:73:7)
at process._tickDomainCallback (internal/process/next_tick.js:128:9)
Our MongoDB is hosted at https://cloud.mongodb.com as a 3-node replica set, and its logs report at most around 30 open connections at any one time. The application running mongoose uses the Serverless framework and runs on AWS Lambda. Not using New Relic.
We're unable to reliably reproduce, and the service is relatively low traffic, so it's difficult to determine if the problem is a coding error on our side or a problem with the library. In the short term we're going to downgrade to 4.11.14 and see if the error continues or not.
@wtfleming can you show me your connection logic, with urls, passwords, etc. X-ed out? Your stack trace makes it seem like you're turning autoReconnect off; the error you're hitting should only be triggered if autoReconnect is disabled...
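For anyone sharing connection logic in a report like this, a hypothetical, fully redacted sketch of the kind of snippet being requested (the URI, credentials, and option values are placeholders, not anyone's real settings):

```javascript
// Hypothetical, redacted connection snippet for a bug report.
// Everything sensitive is X-ed out; structure follows mongoose 4.x usage.
const uri = 'mongodb://XXXX:XXXX@XXXX:27017,XXXX:27017/XXXX?replicaSet=XXXX';

const options = {
  server: { poolSize: 5 }, // placeholder; also note whether autoReconnect is set anywhere
};

// A real report would follow with: mongoose.connect(uri, options);
console.log(uri.startsWith('mongodb://')); // true
```

Including the full options object (even redacted) matters here, since the diagnosis hinges on whether autoReconnect appears in it at all.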
I think I managed to track this down to an issue in the mongodb driver :point_up: , will have to wait for that to get merged before I can bump this fix into mongoose.
@vkarpov15 I've tested your fix for mongodb-core and now the connection is stable. Thanks a lot. Just waiting for the new release.
Fixed in the 4.x branch. Thanks for your patience, the mongodb node driver team was swamped trying to release v3 of the driver to support mongodb 3.6, but they got a new release out. Will ship 4.13.9 shortly :+1:
@vkarpov15 The fix is only present in mongodb-core 2.x. The new 3.x releases miss the fix :(
More than a year later, but I am also getting a similar issue in production, using MongoDB managed by the MongoDB Atlas cloud provider. The mongoose version is 5.2.2. Here is the stack trace, in case it helps:
MongoError: no connection available for operation and number of stored operation > 0
at Function.create (/var/www/my-production-app/node_modules/mongodb-core/lib/error.js:43:12)
at Store.add (/var/www/my-production-app/node_modules/mongodb/lib/topologies/topology_base.js:38:18)
at executeWriteOperation (/var/www/my-production-app/node_modules/mongodb-core/lib/topologies/replset.js:1147:39)
at ReplSet.insert (/var/www/my-production-app/node_modules/mongodb-core/lib/topologies/replset.js:1201:3)
at ReplSet.insert (/var/www/my-production-app/node_modules/mongodb/lib/topologies/topology_base.js:321:25)
at insertDocuments (/var/www/my-production-app/node_modules/mongodb/lib/operations/collection_ops.js:853:19)
at insertOne (/var/www/my-production-app/node_modules/mongodb/lib/operations/collection_ops.js:883:3)
at executeOperation (/var/www/my-production-app/node_modules/mongodb/lib/utils.js:416:24)
at Collection.insertOne (/var/www/my-production-app/node_modules/mongodb/lib/collection.js:463:10)
at NativeCollection.(anonymous function) [as insertOne] (/var/www/my-production-app/node_modules/mongoose/lib/drivers/node-mongodb-native/collection.js:158:28)
at MyMongooseSchmeaModel.Model.$__handleSave (/var/www/my-production-app/node_modules/mongoose/lib/model.js:266:33)
at MyMongooseSchmeaModel.Model.$__save (/var/www/my-production-app/node_modules/mongoose/lib/model.js:329:8)
at /var/www/my-production-app/node_modules/kareem/index.js:278:20
at _next (/var/www/my-production-app/node_modules/kareem/index.js:102:16)
at process.nextTick (/var/www/my-production-app/node_modules/kareem/index.js:499:38)
at _combinedTickCallback (internal/process/next_tick.js:132:7)
at process._tickCallback (internal/process/next_tick.js:181:9)
@mdemblani please open up a separate issue with code samples, this stack trace isn't enough to be actionable
Here are the options we pass down to mongoose:

The issue happens mostly in production, where ~200 servers connect to a single replica-set MongoDB deployment hosted by MongoDB itself, within the same AWS region as our servers.

The versions in which we observed the behavior:
3.4.10
4.12.3

Then we downgraded to mongoose@4.11.x, which installed version 4.11.14 in our applications. Doing so made the errors stop.

Looking through the logs led us to stack traces like the following:

I'm sorry I can't give you any further information. Because this issue is intermittent, it's hard for us to catch the exact moment it happens in order to debug it.