mongo-dart / mongo_dart

Mongo_dart: MongoDB driver for Dart programming language
https://pub.dev/packages/mongo_dart
MIT License

misleading error message when connection reset due to atlas free + shared cluster limitations #228

Open emragins opened 3 years ago

emragins commented 3 years ago

When watching a MongoDB Atlas deployment, the free and shared clusters (M0, M2, M5) do not accept connections that never time out. This is documented on the Atlas limitations page: https://docs.atlas.mongodb.com/reference/free-shared-limitations/#operational-limitations (scroll down to "Cursors").

The way this manifests in mongo_dart is: 1) a connection is established correctly; 2) when setting up a watch, the application receives changes correctly; 3) after a period of time (likely 60 seconds, though I haven't timed it), the connection dies and the log shows the noSecureRequestError from https://github.com/mongo-dart/mongo_dart/blob/main/lib/src/network/connection.dart.

ConnectionException (MongoDB ConnectionException: connection closed: The socket connection has been reset by peer.
Possible causes:
- Trying to connect to an ssl/tls encrypted database without specifiyng
  either the query parm tls=true or the secure=true parameter in db.open()
- The server requires a key certificate from the client, but no certificate has been sent
- Others)

This message is misleading because the connection was established correctly and was only later terminated by the host for other reasons. I wasn't able to figure out what sort of feedback is available -- it doesn't appear as if the socket treats it as an error condition; it just finishes the stream. When this happens, the onDone code gets executed, which results in the above error message (https://github.com/mongo-dart/mongo_dart/blob/a6825e72cd37e551fc49fe0360e74378b286d6b7/lib/src/network/connection.dart#L167).
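To illustrate why onDone alone cannot say what went wrong, here is a minimal sketch using only dart:async. The StreamController is a stand-in for the socket (an assumption for illustration, not the driver's actual code), and observeClose is a hypothetical name. It shows that a clean close by the other side reaches the listener as onDone with no error object or reason attached.

```dart
import 'dart:async';

/// Records which callbacks fire when a stream is closed without an error,
/// the way a peer-closed socket stream appears to behave in this report.
Future<List<String>> observeClose() async {
  final events = <String>[];
  // Stand-in for the socket stream; not the driver's real connection object.
  final controller = StreamController<List<int>>();
  final finished = Completer<void>();

  controller.stream.listen(
    (data) => events.add('data'),
    onError: (Object error) => events.add('error'),
    onDone: () {
      // onDone takes no arguments, so no cause is available here.
      events.add('done');
      finished.complete();
    },
  );

  controller.add([1, 2, 3]); // one normal message
  await controller.close(); // the remote side hangs up cleanly
  await finished.future;
  return events;
}
```

In this sketch the error callback never runs, so any message produced from onDone can only guess at the cause.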

I'm not sure what the best solution here actually is. One thought -- likely the easiest -- is to just add a line to the error message saying that an Atlas M0/M2/M5 tier cluster will terminate the connection. I don't know whether it's possible to put something in serverCapabilities for better insight or hints into why it's failing.

For my use-case, I'm either going to end up upgrading the cluster tier or just re-connect on failure. Maybe a combination.
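As a rough sketch of the "re-connect on failure" idea for a watch, the helper below re-opens a stream each time it ends. It uses only dart:async; resubscribing and openStream are hypothetical names, and with mongo_dart the factory would wrap whatever call sets up the watch (after re-opening the connection if needed). Changes that arrive between a close and the next re-open are not recovered by this sketch.

```dart
import 'dart:async';

/// Re-opens a stream each time it ends, up to [maxRestarts] extra times.
/// [openStream] is a hypothetical factory supplied by the caller.
Stream<T> resubscribing<T>(
  Stream<T> Function() openStream, {
  int maxRestarts = 5,
  Duration pause = const Duration(seconds: 1),
}) async* {
  for (var attempt = 0; attempt <= maxRestarts; attempt++) {
    try {
      // Forward every event until the underlying stream finishes.
      await for (final event in openStream()) {
        yield event;
      }
    } catch (_) {
      // An error is treated like a silent close: fall through and retry.
    }
    if (attempt < maxRestarts) {
      await Future<void>.delayed(pause);
    }
  }
}
```

The restart limit and pause are arbitrary defaults; a real application would probably want backoff and logging as well.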

As/if I learn more in my future experiments around this, I'll update the issue.

PS Thanks for this library!

giorgiofran commented 3 years ago

Thanks for your feedback. The onDone: callback is called in some strange cases. Both of the cases listed in the message happened to me, which is why I recorded them. I saw that the callback gets executed when, after a successful basic connection, some parameter is not correct, as in the case of a missing --tls flag. Let me know if you discover more details about this issue. In general, it is not clear to me why, in this case, the entire connection is closed and not simply the cursor. When I have some time I'll try to reproduce it.

alexobviously commented 3 years ago

Hey guys, I just want to add that I've been having this error a lot too, and until I found this issue I also had no idea why it was (tbh I'm still not totally sure it's this). I have an application that works fine when I'm testing it alone, but when ~5 people start using it, I get this error and my mongo connections start getting refused.

Is there any workaround? Also, will upgrading my Atlas plan to dedicated sort out this problem, if this is what I'm experiencing?

marcellocamara commented 1 year ago

@alexobviously

Is there any workaround? Also, will upgrading my Atlas plan to dedicated sort out this problem, if this is what I'm experiencing?

Did you find any workaround? I'm facing this issue too. My backend had been working for the past 10 days, and now it is refusing connections.

CoocooFroggy commented 1 year ago

I am not even sure if I am having the same problem—for me, it seems that using Db#findOne causes this error after exactly 20 reads. My workaround was to instead use Db#find(...).toList().first.

I also added an await _ensureConnection(); at the start of all my Mongo methods:

/// Run this before every database attempt.
static Future<void> _ensureConnection() async {
  if (!_db.isConnected) {
    print('MongoDB disconnected—reconnecting...');
    await _db.close();
    await _db.open();
    print('MongoDB reconnected');
  }
}

alexobviously commented 1 year ago

Well this was a while ago but I started doing something like this:

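/// Await this before every database call.
Future<void> get connected async {
  // Wait for any in-progress open to finish.
  while (db.state == State.OPENING) {
    await Future.delayed(Duration(milliseconds: 100));
  }
  if (db.isConnected) return;
  // Not connected: close and reopen.
  await db.close();
  await db.open();
}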

And await connected; before every call. But to be honest, it was long enough ago that I don't actually remember if it totally worked. In the end I switched to DigitalOcean's dev database and didn't have any issues there.
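One way to avoid forgetting the check before a call is to route every operation through a small wrapper. The sketch below is only an illustration: withConnection and its three parameters are hypothetical names, and the callbacks are expected to be built from the same db.isConnected, db.close() and db.open() calls used in the snippets in this thread.

```dart
import 'dart:async';

/// Ensures a connection before running [action].
/// All three callbacks are supplied by the caller, for example:
///   isConnected: () => db.isConnected
///   reconnect:   () async { await db.close(); await db.open(); }
Future<T> withConnection<T>({
  required bool Function() isConnected,
  required Future<void> Function() reconnect,
  required Future<T> Function() action,
}) async {
  if (!isConnected()) {
    await reconnect();
  }
  return action();
}
```

Like the snippets above, this only checks before the call; it does not help if the server drops the connection while the operation is in flight.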

61soldiers commented 7 months ago

Does anyone know if the serverless tier in MongoDB Cloud resolves this issue?