yougov / mongo-connector

MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)
Apache License 2.0

`ServerSelectionTimeoutError` on Remote Replica Set #390

Closed letsgolesco closed 8 years ago

letsgolesco commented 8 years ago

I'm trying to connect from a remote MongoDB instance to a local Elasticsearch instance. The MongoDB instance is a replica set behind a proxy, and I'm connecting with SSL. I've tried tweaking things with varying degrees of success; the closest I've gotten is this error:

ServerSelectionTimeoutError: <ip_address_1>:27017: timed out,<ip_address_2>:27017: timed out,<ip_address_3>:27017: timed out

The 3 IP addresses it lists are correct for the replica set members, which indicates to me that it's actually connecting/authenticating properly, but some issue occurs afterward.

The command I'm using looks like this:

mongo-connector -c mongo-connector-config.json -m mongodb://<ip_address>:<port>/?ssl=true -t 192.168.99.100:9200 -d elastic_doc_manager --ssl-ca-certs my-cert.pem --ssl-certificate-policy optional

The only things I'm specifying in my config are noDump: true and a few namespaces.

behackett commented 8 years ago

I see you are using --ssl-certificate-policy optional. Is your replica set actually running with TLS enabled? If not, can you connect to it without the ssl options? I'd like to know if this is an SSL/TLS problem, or a problem connecting through your proxy.

letsgolesco commented 8 years ago

We're running MongoDB hosted on Compose. Our instance has SSL enabled, but we only have access to a public key.

From the terminal, we can connect using the --ssl option and either --sslCAFile <path_to_public_key> or --sslAllowInvalidCertificates. We'd prefer the former to prevent man-in-the-middle attacks.

behackett commented 8 years ago

I'm not sure I understand why you would need --ssl-ca-certs or --ssl-certificate-policy. If the Compose instance you are connecting to uses TLS, I would assume the server's cert is signed by a well-known certificate authority. If you are using PyMongo 3.0+, ssl=true should be all you need. If this is not the case, you may want to ask Compose for help.

letsgolesco commented 8 years ago

We're using mongo-connector 2.2 which is using PyMongo 3.2.1 right now. Without --ssl-ca-certs I get this error: ServerSelectionTimeoutError: SSL handshake failed: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590).

Compose doesn't provide a private key for SSL, unfortunately; all we have to work with is a public key. Using Mongo's command line or native Node driver, we pass this public key into an option called sslCAFile or sslCA and it works fine, so I assumed PyMongo would be similar.

behackett commented 8 years ago

You can configure all of these options using the MongoDB URI, not just ssl=true. See the docs here:

https://api.mongodb.org/python/current/examples/tls.html
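
As a rough illustration of that suggestion, the TLS settings can be folded into the connection string itself. The sketch below only builds the URI: `build_tls_uri` is a hypothetical helper, the host and certificate path are placeholders, and the option names `ssl` and `ssl_ca_certs` are the PyMongo 3.x spellings used in this thread (newer PyMongo releases spell them `tls` and `tlsCAFile`).

```python
from urllib.parse import urlencode


def build_tls_uri(host, port, ca_file):
    """Return a MongoDB URI that carries the TLS options as query parameters.

    Hypothetical helper for illustration; option names follow PyMongo 3.x.
    """
    options = {
        "ssl": "true",            # enable TLS
        "ssl_ca_certs": ca_file,  # CA bundle used to verify the server certificate
    }
    # safe="/" keeps the certificate path readable instead of percent-encoding it.
    return "mongodb://{}:{}/?{}".format(host, port, urlencode(options, safe="/"))


uri = build_tls_uri("example.com", 27017, "/path/to/ca.pem")
print(uri)
```

The resulting string could then be passed to `-m` or to `MongoClient(uri)`. Whether mongo-connector 2.2 forwards every URI option unchanged is not confirmed in this thread.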

letsgolesco commented 8 years ago

The second-last option on that page is pretty much what I was trying to accomplish with my --ssl-ca-certs option:

client = pymongo.MongoClient('example.com',
                             ssl=True,
                             ssl_ca_certs='/path/to/ca.pem')

The equivalent works via command line, so I'm not sure what's going on in between the CLI and the actual MongoClient call. I'll try to debug and see.

letsgolesco commented 8 years ago

It's suggested in this Stack Overflow thread that passing connect=False to MongoClient could get around this ServerSelectionTimeoutError.

Does mongo-connector support passing this connect argument to PyMongo's MongoClient?

behackett commented 8 years ago

That Stack Overflow question is irrelevant to this discussion. mongo-connector doesn't fork any child processes. Your problem appears to be related to TLS.

letsgolesco commented 8 years ago

Using PyMongo's MongoClient alone, I can connect to the database like so:

from pymongo import MongoClient
host = 'mongodb://<user>:<password>@<hostname>:<port>/<database>'
cert = '/path/to/cert'
MongoClient(host, ssl_ca_certs=cert)

Any tips on replicating this with mongo-connector?

letsgolesco commented 8 years ago

The URI we use to connect to the MongoDB server is actually for a mongos/HAProxy router. It looks like, upon connection, mongo-connector detects the replica set behind the mongos router and tries to connect directly to those hosts (which Compose doesn't allow). This may be the core issue at hand.

behackett commented 8 years ago

That explains the problem. You won't be able to use mongo-connector. It has to be able to connect to the shards directly, since that's where the oplogs live.

letsgolesco commented 8 years ago

I can access the oplog while connected to the Mongos router using pymongo's MongoClient. My use case is technically possible, just not implemented in the connector right now.

llvtt commented 8 years ago

@behackett is correct. mongo-connector needs to be able to connect to each of the shards directly in order to read the local oplog on each replica set. If connecting to the shards directly isn't possible, then mongo-connector cannot function.

letsgolesco commented 8 years ago

I guess I'm requesting a feature then

aherlihy commented 8 years ago

Hello!

Unfortunately the feature you are requesting is not possible. Mongo-connector cannot run without access to the oplog of each shard, and it is not possible to access the local database of shards through the mongos.

How exactly have you been able to access the oplog while connected to the mongos router using PyMongo's MongoClient?

letsgolesco commented 8 years ago

@aherlihy my script looks like this (sensitive info redacted):

import ssl
from pymongo import MongoClient
host = 'mongodb://oploguser:password@host-name.com:12345/local?authSource=admin'
cert = '/path/to/cert.pem'
client = MongoClient(host, ssl_ca_certs=cert)
client.admin.command('isdbgrid') # confirms we are connected to a mongos, see: https://docs.mongodb.org/manual/reference/command/isdbgrid/
print client['local']['oplog.rs'].find()
# Proceeds to print documents from the oplog collection

behackett commented 8 years ago

What version of MongoDB are you using? This is what I get from MongoDB 3.0:

pymongo.errors.OperationFailure: database error: can't use 'local' database through mongos

It is not possible to access the oplogs of the shards through mongos.

letsgolesco commented 8 years ago

I'm using Mongo 3.2 with no issues

aherlihy commented 8 years ago

Using MongoDB 3.2.4.

>>> print client['local']['oplog.rs'].find()
<pymongo.cursor.Cursor object at 0x1009d11d0>
>>> print list(client['local']['oplog.rs'].find())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/pymongo/cursor.py", line 983, in next
    if len(self.__data) or self._refresh():
  File "/Library/Python/2.7/site-packages/pymongo/cursor.py", line 908, in _refresh
    self.__read_preference))
  File "/Library/Python/2.7/site-packages/pymongo/cursor.py", line 839, in __send_message
    codec_options=self.__codec_options)
  File "/Library/Python/2.7/site-packages/pymongo/helpers.py", line 122, in _unpack_response
    error_object)
pymongo.errors.OperationFailure: database error: can't use 'local' database through mongos

adityasharmacs commented 7 years ago

I am facing the following issue:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "c:\users\abc\appdata\local\programs\python\python36\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "c:\users\abc\appdata\local\programs\python\python36\lib\site-packages\mongo_connector\util.py", line 104, in wrapped
    func(*args, **kwargs)
  File "c:\users\abc\appdata\local\programs\python\python36\lib\site-packages\mongo_connector\connector.py", line 347, in run
    self.main_conn.admin.command('buildInfo')['version'])
  File "c:\users\abc\appdata\local\programs\python\python36\lib\site-packages\pymongo\database.py", line 491, in command
    with client._socket_for_reads(read_preference) as (sock_info, slave_ok):
  File "c:\users\abc\appdata\local\programs\python\python36\lib\contextlib.py", line 82, in __enter__
    return next(self.gen)
  File "c:\users\abc\appdata\local\programs\python\python36\lib\site-packages\pymongo\mongo_client.py", line 859, in _socket_for_reads
    with self._get_socket(read_preference) as sock_info:
  File "c:\users\abc\appdata\local\programs\python\python36\lib\contextlib.py", line 82, in __enter__
    return next(self.gen)
  File "c:\users\abc\appdata\local\programs\python\python36\lib\site-packages\pymongo\mongo_client.py", line 823, in _get_socket
    server = self._get_topology().select_server(selector)
  File "c:\users\abc\appdata\local\programs\python\python36\lib\site-packages\pymongo\topology.py", line 214, in select_server
    address))
  File "c:\users\abc\appdata\local\programs\python\python36\lib\site-packages\pymongo\topology.py", line 189, in select_servers
    self._error_message(selector))
pymongo.errors.ServerSelectionTimeoutError: localhost:27017: timed out

I tried using "netstat -a" to figure out which ports might be failing to communicate, but without success.

Any help for resolving the issue is highly appreciated.
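
One way to separate a network problem from a driver problem is to probe the port directly. This is a minimal sketch, assuming mongod is expected on localhost:27017 as in the traceback above; `can_connect` is a hypothetical helper and not part of mongo-connector or PyMongo.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("localhost:27017 reachable:", can_connect("localhost", 27017))
```

A result of True only shows that something accepts TCP connections on that port; it does not prove that the listener is MongoDB or that authentication will succeed.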

pratyus commented 7 years ago

Is there now a suggested workaround or solution for @letsgolesco's question about mongo-connector timing out when it connects to the replica set hosts directly?

ShaneHarvey commented 7 years ago

The original ticket is asking for mongo-connector to support Compose's Mongos oplog proxy. This proxy looks like a mongos but only proxies a single replica set and is specific to Compose only. It might be possible to support it by checking if the "local.oplog.rs" collection is readable through the mongos connection. If we get pymongo.errors.OperationFailure: database error: can't use 'local' database through mongos then we're connected to a real mongos, otherwise we're connected to Compose's proxy.
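
The check described above could look roughly like the following. This is a sketch under stated assumptions, not the code in mongo-connector or in any pull request: `oplog_readable` is a hypothetical name, and the fallback exception class exists only so the snippet can be exercised without PyMongo installed.

```python
try:
    from pymongo.errors import OperationFailure
except ImportError:
    # Stand-in so the sketch can be read and exercised without PyMongo installed.
    class OperationFailure(Exception):
        pass


def oplog_readable(client):
    """Return True if local.oplog.rs can be read through this connection.

    A real mongos is expected to reject the read with OperationFailure;
    a proxy that exposes the oplog is expected to return normally.
    """
    try:
        # find_one() forces a round trip; find() alone only builds a lazy cursor.
        client["local"]["oplog.rs"].find_one()
    except OperationFailure:
        return False
    return True
```

Using `find_one()` rather than `find()` matters here: as the earlier traceback in this thread shows, `find()` returns a cursor without contacting the server, so the error only appears once the cursor is iterated.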

There might be other changes required too. For example, can mongo-connector read all collections through the proxy?

@pratyus you are welcome to look into these issues and provide a pull request. The "local.oplog.rs" check shouldn't be hard to implement. See https://github.com/mongodb-labs/mongo-connector/blob/master/mongo_connector/connector.py#L362-L365.

pratyus commented 7 years ago

Thanks @ShaneHarvey. I was able to verify read access for "local.oplog.rs". I also verified mongo-connector against my Compose deployment, and it works like a charm.

I have sent #751 to enable this. Let me know what you think or if I am missing anything. I'm happy to iterate and add appropriate tests.

pratyus commented 7 years ago

Friendly ping to take a look at the PR. Thanks

benlavalley commented 5 years ago

@pratyus Nice job with the fix. I implemented it for my app's architecture and it works fine.