If the Elasticsearch service is not running when mongo-connector fetches new data from MongoDB, then:
A warning is printed to mongo-connector.log:
2018-02-01 09:03:25,524 [WARNING] elasticsearch:97 - GET http://localhost:9200/_mget?realtime=true [status:N/A request:0.000s]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_urllib3.py", line 147, in perform_request
    response = self.pool.urlopen(method, url, body, retries=False, headers=self.headers, **kw)
  File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python2.7/dist-packages/urllib3/util/retry.py", line 333, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python2.7/httplib.py", line 1057, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1097, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1053, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 897, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 859, in send
    self.connect()
  File "/usr/local/lib/python2.7/dist-packages/urllib3/connection.py", line 166, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python2.7/dist-packages/urllib3/connection.py", line 150, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f19cc4b1f10>: Failed to establish a new connection: [Errno 111] Connection refused
mongo-connector neither exits nor retries the connection to Elasticsearch later; it keeps running in a broken state. Effectively, replication stops until mongo-connector is restarted.
A new timestamp is written to oplog.timestamp as if the data had been replicated successfully.
After mongo-connector is restarted (once the Elasticsearch service is back online), it skips the data it previously failed to write, because oplog.timestamp already points to a newer timestamp.
The desired behavior would be either to exit mongo-connector without updating oplog.timestamp or to retry the connection to Elasticsearch at predefined intervals.
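To make the desired behavior concrete, here is a minimal sketch of the idea. It is not mongo-connector's actual code. All names (`write_with_retry`, `replicate`, `TargetUnavailable`, `save_checkpoint`) are hypothetical, and it is written for Python 3, where `ConnectionError` is a built-in. The checkpoint, standing in for oplog.timestamp, is advanced only after a write succeeds. If the target stays down after all retries, an exception propagates so that the process can exit without touching the checkpoint.

```python
import time


class TargetUnavailable(Exception):
    """Raised when the target stays unreachable after all retries (hypothetical)."""


def write_with_retry(write, doc, retries=5, delay=1.0, sleep=time.sleep):
    """Call write(doc), retrying on ConnectionError with a fixed delay."""
    for attempt in range(1, retries + 1):
        try:
            return write(doc)
        except ConnectionError:
            if attempt == retries:
                raise TargetUnavailable("gave up after %d attempts" % retries)
            sleep(delay)


def replicate(entries, write, save_checkpoint, **retry_opts):
    """Replicate (timestamp, doc) pairs in order.

    The checkpoint is saved only after the corresponding write succeeded,
    so a failed write can never be skipped after a restart.
    """
    for ts, doc in entries:
        write_with_retry(write, doc, **retry_opts)  # raises if the target stays down
        save_checkpoint(ts)
```

In a real integration the exception to catch would presumably be the Elasticsearch client's own connection error type rather than the built-in `ConnectionError`; that detail is an assumption and is not verified against mongo-connector here.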
This problem may appear, for example, when both Elasticsearch and mongo-connector are started during system startup and Elasticsearch is not yet ready to accept connections when mongo-connector has already fetched some data from MongoDB. As a workaround, it is possible to add a delay to mongo-connector startup, but that does not help when the Elasticsearch service is restarted while mongo-connector is running.
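Instead of a fixed startup delay, a wrapper script could wait until the Elasticsearch port accepts TCP connections before launching mongo-connector. The following is a rough sketch of that workaround in Python 3. The function name `wait_for_port` and its parameters are hypothetical, and an open port does not guarantee that Elasticsearch is fully ready to serve requests.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the setup in the log above this would be called as `wait_for_port("localhost", 9200)`. Like the fixed delay, it covers only the startup race, not an Elasticsearch restart while mongo-connector is already running.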
I was wondering the same thing about how an Elasticsearch failure is handled. I would expect some kind of parameter that lets mongo-connector resync the data after a couple of retries.
Software used:
Python (2.7.12): mongo-connector (2.5.1), pymongo (3.6.0), elasticsearch (6.0.0), elastic2-doc-manager (0.3.0)
Elasticsearch 6.0.0, MongoDB 3.4.10, Ubuntu 16.04
config.json: