Open JPacks opened 7 years ago
Looks like you are hitting a ReadTimeoutError against Elasticsearch. Try increasing the timeout using a config file such as:
{
    "mainAddress": "localhost:27017",
    "verbosity": 3,
    "namespaces": {
        "include": ["a.check"]
    },
    "docManagers": [
        {
            "docManager": "elastic_doc_manager",
            "targetURL": "localhost:9200",
            "autoCommitInterval": 0,
            "args": {
                "clientOptions": {"timeout": 30}
            }
        }
    ]
}
You can also use the continueOnError option to force mongo-connector to log and ignore errors during the collection dump.
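If you go that route, continueOnError is a top-level key in the config file. A minimal sketch (the key name comes from the option mentioned above; the other values are illustrative):

```json
{
    "mainAddress": "localhost:27017",
    "continueOnError": true,
    "docManagers": [
        {
            "docManager": "elastic_doc_manager",
            "targetURL": "localhost:9200"
        }
    ]
}
```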
I'm also suddenly running into this error when doing a resync; it worked for a long time before.
2017-01-19 12:43:52,690 [CRITICAL] mongo_connector.oplog_manager:666 - Exception during collection dump
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/oplog_manager.py", line 621, in do_dump
upsert_all(dm)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/oplog_manager.py", line 607, in upsert_all
mapped_ns, long_ts)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/util.py", line 44, in wrapped
reraise(new_type, exc_value, exc_tb)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/util.py", line 33, in wrapped
return f(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/elastic2_doc_manager.py", line 367, in bulk_upsert
for ok, resp in responses:
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 162, in streaming_bulk
for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 91, in _process_bulk_chunk
raise e
ConnectionFailed: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host=u'localhost', port=9200): Read timed out. (read timeout=60))
2017-01-19 12:43:52,703 [ERROR] mongo_connector.oplog_manager:674 - OplogThread: Failed during dump collection cannot recover! Collection(Database(MongoClient(host=[u'localhost:27017'], document_class=dict, tz_aware=False, connect=True, replicaset=u'singleNodeRepl'), u'local'), u'oplog.rs')
2017-01-19 12:43:53,241 [ERROR] __main__:357 - MongoConnector: OplogThread <OplogThread(Thread-3, started 140353541756672)> unexpectedly stopped! Shutting down
I'm using mongo-connector 2.5.0, pymongo 3.4.0, MongoDB 3.2.10, and elastic2_doc_manager 0.3.0. With this setup I'm storing more than 100M documents.
I already raised the timeout to 60 seconds, as you can see in the log.
Previously, the following error had already appeared, which is why I had to start the resync:
2017-01-19 08:58:36,553 [ERROR] mongo_connector.doc_managers.elastic2_doc_manager:412 - Exception while commiting to Elasticsearch
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/elastic2_doc_manager.py", line 406, in commit
successes, errors = bulk(self.elastic, action_buffer)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 190, in bulk
for ok, item in streaming_bulk(client, actions, **kwargs):
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 162, in streaming_bulk
for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 91, in _process_bulk_chunk
raise e
ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host=u'localhost', port=9200): Read timed out. (read timeout=10))
I don't know whether this is related to the newest error. Should I just set continueOnError? When that option is set, are documents ignored (i.e. not synced) whenever an error occurs?
With continueOnError, documents that fail to sync during the collection dump will be ignored. The general problem is that the Elasticsearch doc managers do not retry on connection/operation failures; see https://github.com/mongodb-labs/elastic2-doc-manager/issues/18.
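Until retry support lands in the doc manager, one workaround is a caller-side retry loop with exponential backoff around the bulk operation. This is only a sketch: `with_retries` and `flaky_bulk` are illustrative names, not part of mongo-connector or elasticsearch-py, and in practice the callable would wrap the real bulk-indexing call and catch the client's timeout exception.

```python
import time

def with_retries(operation, max_attempts=5, base_delay=1.0,
                 retriable=(ConnectionError,)):
    """Retry a zero-argument callable with exponential backoff.

    Re-raises the last error once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except retriable:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Simulated bulk call that times out twice, then succeeds.
calls = {"n": 0}

def flaky_bulk():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated read timeout")
    return "ok"

result = with_retries(flaky_bulk, base_delay=0.01)  # -> "ok" after 3 calls
```

The backoff (1s, 2s, 4s, ...) gives a briefly overloaded Elasticsearch node time to drain its bulk queue instead of being hit again immediately.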
For now, I can only recommend increasing the Elasticsearch client timeout again. Do you see any errors or warnings in the Elasticsearch logs?
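Concretely, raising the client timeout again only means bumping the clientOptions value in the config shown earlier; 120 here is an arbitrary example, and the rest of the config stays unchanged:

```json
{
    "docManagers": [
        {
            "docManager": "elastic2_doc_manager",
            "targetURL": "localhost:9200",
            "args": {
                "clientOptions": {"timeout": 120}
            }
        }
    ]
}
```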
I am trying to sync a MongoDB replica set to Elasticsearch using mongo-connector. It works fine when I insert the first doc into my collection "check", but I get a "Failed during dump collection cannot recover" error in mongo-connector.log on the second insertion. Due to this error, the second doc is not loaded into the Elasticsearch index.
The commands I used:
To start the Mongo replica: sudo mongod --port 27017 --dbpath /_/_/ --replSet rs0
To start mongo-connector: mongo-connector -m localhost:27017 -t localhost:9200 -d elastic_doc_manager --auto-commit-interval=0 -n a.check
Mongo-connector.log:
2016-10-13 17:27:45,381 [CRITICAL] mongo_connector.oplog_manager:630 - Exception during collection dump
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/oplog_manager.py", line 583, in do_dump
upsert_all(dm)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/oplog_manager.py", line 567, in upsert_all
dm.bulk_upsert(docs_to_dump(namespace), mapped_ns, long_ts)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/util.py", line 43, in wrapped
reraise(new_type, exc_value, exc_tb)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/util.py", line 32, in wrapped
return f(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/elastic_doc_manager.py", line 214, in bulk_upsert
for ok, resp in responses:
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 160, in streaming_bulk
for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 89, in _process_bulk_chunk
raise e
ConnectionFailed: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host=u'localhost', port=9200): Read timed out. (read timeout=10))
2016-10-13 17:27:45,381 [ERROR] mongo_connector.oplog_manager:638 - OplogThread: Failed during dump collection cannot recover! Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, replicaset=u'rs0'), u'local'), u'oplog.rs')
2016-10-13 17:27:46,376 [ERROR] mongo_connector.connector:304 - MongoConnector: OplogThread <OplogThread(Thread-2, started 140648179619584)> unexpectedly stopped! Shutting down
FYI, I am using Elasticsearch 2.3.1, MongoDB 3.0.12, and mongo-connector 2.4.1.