Open lifuchao opened 7 years ago
What about the elasticsearch doc-manager? Did you manually update it to use the bulk API? The number 1000 sounds suspicious. If so, my old "problem" here could be the reason: you have to set the autoCommitInterval.
@lifuchao we just released mongo-connector 2.5.0 and elastic2-doc-manager 0.3.0. Would you be able to upgrade to the latest version and check back if the issue is still present or not?
To upgrade mongo-connector and the elastic2-doc-manager:
pip install --upgrade 'mongo-connector[elastic2]'
@ShaneHarvey I tried to upgrade as you suggested, but it failed. The error output in pip.log looks like this:
Ignoring link https://pypi.python.org/packages/ed/5f/c5b60c72c08773d60b83d8255a4e1b73d3ff9eeece780e5f22be7dbc1c67/pymongo-0.14.tar.gz#md5=96c7b066815445e75ad095c0fa760eab (from https://pypi.python.org/simple/pymongo/), version 0.14 doesn't match >=2.9
Ignoring link https://pypi.python.org/packages/ef/2e/d05c3d2e244d26f65a71bec20b6080c54cfbd97eaa9d6c358dcfbea62425/pymongo-0.5.3pre.tar.gz#md5=4c09638b71b3590f82b9f8529689bdb8 (from https://pypi.python.org/simple/pymongo/), version 0.5.3pre doesn't match >=2.9
Ignoring link https://pypi.python.org/packages/fe/6c/5cf65618ee2248e264c1825395b16b1a0f3e96349d340db4af04c386ea8c/pymongo-2.3.tar.gz#md5=0d342ad1506f983af671d0b0e0e1efec (from https://pypi.python.org/simple/pymongo/), version 2.3 doesn't match >=2.9
Using version 3.4.0 (newest of versions: 3.4.0, 3.3.1, 3.3.0, 3.3.0, 3.2.2, 3.2.1, 3.2, 3.1.1, 3.1, 3.0.3, 3.0.2, 3.0.1, 3.0, 2.9.4, 2.9.3, 2.9.2, 2.9.1, 2.9)
Downloading/unpacking pymongo>=2.9 from https://pypi.python.org/packages/82/26/f45f95841de5164c48e2e03aff7f0702e22cef2336238d212d8f93e91ea8/pymongo-3.4.0.tar.gz#md5=aa77f88e51e281c9f328cea701bb6f3e (from mongo-connector[elastic2])
Downloading from URL https://pypi.python.org/packages/82/26/f45f95841de5164c48e2e03aff7f0702e22cef2336238d212d8f93e91ea8/pymongo-3.4.0.tar.gz#md5=aa77f88e51e281c9f328cea701bb6f3e
Cleaning up...
Removing temporary dir /tmp/pip_build_root...
Exception:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/pip/basecommand.py", line 122, in main
    status = self.run(options, args)
  File "/usr/lib/python2.7/dist-packages/pip/commands/install.py", line 278, in run
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bundle=self.bundle)
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 1198, in prepare_files
    do_download,
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 1376, in unpack_url
    self.session,
  File "/usr/lib/python2.7/dist-packages/pip/download.py", line 572, in unpack_http_url
    download_hash = _download_url(resp, link, temp_location)
  File "/usr/lib/python2.7/dist-packages/pip/download.py", line 433, in _download_url
    for chunk in resp_read(4096):
  File "/usr/lib/python2.7/dist-packages/pip/download.py", line 421, in resp_read
    chunk_size, decode_content=False):
  File "/usr/share/python-wheels/urllib3-1.7.1-py2.py3-none-any.whl/urllib3/response.py", line 225, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/usr/share/python-wheels/urllib3-1.7.1-py2.py3-none-any.whl/urllib3/response.py", line 174, in read
    data = self._fp.read(amt)
  File "/usr/lib/python2.7/httplib.py", line 573, in read
    s = self.fp.read(amt)
  File "/usr/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
  File "/usr/lib/python2.7/ssl.py", line 341, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 260, in read
    return self._sslobj.read(len)
SSLError: The read operation timed out
@ShaneHarvey The upgrade succeeded once I raised pip's timeout:
sudo pip install --upgrade 'mongo-connector[elastic2]' --default-timeout=600
I don't know yet whether this fixes the update sync problem. I will spot-check the data over the next few days and report back.
Thanks a lot.
@ShaneHarvey After upgrading mongo-connector to v2.5.0, I spot-checked the data in MongoDB and Elasticsearch, and some updates still do not sync (about 10%). How does mongo-connector sync updates? Is there any reference documentation? My ES index holds about 0.1 billion documents, and the data is updated every day. How can I make sure that updates are synced successfully? I hope you can help, thanks.
Can you post the steps to reproduce this issue (sample MongoDB data, sample updates that trigger the missing updates, and mongo-connector config file)? Otherwise, there's no way for me to find out if/where this is a bug in mongo-connector.
elasticsearch: v2.3.4, mongodb: v3.2.9, mongo-connector: v2.4.1
The problem is as follows: I use mongo-connector to sync data from MongoDB to Elasticsearch. The collection has about 80,000,000 docs. The initial sync is fine, but the data in MongoDB gets updated afterwards, and I found that some updates are not synced to Elasticsearch. I spot-checked 1000 docs, and about 200 of them had not synced. For example, the field "key1" had the value "aaaa" before the update and "bbbb" after, but the document in Elasticsearch still has "aaaa", and there is no error in mongo-connector.log. How does mongo-connector sync updates, and how does it ensure that the data stays up to date?
thanks in advance!
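For anyone trying to turn the spot check described above into something reproducible, here is a minimal sketch in Python. It is not part of mongo-connector: the function name `find_stale_docs` and the sample data are hypothetical, and it assumes you have already fetched the same sample of documents from MongoDB and from Elasticsearch into two plain dicts keyed by document id. It only compares the fields you name and lists the ids that are missing or out of date.

```python
def find_stale_docs(mongo_docs, es_docs, fields):
    """Return (doc_id, reason) pairs for documents that differ.

    mongo_docs and es_docs are hypothetical inputs: dicts mapping a
    document id (as a string) to a dict of field values, fetched
    beforehand from MongoDB and Elasticsearch respectively.
    """
    stale = []
    for doc_id, source in mongo_docs.items():
        target = es_docs.get(doc_id)
        if target is None:
            # The document never reached Elasticsearch at all.
            stale.append((doc_id, "missing"))
            continue
        for name in fields:
            if source.get(name) != target.get(name):
                # Record the first field that is out of date.
                stale.append((doc_id, name))
                break
    return stale


# Made-up sample mirroring the "aaaa" -> "bbbb" example from the report.
mongo_sample = {
    "1": {"key1": "bbbb"},
    "2": {"key1": "cccc"},
    "3": {"key1": "dddd"},
}
es_sample = {
    "1": {"key1": "aaaa"},
    "2": {"key1": "cccc"},
}

stale = find_stale_docs(mongo_sample, es_sample, ["key1"])
ratio = len(stale) / float(len(mongo_sample))
print(stale)
print("stale ratio: %.2f" % ratio)
```

Posting the list of stale ids together with the corresponding update operations would make it much easier to narrow down where the updates are lost.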