looks like it is getting an error from elasticsearch but then failing when it tries to gather the information to log it here.
I'll see if I can't get that logic fixed up today.
Awesome, thanks. Let me know if you need any more information
@cdimitroulas are you able to build transporter locally to test this?
sure, I have go installed so I should be able to do it with some instructions. What are the steps to build transporter after cloning it locally?
great! we've got some instructions here but the gist for testing this specific fix is:
git checkout 291-log-panic
go build ./cmd/transporter/...
then you'll have a transporter binary in your current directory; from there you should be able to run:
./transporter run -config=/path/to/transporter.yaml /path/to/application.js
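if you need a starting point for those two files, here's a rough sketch of a mongodb-to-elasticsearch pipeline. Node names, URIs, and the namespace are placeholders, and this assumes the 0.x-era layout where the yaml declares the nodes and application.js wires a source to a sink; treat the JS line as a best guess rather than gospel:

```yaml
# transporter.yaml (hypothetical node definitions)
nodes:
  localmongo:
    type: mongodb
    uri: mongodb://localhost:27017/mydb
  es:
    type: elasticsearch
    uri: https://user:pass@cluster.found.io:9243/mydb
```

```js
// application.js (hypothetical; reads every document in mydb.mycollection
// from the mongodb node and saves it to the elasticsearch node)
Source({name: "localmongo", namespace: "mydb.mycollection"})
  .save({name: "es", namespace: "mydb.mycollection"});
```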
Thanks for that
FYI I am using MongoDB Atlas with WiredTiger and Elasticsearch v5.2
Just ran it on the 291-log-panic branch and saw this error after 378,875 records:
ERRO[0959] Post https://8a136e9f66cc0ab4bedecf5a14e5ca09.eu-west-1.aws.found.io:9243/_bulk: net/http: request canceled (Client.Timeout exceeded while awaiting headers) executionID=399 path="mongodb/es" version=5 writer=elasticsearch
Looks like I just need to increase the timeout?
yea, looks like that would help, although I wouldn't expect a bulk request to take longer than 30 seconds.
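if you do want to raise it while you investigate, something like this might work; this is a sketch assuming the elasticsearch sink accepts the same timeout option the mongodb adaptor lists among its commented defaults (node name and URI are placeholders):

```yaml
es:
  type: elasticsearch
  uri: https://user:pass@cluster.found.io:9243/mydb
  timeout: 120s  # assumption: raised from the 30s default seen in the error
```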
Now the following errors were logged:
ERRO[1487] elastic: Error 502 (Bad Gateway) executionID=403 path="mongodb/es" version=5 writer=elasticsearch
ERRO[1496] elastic: Error 502 (Bad Gateway) executionID=404 path="mongodb/es" version=5 writer=elasticsearch
ERRO[1496] elastic: Error 502 (Bad Gateway) executionID=405 path="mongodb/es" version=5 writer=elasticsearch
It's possible that because I am using Elastic Cloud's lowest cluster tier, it does not have enough resources to deal with the influx of data. Is there any way to reduce the speed at which the transporter makes requests?
we don't have a way to lower the bulk rate right now, feel free to open another issue (or submit a PR) to get it added.
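if someone wants to prototype that PR, the usual shape is a ticker-gated send around the bulk flush. Here's a minimal sketch in Go; this is not transporter's actual writer code, and the batch type and send callback are hypothetical stand-ins:

```go
package throttle

import "time"

// RateLimitedSend gates each bulk request on a time.Ticker so that at
// most one batch is sent per interval, smoothing the load on the sink.
func RateLimitedSend(batches <-chan []byte, send func([]byte) error, interval time.Duration) error {
	tick := time.NewTicker(interval)
	defer tick.Stop()
	for batch := range batches {
		<-tick.C // block until the next send slot opens
		if err := send(batch); err != nil {
			return err
		}
	}
	return nil
}
```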
alright, thanks a lot for your help. The error reporting from the elasticsearch writer seems to be working fine though!
👍 I've merged the fix into master and we may try and get a patch release out depending on other bugs reported before the next minor release.
@cdimitroulas How do you define MongoDB Atlas settings in transporter.yaml? Here are my settings, but I am getting this error:
connection error, server returned error on SASL authentication step: Authentication failed.
sink:
  type: mongodb
  uri: mongodb://username:password@primary.mongodb.net:27017,sec-01.mongodb.net:27017,sec-02.mongodb.net:27017/database
  # timeout: 30s
  # tail: false
  ssl: true
  replicaSet: myrepl-shard-0
  authSource: admin
  # cacerts: ["/path/to/cert.pem"]
  # wc: 1
  # fsync: false
  # bulk: false
It looks like you are possibly missing some information in the mongo uri.
In my URI I had the following query parameters at the end: ?replicaSet=<REPLICA-SET-NAME>&authSource=admin
I'm assuming you have used the correct username and password as well?
If this still doesn't work, try removing the ssl: true line and see whether that works.
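Putting that together with the hosts from your snippet, the full URI would look something like this (credentials and hostnames kept as your placeholders):

```
mongodb://username:password@primary.mongodb.net:27017,sec-01.mongodb.net:27017,sec-02.mongodb.net:27017/database?replicaSet=myrepl-shard-0&authSource=admin
```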
By the way, I ended up using mongo-connector to sync MongoDB with Elasticsearch, as transporter was way too efficient and our Elasticsearch cluster would crash under the huge amount of data being sent! If you are syncing MongoDB with MongoDB you probably won't have that issue.
@cdimitroulas With transporter we cannot use connection options in the URI (https://www.compose.com/articles/how-to-move-data-with-compose-transporter-from-database-to-disk/). @jipperinbham Is it possible to use these connection options with transporter: replicaSet=REPLICA-SET-NAME&authSource=admin? If yes, then how?
@cdimitroulas @ansarizafar we use the mgo lib for working with mongodb and it supports most URI options: https://github.com/go-mgo/mgo/blob/362ae10ff8ef82c1f099a585ffb890eff7abc88b/session.go#L284-L305
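a quick way to check what mgo extracts from a given URI is to run it through mgo.ParseURL; a small sketch (the URI is a placeholder):

```go
package main

import (
	"fmt"
	"log"

	mgo "gopkg.in/mgo.v2"
)

func main() {
	// Placeholder URI; replicaSet and authSource are two of the
	// query options mgo knows how to parse.
	uri := "mongodb://username:password@primary.mongodb.net:27017/database?replicaSet=myrepl-shard-0&authSource=admin"
	info, err := mgo.ParseURL(uri)
	if err != nil {
		log.Fatal(err)
	}
	// ReplicaSetName and Source (the auth database) reflect the options above.
	fmt.Println(info.ReplicaSetName, info.Source)
}
```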
Yes, all other options are working except ssl=true.
correct, we don't support the ssl=true parameter in the URI, but it is configurable through its own ssl option (and a corresponding cacerts option).
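so against the earlier snippet, dropping ssl=true from the URI and carrying it as a node option would look like this (same placeholders as above):

```yaml
sink:
  type: mongodb
  uri: mongodb://username:password@primary.mongodb.net:27017,sec-01.mongodb.net:27017,sec-02.mongodb.net:27017/database?replicaSet=myrepl-shard-0&authSource=admin
  ssl: true
  cacerts: ["/path/to/cert.pem"]
```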
Transporter is really great. I have just copied a database from mLab to MongoDB Atlas. Documents are copied correctly. Is there a way to also copy indexes?
glad things are working for you! we've got a lot more in store for the project.
we don't copy the indexes right now, as my current stance is that's something that can (and probably should) be done by hand, due to the many gotchas that can occur when creating indexes across different versions of mongodb.
feel free to open a separate issue on that topic as a feature request and we can discuss there as I'm not completely opposed to the idea.
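if you do end up scripting the by-hand copy, a rough sketch with mgo looks like this; connection strings, database, and collection names are placeholders, and each index should be reviewed before recreating it on the target:

```go
package main

import (
	"log"

	mgo "gopkg.in/mgo.v2"
)

func main() {
	// Placeholder connection strings for the source (mLab) and target (Atlas).
	src, err := mgo.Dial("mongodb://username:password@mlab-host:27017/database")
	if err != nil {
		log.Fatal(err)
	}
	defer src.Close()

	dst, err := mgo.Dial("mongodb://username:password@atlas-host:27017/database")
	if err != nil {
		log.Fatal(err)
	}
	defer dst.Close()

	// Read the indexes off one collection and recreate them on the target.
	indexes, err := src.DB("database").C("mycollection").Indexes()
	if err != nil {
		log.Fatal(err)
	}
	for _, idx := range indexes {
		if idx.Name == "_id_" {
			continue // the default _id index already exists on the target
		}
		if err := dst.DB("database").C("mycollection").EnsureIndex(idx); err != nil {
			log.Printf("index %s: %v", idx.Name, err)
		}
	}
}
```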
Hi guys,
Really cool open-source project you are running here! Thanks for your work.
I'm having some issues when moving data from MongoDB to Elasticsearch. We have one very big collection (379,000 documents) which may or may not be causing some issues. When running transporter it seems to move the majority of our documents across to ES, but a number of records are missing (mongodb has 406,551 records, only 378,685 in es) and transporter exits with the following error:
When I run transporter and exclude our large collection, it seems to run okay and gets to the point where it is tailing the MongoDB oplog. However, there is still the issue that not all the data has been moved across. The logs look like this when it is tailing the oplog:
I was wondering if there is a way to log the response from Elasticsearch in order to see if there are any errors? Can this be done using the pipeline?