compose / transporter

Sync data between persistence engines, like ETL only not stodgy
https://github.com/compose/transporter/issues/523
BSD 3-Clause "New" or "Revised" License

Log responses from Elastic Search #291

Closed cdimitroulas closed 7 years ago

cdimitroulas commented 7 years ago

Hi guys,

Really cool open-source project you are running here! Thanks for your work.

I'm having some issues when moving data from MongoDB to ElasticSearch. We have one very big collection (379,000 documents) which may or may not be causing them. When running transporter it moves the majority of our documents across to ES, but a number of documents are missing (MongoDB has 406,551 records, only 378,685 in ES) and transporter exits with the following error:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x8bc08e]

goroutine 110 [running]:
panic(0xc36e40, 0xc42000e0e0)
        /usr/local/Cellar/go@1.7/1.7.5/libexec/src/runtime/panic.go:500 +0x1a1
github.com/compose/transporter/pkg/adaptor/elasticsearch/clients/v5.(*Writer).postBulkProcessor(0xc4206f2270, 0x199, 0xc421cc4000, 0x2c, 0x40, 0x0, 0x1263000, 0xc42230e6c0)
        /Users/JP/gocode/src/github.com/compose/transporter/pkg/adaptor/elasticsearch/clients/v5/writer.go:104 +0xfe
github.com/compose/transporter/pkg/adaptor/elasticsearch/clients/v5.(*Writer).(github.com/compose/transporter/pkg/adaptor/elasticsearch/clients/v5.postBulkProcessor)-fm(0x199, 0xc421cc4000, 0x2c, 0x40, 0x0, 0x1263000, 0xc42230e6c0)
        /Users/JP/gocode/src/github.com/compose/transporter/pkg/adaptor/elasticsearch/clients/v5/writer.go:59 +0x73
github.com/compose/transporter/vendor/gopkg.in/olivere/elastic%2ev5.(*bulkWorker).commit(0xc420500000, 0x7f2ff414d428, 0xc42000e520, 0x1, 0xc4206380b0)
        /Users/JP/gocode/src/github.com/compose/transporter/vendor/gopkg.in/olivere/elastic.v5/bulk_processor.go:507 +0x360
github.com/compose/transporter/vendor/gopkg.in/olivere/elastic%2ev5.(*bulkWorker).work(0xc420500000, 0x7f2ff414d428, 0xc42000e520)
        /Users/JP/gocode/src/github.com/compose/transporter/vendor/gopkg.in/olivere/elastic.v5/bulk_processor.go:443 +0x236
created by github.com/compose/transporter/vendor/gopkg.in/olivere/elastic%2ev5.(*BulkProcessor).Start
        /Users/JP/gocode/src/github.com/compose/transporter/vendor/gopkg.in/olivere/elastic.v5/bulk_processor.go:300 +0x34f

When I run transporter and I exclude our large collection, it seems to run okay and gets to the point where it is tailing the MongoDB oplog. However, there is still the issue that not all the data has been moved across. The logs look like this when it is tailing the oplog:

INFO[3195] tailing oplog with query map[ts:map[$gte:6394692187568734208]]
  db=main
INFO[3198] Ping for <censored_database_url>:27017 is 17 ms 
INFO[3198] Ping for <censored_database_url> is 20 ms 
INFO[3199] Ping for <censored_database_url> is 24 ms 
INFO[3204] SYNC Starting full topology synchronization... 
INFO[3204] SYNC Processing <censored_database_url>...
INFO[3204] SYNC Processing <censored_database_url>... 
INFO[3204] SYNC Processing <censored_database_url>... 
INFO[3205] SYNC Synchronization was complete (got data from primary). 
INFO[3205] SYNC Synchronization completed: 1 master(s) and 2 slave(s) alive.

I was wondering if there is a way to log the response from Elastic Search in order to see if there are any errors? Can this be done using the pipeline?

jipperinbham commented 7 years ago

looks like it is getting an error from elasticsearch but fails when trying to gather information to log here.

I'll see if I can't get that logic fixed up today.
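The trace points at postBulkProcessor dereferencing a response that can be nil when the bulk request itself failed. As an illustration of the bug class only (not transporter's actual code; every name here is hypothetical), a minimal Go sketch of the guard that avoids the nil-pointer panic:

```go
package main

import (
	"errors"
	"fmt"
)

// bulkResponse stands in for the elastic library's bulk response type;
// it is nil when the HTTP request itself failed.
type bulkResponse struct {
	Errors bool
}

// doBulk simulates issuing a bulk request that may fail at the
// transport level, returning no response body at all.
func doBulk(fail bool) (*bulkResponse, error) {
	if fail {
		return nil, errors.New("net/http: request canceled")
	}
	return &bulkResponse{Errors: false}, nil
}

// logResult checks err and nil-ness before inspecting the response,
// which is the guard missing in the panicking code path.
func logResult(resp *bulkResponse, err error) string {
	if err != nil {
		return fmt.Sprintf("bulk request failed: %v", err)
	}
	if resp == nil {
		return "bulk request returned no response"
	}
	if resp.Errors {
		return "bulk request completed with item errors"
	}
	return "bulk request ok"
}

func main() {
	fmt.Println(logResult(doBulk(true)))
	fmt.Println(logResult(doBulk(false)))
}
```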

cdimitroulas commented 7 years ago

Awesome, thanks. Let me know if you need any more information

jipperinbham commented 7 years ago

@cdimitroulas are you able to build transporter locally to test this?

cdimitroulas commented 7 years ago

sure, I have go installed so I should be able to do it with some instructions. What are the steps to build transporter after cloning it locally?

jipperinbham commented 7 years ago

great! we've got some instructions here but the gist for testing this specific fix is

git checkout 291-log-panic
go build ./cmd/transporter/...

then you'll have a transporter binary in your current directory, from there you should be able to run:

./transporter run -config=/path/to/transporter.yaml /path/to/application.js

cdimitroulas commented 7 years ago

Thanks for that

FYI I am using Mongo Atlas with WiredTiger and Elastic Search v5.2

cdimitroulas commented 7 years ago

Just ran it on the 291-log-panic branch and saw this error after 378,875 records:

ERRO[0959] Post https://8a136e9f66cc0ab4bedecf5a14e5ca09.eu-west-1.aws.found.io:9243/_bulk: net/http: request canceled (Client.Timeout exceeded while awaiting headers)  executionID=399 path="mongodb/es" version=5 writer=elasticsearch

Looks like I just need to increase the timeout?

jipperinbham commented 7 years ago

yea, looks like that would help, although I wouldn't expect a bulk request to take longer than 30 seconds.
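A sketch of how raising the timeout might look in the config, assuming the elasticsearch adaptor honors a `timeout` option in the same style as the mongodb adaptor's config shown later in this thread (the option name and the cluster URI are placeholders, not confirmed transporter settings):

```yaml
sink:
  type: elasticsearch
  uri: https://user:password@<your-cluster>.eu-west-1.aws.found.io:9243/index
  timeout: 120s  # hypothetical: raise the per-request timeout above the 30s default
```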

cdimitroulas commented 7 years ago

Now the following errors were logged:

ERRO[1487] elastic: Error 502 (Bad Gateway)              executionID=403 path="mongodb/es" version=5 writer=elasticsearch
ERRO[1496] elastic: Error 502 (Bad Gateway)              executionID=404 path="mongodb/es" version=5 writer=elasticsearch
ERRO[1496] elastic: Error 502 (Bad Gateway)              executionID=405 path="mongodb/es" version=5 writer=elasticsearch

It's possible that because I am using Elastic Cloud and the lowest tier of their cluster that it does not have enough resources to deal with the influx of data. Is there any way to reduce the speed at which the transporter makes requests?

jipperinbham commented 7 years ago

we don't have a way to lower the bulk rate right now, feel free to open another issue (or submit a PR) to get it added.

cdimitroulas commented 7 years ago

alright, thanks a lot for your help. The error reporting from the elasticsearch writer seems to be working fine though!

jipperinbham commented 7 years ago

👍 I've merged the fix into master and we may try and get a patch release out depending on other bugs reported before the next minor release.

ansarizafar commented 7 years ago

@cdimitroulas How do you define MongoDB Atlas settings in transporter.yaml? Here are my settings, but I am getting this error:

connection error, server returned error on SASL authentication step: Authentication failed.

 sink:
    type: mongodb
    uri: mongodb://username:password@primary.mongodb.net:27017,sec-01.mongodb.net:27017,sec-02.mongodb.net:27017/database
    # timeout: 30s
    # tail: false
    ssl: true
    replicaSet: myrepl-shard-0
    authSource: admin
    # cacerts: ["/path/to/cert.pem"]
    # wc: 1
    # fsync: false
    # bulk: false
cdimitroulas commented 7 years ago

It looks like you are possibly missing some information in the mongo URI. My URI had the following query parameters at the end: ?replicaSet=<REPLICA-SET-NAME>&authSource=admin

I'm assuming you have used the correct username and password as well? If this still doesn't work, try removing the ssl=true line and see whether that works.
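For illustration, attaching those query parameters to the URI from the config above might look like this (host names, credentials, and the replica set name are placeholders):

```yaml
sink:
  type: mongodb
  uri: mongodb://username:password@primary.mongodb.net:27017,sec-01.mongodb.net:27017,sec-02.mongodb.net:27017/database?replicaSet=<REPLICA-SET-NAME>&authSource=admin
  ssl: true
```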

By the way, I ended up using mongo-connector to sync mongo with Elastic Search, as the transporter was way too efficient and our Elastic Search cluster would crash due to the huge amount of information being sent! If you are syncing mongodb with mongodb you probably won't have that issue.

ansarizafar commented 7 years ago

@cdimitroulas With transporter we cannot use connection options in the URI (see https://www.compose.com/articles/how-to-move-data-with-compose-transporter-from-database-to-disk/). @jipperinbham is it possible to use connection options like replicaSet=REPLICA-SET-NAME&authSource=admin with transporter? If yes, then how?

jipperinbham commented 7 years ago

@cdimitroulas @ansarizafar we use the mgo lib for working with mongodb and it supports most URI options, https://github.com/go-mgo/mgo/blob/362ae10ff8ef82c1f099a585ffb890eff7abc88b/session.go#L284-L305

ansarizafar commented 7 years ago

Yes, all other options are working except ssl=true.

jipperinbham commented 7 years ago

correct, we don't support the ssl=true parameter in the URI, but it is configurable through its own ssl option (and a corresponding cacerts)

ansarizafar commented 7 years ago

Transporter is really great. I have just copied a database from mLab to mongodb Atlas. Documents are copied correctly. Is there a way to also copy indexes?

jipperinbham commented 7 years ago

glad things are working for you! we've got a lot more in store for the project.

we don't copy the indexes right now as my current stance is that's something that can (and probably should) be done by hand due to the many different gotchas that can occur with creating indexes between versions of mongodb.

feel free to open a separate issue on that topic as a feature request and we can discuss there as I'm not completely opposed to the idea.