rwynn / monstache-site

documentation for the monstache project
https://rwynn.github.io/monstache-site/
MIT License

Did not sync data after deleting or updating my MongoDB #14

Closed: xiangyiliu18 closed this issue 5 years ago

xiangyiliu18 commented 5 years ago

Right now, my monstache only syncs MongoDB data on 'insert' actions; it does not work for 'update' and 'delete' actions.

This is my conf file:

mongo-url = "mongodb://localhost:27017"
elasticsearch-urls = ["http://localhost:9200"]
namespace-regex = 'firewall.^'
index-as-update = true
gzip = true
change-stream-namespaces = ['firewall.questions']
dropped-collections = true
dropped-databases = true
replay = false
stats = true
resume = true
resume-name = "default"
resume-write-unsafe = false
verbose = true
exit-after-direct-reads = false
delete-index-pattern = "questions"

This is the output after I run my conf file:

STATS 2019/05/07 12:19:00 {"Flushed":150,"Committed":0,"Indexed":0,"Created":0,"Updated":0,"Deleted":0,"Succeeded":0,"Failed":0,"Workers":[{"Queued":0,"LastDuration":0},{"Queued":0,"LastDuration":0},{"Queued":0,"LastDuration":0},{"Queued":0,"LastDuration":0}]}
STATS 2019/05/07 12:19:30 {"Flushed":179,"Committed":0,"Indexed":0,"Created":0,"Updated":0,"Deleted":0,"Succeeded":0,"Failed":0,"Workers":[{"Queued":0,"LastDuration":0},{"Queued":0,"LastDuration":0},{"Queued":0,"LastDuration":0},{"Queued":0,"LastDuration":0}]}
STATS 2019/05/07 12:20:00 {"Flushed":210,"Committed":0,"Indexed":0,"Created":0,"Updated":0,"Deleted":0,"Succeeded":0,"Failed":0,"Workers":[{"Queued":0,"LastDuration":0},{"Queued":0,"LastDuration":0},{"Queued":0,"LastDuration":0},{"Queued":0,"LastDuration":0}]}

rwynn commented 5 years ago

Hi @xiangyiliu18

Can you try removing the line starting with namespace-regex from your config file? I don't think you will need this filter, as you are already targeting only one specific collection via change-stream-namespaces.

Additionally you can remove delete-index-pattern or set it to firewall.questions. Since you don't have any index name mappings, your MongoDB collection at firewall.questions will be sent to an index with the same name.
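For example, with those two lines removed, the config above would look roughly like this (just a sketch, keeping the rest of your settings as they are):

mongo-url = "mongodb://localhost:27017"
elasticsearch-urls = ["http://localhost:9200"]
change-stream-namespaces = ['firewall.questions']
index-as-update = true
gzip = true
dropped-collections = true
dropped-databases = true
replay = false
stats = true
resume = true
resume-name = "default"
resume-write-unsafe = false
verbose = true
exit-after-direct-reads = false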

Finally, it would help if you could share the version numbers of monstache, Elasticsearch, and MongoDB, and show the full output of monstache from the start.

xiangyiliu18 commented 5 years ago

Thanks for your response. I will try what you recommend right now.

monstache version: 3.24.2
Elasticsearch: "number" : "2.3.1"
MongoDB: MongoDB shell version v4.0.9

And for my MongoDB, I only created one replica set member (PRIMARY), and I am not able to use mongod --master.
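As I understand it, change streams need a replica set anyway rather than the old master/slave mode (mongod --master is no longer supported in MongoDB 4.0), so a single-member replica set can be created roughly along these lines (the dbpath and replica set name here are just placeholders):

mongod --replSet rs0 --dbpath /data/db --port 27017
# then, once, from the mongo shell connected to that node:
rs.initiate()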

monstache -print-config: INFO 2019/05/07 14:09:40 { "EnableTemplate": false, "EnvDelimiter": ",", "MongoURL": "localhost", "MongoConfigURL": "", "MongoPemFile": "", "MongoValidatePemFile": true, "MongoOpLogDatabaseName": "", "MongoOpLogCollectionName": "", "MongoDialSettings": { "Timeout": 15, "Ssl": false, "ReadTimeout": 30, "WriteTimeout": 30 }, "MongoSessionSettings": { "SocketTimeout": 0, "SyncTimeout": 30 }, "MongoX509Settings": { "ClientCertPemFile": "", "ClientKeyPemFile": "" }, "GtmSettings": { "ChannelSize": 512, "BufferSize": 32, "BufferDuration": "75ms" }, "AWSConnect": { "AccessKey": "", "SecretKey": "", "Region": "" }, "Logs": { "Info": "", "Warn": "", "Error": "", "Trace": "", "Stats": "" }, "GraylogAddr": "", "ElasticUrls": null, "ElasticUser": "", "ElasticPassword": "", "ElasticPemFile": "", "ElasticValidatePemFile": true, "ElasticVersion": "", "ElasticHealth0": 15, "ElasticHealth1": 5, "ResumeName": "default", "NsRegex": "", "NsDropRegex": "", "NsExcludeRegex": "", "NsDropExcludeRegex": "", "ClusterName": "", "Print": true, "Version": false, "Pprof": false, "DisableChangeEvents": false, "EnableEasyJSON": false, "Stats": false, "IndexStats": false, "StatsDuration": "", "StatsIndexFormat": "monstache.stats.2006-01-02", "Gzip": false, "Verbose": false, "Resume": false, "ResumeWriteUnsafe": false, "ResumeFromTimestamp": 0, "Replay": false, "DroppedDatabases": true, "DroppedCollections": true, "IndexFiles": false, "IndexAsUpdate": false, "FileHighlighting": false, "EnablePatches": false, "FailFast": false, "IndexOplogTime": false, "OplogTsFieldName": "oplog_ts", "OplogDateFieldName": "oplog_date", "OplogDateFieldFormat": "2006/01/02 15:04:05", "ExitAfterDirectReads": false, "MergePatchAttr": "json-merge-patches", "ElasticMaxConns": 4, "ElasticRetry": false, "ElasticMaxDocs": -1, "ElasticMaxBytes": 8388608, "ElasticMaxSeconds": 1, "ElasticClientTimeout": 0, "ElasticMajorVersion": 0, "ElasticMinorVersion": 0, "MaxFileSize": 0, "ConfigFile": "", "Script": null, "Filter": null, "Pipeline": null, "Mapping": null, "Relate": null, "FileNamespaces": null, "PatchNamespaces": null, "Workers": null, "Worker": "", "ChangeStreamNs": null, "DirectReadNs": null, "DirectReadSplitMax": 0, "DirectReadConcur": 0, "MapperPluginPath": "", "EnableHTTPServer": false, "HTTPServerAddr": ":8080", "TimeMachineNamespaces": null, "TimeMachineIndexPrefix": "log", "TimeMachineIndexSuffix": "2006-01-02", "TimeMachineDirectReads": false, "PipeAllowDisk": false, "RoutingNamespaces": null, "DeleteStrategy": 0, "DeleteIndexPattern": "*", "ConfigDatabaseName": "monstache", "FileDownloaders": 0, "RelateThreads": 10, "RelateBuffer": 1000, "PostProcessors": 0, "PruneInvalidJSON": false, "Debug": false }

xiangyiliu18 commented 5 years ago

I have no idea why I got another error loop after I ran my conf file.

ubuntu@mongodb:~$ monstache -f mongo-elastic.toml
INFO 2019/05/07 14:16:48 Started monstache version 3.24.2
INFO 2019/05/07 14:16:48 Successfully connected to MongoDB version 4.0.9
INFO 2019/05/07 14:16:48 Successfully connected to Elasticsearch version 2.3.1
INFO 2019/05/07 14:16:48 Sending systemd READY=1
WARN 2019/05/07 14:16:48 Systemd notification not supported (i.e. NOTIFY_SOCKET is unset)
INFO 2019/05/07 14:16:48 Listening for events
INFO 2019/05/07 14:16:48 Watching changes on collection firewall.questions
TRACE 2019/05/07 14:16:49 POST /_bulk HTTP/1.1
Host: localhost:9200
User-Agent: elastic/5.0.76 (linux-amd64)
Transfer-Encoding: chunked
Accept: application/json
Content-Encoding: gzip
Content-Type: application/x-ndjson
Vary: Accept-Encoding
Accept-Encoding: gzip

(gzip-compressed request body, not human-readable)

ERROR 2019/05/07 14:16:49 elastic: http://localhost:9200 is dead
ERROR 2019/05/07 14:16:49 elastic: bulk processor "monstache" failed: Post http://localhost:9200/_bulk: read tcp 127.0.0.1:33976->127.0.0.1:9200: read: connection reset by peer
ERROR 2019/05/07 14:16:49 elastic: all 1 nodes marked as dead; resurrecting them to prevent deadlock
ERROR 2019/05/07 14:16:49 elastic: bulk processor "monstache" is waiting for an active connection
ERROR 2019/05/07 14:16:49 elastic: bulk processor "monstache" failed: no available connection: no Elasticsearch node available
ERROR 2019/05/07 14:16:49 elastic: bulk processor "monstache" is waiting for an active connection
TRACE 2019/05/07 14:16:54 POST /_bulk HTTP/1.1
Host: localhost:9200
User-Agent: elastic/5.0.76 (linux-amd64)
Transfer-Encoding: chunked
Accept: application/json
Content-Encoding: gzip
Content-Type: application/x-ndjson
Vary: Accept-Encoding
Accept-Encoding: gzip

(gzip-compressed request body, not human-readable)

ERROR 2019/05/07 14:16:54 elastic: http://localhost:9200 is dead
ERROR 2019/05/07 14:16:54 elastic: bulk processor "monstache" failed: Post http://localhost:9200/_bulk: EOF
ERROR 2019/05/07 14:16:54 elastic: bulk processor "monstache" is waiting for an active connection
ERROR 2019/05/07 14:16:54 elastic: all 1 nodes marked as dead; resurrecting them to prevent deadlock
ERROR 2019/05/07 14:16:54 elastic: bulk processor "monstache" failed: no available connection: no Elasticsearch node available
ERROR 2019/05/07 14:16:54 elastic: bulk processor "monstache" is waiting for an active connection
TRACE 2019/05/07 14:16:59 POST /_bulk HTTP/1.1
Host: localhost:9200
User-Agent: elastic/5.0.76 (linux-amd64)
Transfer-Encoding: chunked
Accept: application/json
Content-Encoding: gzip
Content-Type: application/x-ndjson
Vary: Accept-Encoding
Accept-Encoding: gzip

(gzip-compressed request body, not human-readable)

ERROR 2019/05/07 14:16:59 elastic: http://localhost:9200 is dead
ERROR 2019/05/07 14:16:59 elastic: bulk processor "monstache" failed: Post http://localhost:9200/_bulk: EOF
ERROR 2019/05/07 14:16:59 elastic: bulk processor "monstache" is waiting for an active connection
ERROR 2019/05/07 14:16:59 elastic: all 1 nodes marked as dead; resurrecting them to prevent deadlock
ERROR 2019/05/07 14:16:59 elastic: bulk processor "monstache" failed: no available connection: no Elasticsearch node available
ERROR 2019/05/07 14:16:59 elastic: bulk processor "monstache" is waiting for an active connection
TRACE 2019/05/07 14:17:04 POST /_bulk HTTP/1.1
Host: localhost:9200
User-Agent: elastic/5.0.76 (linux-amd64)
Transfer-Encoding: chunked
Accept: application/json
Content-Encoding: gzip
Content-Type: application/x-ndjson
Vary: Accept-Encoding
Accept-Encoding: gzip

rwynn commented 5 years ago

@xiangyiliu18 the connection with MongoDB seems OK. However, there seems to be some problem with Elasticsearch. If you make the following change then you will be able to read the body of the request that monstache is sending:

gzip = false

Is 2.3.1, as detected by monstache, the correct version of your Elasticsearch server? You might look into the Elasticsearch logs to see if there are any problems with it. Also check the cluster health and resolve any issues if the status is red.
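For example, something like this should show the version Elasticsearch reports about itself and the cluster status (assuming it is running on the default local port):

curl "http://localhost:9200"                        # version.number should match what monstache detected
curl "http://localhost:9200/_cluster/health?pretty" # a "status" of red indicates a problem to resolve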

xiangyiliu18 commented 5 years ago

Thanks. I changed the gzip value to false and now it works. But there is another issue: every time I insert a document, about 2 seconds later it gets duplicated, and the duplicate has only some of the fields, not all of them. And if I update or delete the document, only one of the two copies is modified or deleted; the other one stays the same. For example, I add userData1, and then userData1 is imported into my Elasticsearch twice. If I delete userData1, only one copy is deleted from Elasticsearch. And by the way, if I need to use this in a production environment, which parts do I need to add?

xiangyiliu18 commented 5 years ago

I solved it. I deleted "delete-index-pattern". It works fine right now.

  1. But do you have any suggestions for a production environment, please? I need to handle large request volumes and large amounts of data.
  2. And if I modify my Elasticsearch data through Node.js, is it possible to sync that data back into MongoDB as well?
rwynn commented 5 years ago

@xiangyiliu18

  1. You might want to read the docs and experiment with the settings to reach the desired performance in production. Monstache does bulk indexing, so by default it buffers up to 8MB of data before sending. If 1 second passes first, it sends whatever data it has buffered so far (see the config sketch after this list).
  2. Since you have set index-as-update = true, monstache will use an upsert when sending data to Elasticsearch. Fields that overlap between MongoDB and Elasticsearch will be overwritten. Fields that you have added through Node.js, if they don't overlap, should remain intact.
  3. Monstache only handles one way sync from MongoDB to Elasticsearch. Updates to Elasticsearch do not feed back into MongoDB.
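To illustrate point 1, the bulk buffering described above maps to settings along these lines. The option names correspond to ElasticMaxConns, ElasticMaxBytes, and ElasticMaxSeconds in the -print-config output above, and the values shown are the defaults; treat any changes as experiments to tune for your own workload rather than recommendations:

elasticsearch-max-conns = 4        # number of concurrent bulk workers
elasticsearch-max-bytes = 8388608  # flush a bulk request once roughly 8MB is buffered
elasticsearch-max-seconds = 1      # flush at least once per second even if the buffer is not full
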
xiangyiliu18 commented 5 years ago

thanks