zhr85210078 / node-mongodb-es-connector

A Node.js module that synchronizes data between MongoDB and Elasticsearch.
https://zhr85210078.github.io/node-mongodb-es-connector/#/
MIT License

node-mongodb-es-connector

A MongoDB and Elasticsearch sync module for Node.js (supports attachment sync).

Supports one-to-one and one-to-many relationships.

Chinese Documentation - 中文文档

Tested versions

Elasticsearch: v6.1.2
MongoDB: v3.6.2
Node.js: v8.9.3

What does it do

The node-mongodb-es-connector package keeps your MongoDB collections and your Elasticsearch cluster in sync. It does so by tailing the MongoDB oplog and replicating each CRUD operation to the Elasticsearch cluster with minimal overhead. Note that MongoDB must run as a replica set for the package to tail the oplog. Attachment sync is also supported.
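
To make the oplog-tailing idea concrete, here is a minimal sketch of how a single oplog entry could be translated into an Elasticsearch operation. It is not the package's actual code: the function name `oplogToEsAction` and the shape of the returned object are assumptions made for illustration. It relies only on the standard oplog fields `op`, `o`, and `o2`.

```javascript
// Hypothetical helper, not part of the package: translates one oplog entry
// into a description of the Elasticsearch operation it implies.
function oplogToEsAction(entry, index, type) {
  switch (entry.op) {
    case 'i': {
      // Insert: `o` holds the full document. Elasticsearch keeps _id as
      // metadata, so it is moved out of the document body.
      const source = Object.assign({}, entry.o);
      delete source._id;
      return { action: 'index', index: index, type: type, id: String(entry.o._id), source: source };
    }
    case 'u':
      // Update: `o2` identifies the document, `o` only describes the change,
      // so this sketch asks the caller to re-read the document and re-index it.
      return { action: 'refetch-and-index', index: index, type: type, id: String(entry.o2._id) };
    case 'd':
      // Delete: `o` holds only the _id of the removed document.
      return { action: 'delete', index: index, type: type, id: String(entry.o._id) };
    default:
      // Other entry kinds (no-ops, commands) are ignored in this sketch.
      return null;
  }
}

const insertEntry = { op: 'i', ns: 'myTest.books', o: { _id: 'abc', bName: 'Node in Action' } };
console.log(oplogToEsAction(insertEntry, 'mybooks', 'books'));
```

Returning a plain description instead of calling Elasticsearch directly keeps the mapping easy to test and leaves batching to the caller.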

How to use

npm install es-mongodb-sync

Or download it from GitHub.

Sample usage

Create a file in the crawlerData folder. Name it after the Elasticsearch index (ElasticSearchIndexName.json) or give it any other name ending in .json.

If you need additional configurations, add one file per configuration to the crawlerData folder.

For example:

mybooks.json

{
    "mongodb": {
        "m_database": "myTest",
        "m_collectionname": "books",
        "m_filterfilds": {
            "version" : "2.0"
        },
        "m_returnfilds": {
            "bName": 1,
            "bPrice": 1,
            "bImgSrc": 1
        },
        "m_extendfilds": {
            "bA": "this is a extend fild bA",
            "bB": "this is a extend fild bB"
        },
        "m_extendinit": {
            "m_comparefild": "_id",
            "m_comparefildType": "ObjectId",
            "m_startFrom": "2018-07-20 13:44:00",
            "m_endTo": "2018-07-20 13:46:59"
        },
        "m_connection": {
            "m_servers": [
                "localhost:29031",
                "localhost:29032",
                "localhost:29033"
            ],
            "m_authentication": {
                "username": "UserAdmin",
                "password": "pass1234",
                "authsource":"admin",
                "replicaset":"my_replica",
                "ssl":false
            }
        },
        "m_documentsinbatch": 5000,
        "m_delaytime": 1000,
        "max_attachment_size":5242880
    },
    "elasticsearch": {
        "e_index": "mybooks",
        "e_type": "books",
        "e_connection": {
            "e_server": "http://localhost1:9200,http://localhost2:9200,http://localhost3:9200",
            "e_httpauth": {
                "username": "EsAdmin",
                "password": "pass1234"
            }
        },
        "e_pipeline": "mypipeline",
        "e_iscontainattachment": true
    }
}
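
As a rough guide to how the connection fields fit together, the sketch below assembles a standard MongoDB connection string from the "mongodb" section and splits the comma-separated `e_server` value into a host list. The helper names `buildMongoUri` and `splitEsHosts` are hypothetical, and the package's own connection logic may differ. The field names are the ones from mybooks.json above.

```javascript
// Hypothetical helpers: the package's own connection logic may differ.

// Build a standard MongoDB connection string from the "mongodb" section.
function buildMongoUri(mongodb) {
  const conn = mongodb.m_connection;
  const auth = conn.m_authentication || {};
  const credentials = auth.username
    ? encodeURIComponent(auth.username) + ':' + encodeURIComponent(auth.password) + '@'
    : '';
  const options = [];
  if (auth.authsource) options.push('authSource=' + encodeURIComponent(auth.authsource));
  if (auth.replicaset) options.push('replicaSet=' + encodeURIComponent(auth.replicaset));
  if (typeof auth.ssl === 'boolean') options.push('ssl=' + auth.ssl);
  const query = options.length ? '?' + options.join('&') : '';
  return 'mongodb://' + credentials + conn.m_servers.join(',') + '/' + mongodb.m_database + query;
}

// Split the comma-separated "e_server" value into a list of hosts.
function splitEsHosts(elasticsearch) {
  return elasticsearch.e_connection.e_server.split(',').map(function (host) {
    return host.trim();
  });
}

const sampleConfig = {
  mongodb: {
    m_database: 'myTest',
    m_connection: {
      m_servers: ['localhost:29031', 'localhost:29032', 'localhost:29033'],
      m_authentication: {
        username: 'UserAdmin',
        password: 'pass1234',
        authsource: 'admin',
        replicaset: 'my_replica',
        ssl: false
      }
    }
  },
  elasticsearch: {
    e_connection: { e_server: 'http://localhost1:9200,http://localhost2:9200,http://localhost3:9200' }
  }
};

console.log(buildMongoUri(sampleConfig.mongodb));
// mongodb://UserAdmin:pass1234@localhost:29031,localhost:29032,localhost:29033/myTest?authSource=admin&replicaSet=my_replica&ssl=false
console.log(splitEsHosts(sampleConfig.elasticsearch));
```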

Start up

node app.js


Extra APIs

index.js (only creates, reads, updates, and deletes config JSON files)

Example

1. start() - must be called before any of the other APIs.


2. addWatcher() - add a config JSON.

Parameters:

Name      Type
fileName  string
obj       jsonObject

Returns: true or false


3. updateWatcher() - update a config JSON.

Parameters:

Name      Type
fileName  string
obj       jsonObject

Returns: true or false


4. deleteWatcher() - delete a config JSON.

Parameters:

Name      Type
fileName  string

Returns: true or false


5. isExistWatcher() - check whether a config JSON exists.

Parameters:

Name      Type
fileName  string

Returns: true or false


6. getInfoArray() - get the status of every config (waiting/initialling/running/stoped).
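
The watcher APIs above can be combined into an "add or update" step. The sketch below is an illustration only: `upsertWatcher` is a hypothetical helper, and `connector` stands for whatever object exposes the listed functions. Because this README does not say whether those functions return plain values or Promises, the helper uses `await`, which handles both.

```javascript
// Hypothetical helper, not part of the package. `connector` is assumed to be
// an object exposing isExistWatcher, addWatcher and updateWatcher as listed above.
async function upsertWatcher(connector, fileName, config) {
  // `await` works whether the underlying call returns a plain value or a Promise.
  const exists = await connector.isExistWatcher(fileName);
  return exists
    ? connector.updateWatcher(fileName, config)
    : connector.addWatcher(fileName, config);
}
```

Remember that start() has to be called before a helper like this is used.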


ChangeLog

How to use a pipeline

PUT _ingest/pipeline/mypipeline
{
  "description" : "Extract attachment information from arrays",
  "processors" : [
    {
      "foreach": {
        "field": "attachments",
        "processor": {
          "attachment": {
            "target_field": "_ingest._value.attachment",
            "field": "_ingest._value.data"
          }
        }
      }
    }
  ]
}
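
The pipeline above iterates over an `attachments` array and reads each item's `data` field, which the Elasticsearch attachment processor expects to be base64-encoded. The sketch below shows one way a document of that shape could be built in Node.js. The helper `toAttachmentDoc` and the `filename` field are assumptions made for illustration, not something the package is known to produce.

```javascript
// Hypothetical helper: wraps raw file contents in the document shape that the
// pipeline above iterates over (an `attachments` array whose items carry
// base64 text in `data`).
function toAttachmentDoc(doc, files) {
  return Object.assign({}, doc, {
    attachments: files.map(function (file) {
      return {
        filename: file.filename, // illustrative extra field, not required by the pipeline
        data: Buffer.from(file.content).toString('base64')
      };
    })
  });
}

const bookDoc = toAttachmentDoc(
  { bName: 'Node in Action' },
  [{ filename: 'notes.txt', content: 'hello' }]
);
console.log(bookDoc.attachments[0].data); // aGVsbG8=
```

After the pipeline runs, each array item would also carry the extracted text under the `attachment` target field named in the processor.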

Result

[Image: mongodb]

[Image: elasticsearch]

Test

[Image: test]

License

The MIT License (MIT). Please see LICENSE for more information.