aronluigi / sails-cbes

MIT License

Couchbase Elasticsearch Sails.js adapter

Provides easy access to Couchbase and Elasticsearch from Sails.js & Waterline.

This module is a Waterline/Sails adapter. Its goal is to provide a set of declarative interfaces, conventions, and best practices for integrating with all sorts of data sources: not just databases, but also external APIs, proprietary web services, or even hardware.

For Go lovers - go-cbes

Installation

To install this adapter, run:

$ npm install sails-cbes

Before start keep in mind that

Example model with an Elasticsearch mapping:

module.exports = {
    attributes: {
        firstName: 'string',
        lastName: 'string',
        email: {
            type: 'string',
            defaultsTo: 'e@test.com'
        },
        avatar: 'binary',
        title: 'string',
        phone: 'string',
        type: 'string',
        favoriteFruit: {
            defaultsTo: 'blueberry',
            type: 'string'
        },
        age: 'integer', // integer field that's not auto-incrementable
        dob: 'datetime',
        status: {
            type: 'boolean',
            defaultsTo: false
        },
        percent: 'float',
        list: 'array',
        obj: 'json',
        fullName: function () {
            return this.firstName + ' ' + this.lastName;
        }
    },

    mapping: {
        "_all": {
            "enabled": false
        },
        firstName: {
            type: 'string',
            analyzer: 'whitespace',
            fields: {
                raw: {
                    type: 'string',
                    index: 'not_analyzed'
                }
            }
        },
        lastName: {
            type: 'string',
            analyzer: 'whitespace'
        },
        email: {
            type: 'string',
            analyzer: 'standard'
        },
        avatar: {
            type: 'binary'
        },
        title: {
            type: 'string',
            analyzer: 'whitespace',
        },
        phone: {
            type: 'string',
            analyzer: 'keyword'
        },
        type: {
            type: 'string',
            analyzer: 'keyword'
        },
        favoriteFruit: {
            type: 'string',
            analyzer: 'whitespace'
        },
        age: {
            type: 'integer',
            index: 'not_analyzed'
        },
        createdAt: {
            type: 'date',
            format: 'dateOptionalTime'
        },
        updatedAt: {
            type: 'date',
            format: 'dateOptionalTime'
        },
        status: {
            type: 'boolean'
        },
        percent: {
            type: 'float'
        },
        obj: {
            type: 'object'
        }
    }
};

Configuration

{
    //couchbase
    cb: {
        host: 'localhost',
        port: 8091,
        user: 'user',
        version: '3.0.3',
        pass: 'password',
        operationTimeout: 60 * 1000, // 60s

        bucket: {
            name: 'bucket',
            pass: 'bucketPassword'
        }
    },

    //elasticsearch
    es: {
        host: ['127.0.0.1:9200'],
        log: 'error',
        index: 'index',
        numberOfShards: 5,
        requestTimeout: 30000,
        numberOfReplicas: 1
    }
},

Usage

This adapter exposes the following methods:

find()

This method accepts an Elasticsearch filtered query. Send only the filtered.filter part of the query!

var elasticsearchFilterQuery = {
    bool: {
        must: [
            {
                term: {
                    type: 'createEach'
                }
            },
            {
                terms: {
                    firstName: ['createEach_1', 'createEach_2']
                }
            }
        ]
    }
};

Semantic.User.find()
    .where(elasticsearchFilterQuery)
    .skip(0)
    .limit(10)
    .sort({createdAt: 'desc'})
    .exec(function(err, res){
        // do something
    });

If you do not pass a query to find(), it uses a Couchbase view and returns the entire collection.

This is the generated Elasticsearch query for the example above:

query: {
    filtered: {
        query: {
            bool: {
                must: [{
                    term: {
                        _type: {
                            value: modelType
                        }
                    }
                }]
            }
        },
        filter: {
            bool: {
                must: [
                    {
                        term: {
                            type: 'createEach'
                        }
                    },
                    {
                        terms: {
                            firstName: ['createEach_1', 'createEach_2']
                        }
                    }
                ]
            }
        }
    },
    size: 10,
    from: 0,
    sort: [
        {
            createdAt: {
                order: 'desc'
            }
        }
    ]
}
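
The way skip(), limit(), and sort() show up as from, size, and sort in the generated query can be sketched with a small helper. This is an illustration only: toSearchOptions is a hypothetical name and is not part of the adapter, whose internal code may differ.

```javascript
// Hypothetical helper (not part of sails-cbes): maps Waterline-style paging
// and sorting criteria to the size/from/sort entries of the generated query.
function toSearchOptions(criteria) {
    var sort = criteria.sort || {};
    return {
        size: criteria.limit,
        from: criteria.skip || 0,
        // { createdAt: 'desc' } becomes [{ createdAt: { order: 'desc' } }]
        sort: Object.keys(sort).map(function (field) {
            var entry = {};
            entry[field] = { order: sort[field] };
            return entry;
        })
    };
}

var searchOptions = toSearchOptions({ skip: 0, limit: 10, sort: { createdAt: 'desc' } });
```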

where()

If your query is an 'or' query, you must use the 'OR' key, as in the example below:

var query = {
    OR: {
        filters: []
    }
};
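
The example above leaves the filters array empty. A populated version might look like the sketch below, assuming the array takes the same filter clauses used with find(); that is an assumption, since the accepted contents are not spelled out here.

```javascript
// Sketch only: assumes OR.filters holds ordinary Elasticsearch filter clauses.
var orQuery = {
    OR: {
        filters: [
            { term: { firstName: 'createEach_1' } },
            { term: { firstName: 'createEach_2' } }
        ]
    }
};

// It would then be passed like any other query, e.g.:
// Semantic.User.find().where(orQuery).exec(function (err, res) { /* ... */ });
```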

findOne()

This method accepts an Elasticsearch filtered query. Send only the filtered.filter part of the query!

var elasticsearchFilterQuery = {
    bool: {
        must: [
            {
                term: {
                    type: 'findOne'
                }
            }
        ]
    }
};

Semantic.User.findOne(elasticsearchFilterQuery).exec(function(err, res){
    // do something
});

create()

Semantic.User.create({ firstName: 'createEach_1', type: 'createEach' }, function(err, res) {
    // do something
})

Create document with custom ID

You must set the model "_ID_" attribute!

Model example:

    attributes: {
        _ID_: 'string',
        firstName: 'string',
        lastName: 'string',
        email: {
            type: 'string',
            defaultsTo: 'e@test.com'
        }
    },

    mapping: {
        "_all": {
            "enabled": false
        },
        firstName: {
            type: 'string',
            analyzer: 'whitespace',
            fields: {
                raw: {
                    type: 'string',
                    index: 'not_analyzed'
                }
            }
        },
        lastName: {
            type: 'string',
            analyzer: 'whitespace'
        },
        email: {
            type: 'string',
            analyzer: 'standard'
        }
    }

Semantic.User.create({_ID_: 'testCustomID123'}, function(err, users) {
    // do something
});

createEach()

var usersArray = [
    { firstName: 'createEach_1', type: 'createEach' },
    { firstName: 'createEach_2', type: 'createEach' }
];
Semantic.User.createEach(usersArray, function(err, res) {
    // do something
})

update()

This method accepts an Elasticsearch filtered query. Send only the filtered.filter part of the query!

See the find() method for details.

var elasticsearchFilterQuery = {
    bool: {
        must: [
            {
                term: {
                    type: 'update'
                }
            },
            {
                term: {
                    firstName: 'update_1'
                }
            }
        ]
    }
};

Semantic.User.update(elasticsearchFilterQuery, {lastName: 'updated'}).exec(function(err, res){
    // do something
});
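
The bool/must filters used with find(), findOne(), update(), and destroy() all follow the same pattern, so they can be generated from a plain object. The helper below is a sketch; mustMatch is a hypothetical name and is not provided by the adapter.

```javascript
// Hypothetical helper (not part of sails-cbes): turns a plain object into a
// bool/must filter. Scalar values become `term` clauses, arrays become `terms`.
function mustMatch(fields) {
    var clauses = Object.keys(fields).map(function (name) {
        var kind = Array.isArray(fields[name]) ? 'terms' : 'term';
        var clause = {};
        clause[kind] = {};
        clause[kind][name] = fields[name];
        return clause;
    });
    return { bool: { must: clauses } };
}

// Same filter as the update() example above.
var updateFilter = mustMatch({ type: 'update', firstName: 'update_1' });
```

Passing an array, as in mustMatch({ firstName: ['createEach_1', 'createEach_2'] }), yields a terms clause like the one in the find() example.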

destroy()

This method accepts an Elasticsearch filtered query. Send only the filtered.filter part of the query!

See the find() method for details.

var elasticsearchFilterQuery = {
    bool: {
        must: [
            {
                term: {
                    type: 'getRawCollection'
                }
            }
        ]
    }
};

Semantic.User.destroy(elasticsearchFilterQuery).limit(999999).exec(function(err, res){
    // do something
});

getRawCollection()

This method returns raw data from the Couchbase view.

Semantic.User.getRawCollection(function(err, res){
    // do something
});

reindex()

This method synchronizes Couchbase and Elasticsearch by dropping the mapping (along with its entries) from Elasticsearch and reimporting the documents from Couchbase.

Semantic.User.reindex(function(err){
    // do something
});

aggregate()

This method returns aggregation results for the provided query and aggregation specification. Read more about aggregations in the Elasticsearch documentation. Unlike in Elasticsearch itself, the aggregations object must sit in the first layer inside the query object (rather than side by side with it), and only the "aggs" key is recognized ("aggregations" will not work). Note: the result is the unmodified JSON output of Elasticsearch.

Example usage:


var query = {
  "where" : {
    "and" : [
      {
        "or" : [
          {
            "term" : {
              "country" : "es"
            }
          },
          {
            "term" : {
              "country" : "pl"
            }
          }
        ]
      }
    ]
  }
}

var aggregations = {
  "account" : {
    "terms" : {
      "field" : "accountNumber"
    }
  },
  "currency" : {
    "terms" : {
      "field" : "currency"
    }
  }
}

query["aggs"] = aggregations;

Transaction.aggregate(query, function(err, res) {
  if (err) { return console.error(err); }
  // res holds the unmodified Elasticsearch JSON response
});
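
Because the result is the raw Elasticsearch response, the buckets of a terms aggregation have to be read from it by hand. A minimal sketch, assuming the standard Elasticsearch terms aggregation response shape (aggregations.<name>.buckets with key and doc_count); bucketCounts is a hypothetical helper and the sample response is made up for illustration.

```javascript
// Hypothetical helper: collects { key: doc_count } pairs for one named
// terms aggregation from a raw Elasticsearch response.
function bucketCounts(res, name) {
    var agg = (res.aggregations || {})[name];
    var counts = {};
    ((agg && agg.buckets) || []).forEach(function (bucket) {
        counts[bucket.key] = bucket.doc_count;
    });
    return counts;
}

// Made-up response, used only to exercise the helper.
var sampleResponse = {
    aggregations: {
        currency: {
            buckets: [
                { key: 'eur', doc_count: 3 },
                { key: 'pln', doc_count: 1 }
            ]
        }
    }
};

var currencyCounts = bucketCounts(sampleResponse, 'currency');
```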

Backup and Restore

backup()

var _options = {
    bucketSource: 'someBucket',
    user: 'username',
    version: '2.5.1',
    password: '1231%)_',
    threads: '4',
    cbUrl: 'http://...',
    mode: 'full',
    backupPath: 'backupPath'
};

Semantic.User.backup(_options, function(err, stderr){
    // do something
});

In the example above, every option except backupPath and mode is taken from the Sails connection config when omitted. This makes backupPath a required parameter; mode only works with version 3.0.x. Calling backup() from any model creates a backup of the whole bucket. For more information, read the cbbackup documentation.

restore()

This method restores a full backup of the entire collection to Couchbase and Elasticsearch.

var _options = {
    backupPath: 'backupPath'
};

Semantic.User.restore(_options, reindexDelay, function(err, stderror){
    // do something
});

The reindexDelay parameter is used to delay the reindexing of each bucket in Elasticsearch.

For more information read the cbrestore documentation.

Document expiration (ttl)

In order to use the document expiration functionality, the model should contain an additional attribute, "_ttl", as in the following example:

module.exports = {
    connection: 'sailsCbes',
    attributes: {
        foo: {
            type: 'string',
            defaultsTo: 'bar'
        },
        _ttl: {
            type: 'int',
            defaultsTo: 1000 * 60 * 10 // 10 min
        }
    },
    mapping: {
        foo : {
            type : 'string',
            analyzer : 'standard',
            index : 'analyzed'
        }
    }
};

The default value for _ttl must be specified as in the example above. A default of 0 means that documents do not expire by default.

Then the expiration timer can be specified for each document as follows:

var data = {
    foo  : 'newBar',
    _ttl : 1000 * 180
};
waterlineModel.create(data).exec(callback);

Sorting by script

For more advanced sorting, you can use Elasticsearch's sort-by-script feature. To enable it, first add the following line to your Elasticsearch configuration file:

script.disable_dynamic: false

Example:

Semantic.User.find()
    .where(elasticsearchFilterQuery)
    .skip(0)
    .limit(10)
    .sort({
        "_script": {
            "script": "doc['amount'].value / (exp(doc['decimals'].value * log(10)))",
            "lang": "groovy",
            "type": "number",
            "order": ord // 'asc' or 'desc'
        }
    })
    .exec(function(err, res){
        // do something
    });
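
To make the script above easier to read: exp(x * log(10)) equals 10 to the power x, so the sort value is amount divided by 10^decimals. The plain-JavaScript equivalent below is only an illustration of that arithmetic; scaledAmount is a hypothetical name, and the field names amount and decimals are taken from the script.

```javascript
// Plain-JavaScript equivalent of the Groovy sort script, for illustration only.
// Math.exp(decimals * Math.log(10)) is 10^decimals, so this returns
// amount / 10^decimals (up to floating-point rounding).
function scaledAmount(amount, decimals) {
    return amount / Math.exp(decimals * Math.log(10));
}

// An amount stored as 12345 with 2 decimals sorts as roughly 123.45.
var sortValue = scaledAmount(12345, 2);
```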

Development

Check out Connections in the Sails docs, or see the config/connections.js file in a new Sails project for information on setting up adapters.

Running the tests

In your adapter's directory, run:

$ npm test

License

MIT

Kreditech

© 2015 Kreditech / aronluigi & contributors: Mohammad Bagheri, Robert Savu, Tiago Amorim

Sails is free and open-source under the MIT License