neumino / rethinkdbdash

An advanced Node.js driver for RethinkDB with a connection pool, support for streams etc.
MIT License

rethinkdbdash

A Node.js driver for RethinkDB with more advanced features.

Install

npm install rethinkdbdash

Note: The rethinkdbdash-unstable package is a relic from the past (rethinkdb < 1.13).

Quick start

Rethinkdbdash uses almost the same API as the official driver. Please refer to the official driver's documentation for all the ReQL methods (the methods used to build the query).

The main differences are:

You need to execute the module when you import it:

var r = require('rethinkdbdash')();
// With the official driver:
// var r = require('rethinkdb');

Connections are managed by the driver's connection pool, so you do not need to call r.connect or pass a connection to run:

var r = require('rethinkdbdash')();
r.table('users').get('orphee@gmail.com').run().then(function(user) {
  // ...
}).error(handleError)

Cursors are coerced to arrays by default:

var r = require('rethinkdbdash')();
r.table('data').run().then(function(result) {
  assert(Array.isArray(result)) // true
  // With the official driver you need to call
  // result.toArray().then(function(result2) {
  //   assert(Array.isArray(result2))
  // })
});

Drop in

You can replace the official driver with rethinkdbdash by just replacing

var r = require('rethinkdb');

With:

var r = require('rethinkdbdash')({
  pool: false,
  cursor: true
});

If you want to take advantage of the connection pool, refer to the next section.

From the official driver

To switch from the official driver to rethinkdbdash and get the most out of it, here are the few things to do:

  1. Change the way to import the driver.

    var r = require('rethinkdb');

    To:

    var r = require('rethinkdbdash')();
    // Or if you do not connect to the default local instance:
    // var r = require('rethinkdbdash')({servers: [{host: ..., port: ...}]});
  2. Remove everything related to a connection:

    r.connect({host: ..., port: ...}).then(function(connection) {
      connection.on('error', handleError);
      query.run(connection).then(function(result) {
        // console.log(result);
        connection.close();
      });
    });

    Becomes:

    query.run().then(function(result) {
      // console.log(result);
    });
  3. Remove the methods related to the cursor. This typically involves removing toArray:

    r.table('data').run(connection).then(function(cursor) {
      cursor.toArray().then(function(result) {
        // console.log(result);
      });
    });

    Becomes:

    r.table('data').run().then(function(result) {
      // console.log(result);
    });

Using TLS Connections

Note: Support for a TLS proxy is experimental.

RethinkDB does not support TLS connections to the server yet, but in case you want to run it over an untrusted network and need encryption, you can easily run a TLS proxy on your server with:

var tls = require('tls');
var net = require('net');
var tlsOpts = {
  key: '', // Your private key
  cert: '' // Public certificate
};
tls.createServer(tlsOpts, function (encryptedConnection) {
  var rethinkdbConn = net.connect({
    host: 'localhost',
    port: 28015
  });
  encryptedConnection.pipe(rethinkdbConn).pipe(encryptedConnection);
}).listen(29015);

And then safely connect to it with the ssl option:

var r = require('rethinkdbdash')({
  port: 29015,
  host: 'place-with-no-firewall.com',
  ssl: true
});

ssl may also be an object that will be passed as the options argument to tls.connect.
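
As a sketch of the object form, the helper below builds driver options whose ssl field carries a ca entry, which is one of the standard tls.connect options. The helper name buildTlsOptions and the certificate path in the usage comment are assumptions made for illustration; they are not part of rethinkdbdash.

```javascript
// Hypothetical helper: builds the options object passed to require('rethinkdbdash')(...).
// `caPem` is the PEM text of the certificate the proxy presents (for example a
// self-signed one). It is forwarded to tls.connect through the `ssl` field.
function buildTlsOptions(host, port, caPem) {
  return {
    host: host,
    port: port,
    ssl: { ca: [caPem] }
  };
}

// Usage (not executed here, the path is a placeholder):
// var caPem = require('fs').readFileSync('/path/to/proxy-cert.pem', 'utf8');
// var r = require('rethinkdbdash')(
//   buildTlsOptions('place-with-no-firewall.com', 29015, caPem));
```

Passing the certificate explicitly is useful when the proxy uses a certificate that is not signed by a well-known authority.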

New features and differences

Rethinkdbdash ships with a few interesting features.

Importing the driver

When you import the driver, a default connection pool is created as soon as you execute the module (unless you pass {pool: false}). The options you can pass are:

In case of a single instance, you can directly pass host and port in the top level parameters.

Examples:

// connect to localhost:28015, and let the driver find other instances
var r = require('rethinkdbdash')({
    discovery: true
});

// connect to and only to localhost:28015
var r = require('rethinkdbdash')();

// Do not create a connection pool
var r = require('rethinkdbdash')({pool: false});

// Connect to a cluster seeding from `192.168.0.100`, `192.168.0.101`, `192.168.0.102`
var r = require('rethinkdbdash')({
    servers: [
        {host: '192.168.0.100', port: 28015},
        {host: '192.168.0.101', port: 28015},
        {host: '192.168.0.102', port: 28015},
    ]
});

// Connect to a cluster containing `192.168.0.100`, `192.168.0.101`, `192.168.0.102` and
// use a maximum of 3000 connections and try to keep 300 connections available at all times.
var r = require('rethinkdbdash')({
    servers: [
        {host: '192.168.0.100', port: 28015},
        {host: '192.168.0.101', port: 28015},
        {host: '192.168.0.102', port: 28015},
    ],
    buffer: 300,
    max: 3000
});

You can also pass {cursor: true} if you want to retrieve RethinkDB streams as cursors and not arrays by default.

Note: The option {stream: true} that asynchronously returns a stream is deprecated. Use toStream instead.

Note: The option {optionalRun: false} will disable the optional run for all instances of the driver.

Note: Connections are created with TCP keep alive turned on, but some routers seem to ignore this setting. To make sure that your connections are kept alive, set the pingInterval to the interval in seconds you want the driver to ping the connection.

Note: The error __rethinkdbdash_ping__ is used for internal purposes (ping). Do not use it.
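
The options above are often assembled from configuration rather than hard-coded. The following is a minimal sketch of that idea; the helper names, the "host:port,host:port" string format, the RETHINKDB_SERVERS variable and the chosen values are all assumptions for illustration.

```javascript
// Hypothetical helper: turns a string such as
// "192.168.0.100:28015,192.168.0.101" into a `servers` array.
// Entries without a port fall back to `defaultPort`.
function parseServers(list, defaultPort) {
  return list.split(',').map(function(entry) {
    var parts = entry.trim().split(':');
    return {
      host: parts[0],
      port: parts[1] ? parseInt(parts[1], 10) : defaultPort
    };
  });
}

// Hypothetical helper: combines the servers with pool settings.
function buildOptions(list) {
  return {
    servers: parseServers(list, 28015),
    buffer: 300,      // connections the pool tries to keep available
    max: 3000,        // maximum number of connections
    pingInterval: 60  // seconds between pings on each connection
  };
}

// Usage (not executed here):
// var r = require('rethinkdbdash')(buildOptions(process.env.RETHINKDB_SERVERS));
```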

Connection pool

As mentioned before, rethinkdbdash has a connection pool and manages all the connections itself. The connection pool is initialized as soon as you execute the module.

You should never have to worry about connections in rethinkdbdash. Connections are created as they are needed, and in case of a host failure, the pool will try to open connections with an exponential back off algorithm.
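
To make the term concrete, here is a generic illustration of exponential back off. It is not rethinkdbdash's actual code: the function name, the base delay, the cap and the doubling factor are assumptions chosen for the example.

```javascript
// Generic exponential back off: the delay doubles after each consecutive
// failure and is capped at `capMs`.
function backoffDelay(attempt, baseMs, capMs) {
  var delay = baseMs * Math.pow(2, attempt);
  return Math.min(delay, capMs);
}

// Delays for six consecutive failures with a 100 ms base and a 2 s cap.
var delays = [0, 1, 2, 3, 4, 5].map(function(attempt) {
  return backoffDelay(attempt, 100, 2000);
});
console.log(delays); // [ 100, 200, 400, 800, 1600, 2000 ]
```

The point of the growing delay is that a host that stays down is retried less and less often instead of being hit with a constant stream of connection attempts.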

The driver executes one query per connection. Now that rethinkdb/rethinkdb#3296 is solved, this behavior may be changed in the future.

Because the connection pool will keep some connections available, a script will not terminate. If you have finished executing your queries and want your Node.js script to exit, you need to drain the pool with:

r.getPoolMaster().drain();
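
For a one-off script, a small wrapper can make sure the pool is drained whether the query succeeds or fails. This is a sketch only; the helper name runThenDrain is an assumption, and `r` and `query` stand for a driver instance and any ReQL term built from it.

```javascript
// Runs one query, drains the pool afterwards, and passes the result or the
// error through unchanged.
function runThenDrain(r, query) {
  return query.run().then(
    function(result) {
      r.getPoolMaster().drain();
      return result;
    },
    function(err) {
      r.getPoolMaster().drain();
      throw err;
    }
  );
}

// Usage (not executed here):
// var r = require('rethinkdbdash')();
// runThenDrain(r, r.table('users').count()).then(console.log);
```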

The pool master by default will log all errors/new states on stderr. If you do not want to pollute stderr, pass silent: true when you import the driver and provide your own log method.

var r = require('rethinkdbdash')({
  silent: true,
  log: function(message) {
    console.log(message);
  }
});

Advanced details about the pool

The pool is composed of a PoolMaster that retrieves connections for n pools where n is the number of servers the driver is connected to. Each pool is connected to a unique host.

To access the pool master, you can call the method r.getPoolMaster().

The pool emits a few events:

You can get the number of connections (opened or being opened).

r.getPoolMaster().getLength();

You can also get the number of available connections (idle connections, without a query running on it).

r.getPoolMaster().getAvailableLength();
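
The two counters can be combined into a small summary, for example for logging. The helper below is a sketch; its name and the shape of the returned object are assumptions.

```javascript
// Summarizes pool usage. `poolMaster` is the object returned by
// r.getPoolMaster().
function poolStats(poolMaster) {
  var total = poolMaster.getLength();              // opened or being opened
  var available = poolMaster.getAvailableLength(); // idle connections
  return {
    total: total,
    available: available,
    unavailable: total - available // running a query or still opening
  };
}

// Usage (not executed here):
// console.log(poolStats(r.getPoolMaster()));
```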

You can also drain the pool, as mentioned earlier, with:

r.getPoolMaster().drain();

You can access all the pools with:

r.getPoolMaster().getPools();

The pool master emits the healthy event when its state changes. Its state is defined as:

A pool is healthy if it has at least one available connection, or if it was just created and opening a connection has not failed.

r.getPoolMaster().on('healthy', function(healthy) {
  if (healthy === true) {
    console.log('We can run queries.');
  }
  else {
    console.log('No queries can be run.');
  }
});
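
Building on the listener above, the last reported state can be stored so that other code can check it before issuing queries. This is a sketch; trackHealth is a hypothetical helper, and a plain EventEmitter stands in for the pool master so the example runs without a database.

```javascript
var EventEmitter = require('events').EventEmitter;

// Remembers the last value carried by the 'healthy' event.
function trackHealth(poolMaster) {
  var state = { healthy: null }; // null until the first event arrives
  poolMaster.on('healthy', function(healthy) {
    state.healthy = healthy === true;
  });
  return state;
}

// Stand-in emitter used to exercise the helper.
var fakePoolMaster = new EventEmitter();
var health = trackHealth(fakePoolMaster);
fakePoolMaster.emit('healthy', false);
console.log(health.healthy); // false
fakePoolMaster.emit('healthy', true);
console.log(health.healthy); // true
```

With the real driver, the call would be trackHealth(r.getPoolMaster()).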

Note about connections

If you do not wish to use rethinkdbdash's connection pool, you can implement your own. The connections created with rethinkdbdash emit a "release" event when they receive an error, an atom, or the end of a sequence (or the full sequence).

A connection can also emit a "timeout" event if the underlying connection times out.
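
As a rough sketch of how the "release" event could drive a custom pool, the example below only tracks which connections are idle. Creating connections, handling errors and reacting to "timeout" are left out. TinyPool and acquire are hypothetical names, and plain EventEmitter objects stand in for real connections.

```javascript
var EventEmitter = require('events').EventEmitter;

// Keeps a list of idle connections and puts a connection back on the list
// when it emits "release".
function TinyPool(connections) {
  var self = this;
  self.idle = connections.slice();
  connections.forEach(function(connection) {
    connection.on('release', function() {
      // The query on this connection finished, so it can be reused.
      if (self.idle.indexOf(connection) === -1) {
        self.idle.push(connection);
      }
    });
  });
}

// Returns an idle connection, or undefined if all of them are busy.
TinyPool.prototype.acquire = function() {
  return this.idle.shift();
};

// Stand-in connections to exercise the sketch.
var connA = new EventEmitter();
var connB = new EventEmitter();
var pool = new TinyPool([connA, connB]);
var first = pool.acquire();
pool.acquire();
console.log(pool.acquire()); // undefined, nothing is idle
first.emit('release');
console.log(pool.acquire() === first); // true
```

With real connections, a query would be run with query.run(connection) on the acquired connection, using a driver imported with {pool: false}.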

Arrays by default, not cursors

Rethinkdbdash automatically coerces cursors to arrays. If you need a raw cursor, you can call the run command with the option {cursor: true} or import the driver with {cursor: true}.

r.expr([1, 2, 3]).run().then(function(result) {
  console.log(JSON.stringify(result)) // prints [1,2,3]
})
r.expr([1, 2, 3]).run({cursor: true}).then(function(cursor) {
  cursor.toArray().then(function(result) {
    console.log(JSON.stringify(result)) // prints [1,2,3]
  });
})

Note: If a query returns a cursor, the connection will not be released as long as the cursor hasn't fetched everything or has been closed.

Readable streams

Readable streams can be synchronously returned with the toStream([connection]) method.

var fs = require('fs');
var file = fs.createWriteStream('file.txt');

var r = require('rethinkdbdash')();
r.table('users').toStream()
  .on('error', console.log)
  .pipe(file)
  .on('error', console.log)
  .on('finish', function() {
    r.getPoolMaster().drain();
  });

Note: The stream will emit an error if you provide it with a single value (streams, arrays and grouped data work fine).

Note: null values are currently dropped from streams.
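
When the destination expects text, such as a file, the documents need to be serialized first. The sketch below assumes the stream returned by toStream emits documents as JavaScript objects and writes them as newline-delimited JSON; the helper names are assumptions, and only Node's stream module is used.

```javascript
var Transform = require('stream').Transform;

// One document per line (newline-delimited JSON).
function toJsonLine(doc) {
  return JSON.stringify(doc) + '\n';
}

// Transform that accepts objects and emits text, so it can sit between
// r.table(...).toStream() and a file stream.
function createJsonLineSerializer() {
  return new Transform({
    writableObjectMode: true,
    transform: function(doc, encoding, callback) {
      callback(null, toJsonLine(doc));
    }
  });
}

// Usage (not executed here):
// r.table('users').toStream()
//   .pipe(createJsonLineSerializer())
//   .pipe(require('fs').createWriteStream('users.json'));
```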

Writable and Transform streams

You can create a Writable or Transform stream by calling toStream([connection, ]{writable: true}) or toStream([connection, ]{transform: true}) on a table.

By default, a transform stream will return the saved documents. You can return the primary key of the new document by passing the option format: 'primaryKey'.

This makes a convenient way to dump a file into your database.

var fs = require('fs');
var file = fs.createReadStream('users.json');
var table = r.table('users').toStream({writable: true});

// transformer would be a Transform stream that splits the input per line and calls JSON.parse
file.pipe(transformer)
    .pipe(table)
    .on('finish', function() {
        console.log('Done');
        r.getPoolMaster().drain();
    });
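
The transformer mentioned in the comment above could look like the following sketch. It assumes the input file holds one JSON document per line; createLineParser and createJsonLineParserStream are hypothetical names, and only Node's stream and string_decoder modules are used.

```javascript
var Transform = require('stream').Transform;
var StringDecoder = require('string_decoder').StringDecoder;

// Splits text on newlines and parses each non-empty line as JSON.
// `feed` returns the documents completed by this chunk and keeps the
// unfinished tail for the next call; `flush` parses whatever is left.
function createLineParser() {
  var tail = '';
  function parseLines(lines) {
    return lines.filter(function(line) {
      return line.trim() !== '';
    }).map(function(line) {
      return JSON.parse(line);
    });
  }
  return {
    feed: function(text) {
      var lines = (tail + text).split('\n');
      tail = lines.pop(); // the last piece may be an incomplete line
      return parseLines(lines);
    },
    flush: function() {
      var rest = tail;
      tail = '';
      return parseLines([rest]);
    }
  };
}

// Wraps the parser in a Transform that emits one object per line.
function createJsonLineParserStream() {
  var parser = createLineParser();
  var decoder = new StringDecoder('utf8'); // keeps multi-byte characters intact
  return new Transform({
    readableObjectMode: true,
    transform: function(chunk, encoding, callback) {
      var self = this;
      try {
        parser.feed(decoder.write(chunk)).forEach(function(doc) { self.push(doc); });
        callback();
      } catch (err) {
        callback(err);
      }
    },
    flush: function(callback) {
      var self = this;
      try {
        parser.feed(decoder.end()).concat(parser.flush())
          .forEach(function(doc) { self.push(doc); });
        callback();
      } catch (err) {
        callback(err);
      }
    }
  });
}

// Usage (not executed here):
// var transformer = createJsonLineParserStream();
```

A line that is not valid JSON makes the stream emit an error, so an 'error' listener on the transformer is advisable.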

Optional run with yield

The then and catch methods are implemented on a Term, the object returned by methods such as filter and update. They are shortcuts for this.run().then(callback) and this.run().catch(callback).

This means that you can yield any query without calling run.

var assert = require('assert');
var bluebird = require('bluebird');
var r = require('rethinkdbdash')();

// bluebird.coroutine returns a function; call it to start the coroutine.
bluebird.coroutine(function*() {
  try {
    var result = yield r.table('users').get('orphee@gmail.com').update({name: 'Michel'});
    assert.equal(result.errors, 0);
  } catch(err) {
    console.log(err);
  }
})();

Note: You have to start Node >= 0.11 with the --harmony flag.
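
Because a term exposes a then method, it behaves as a thenable, so on Node versions that support async functions it should also be possible to await a query directly. The sketch below demonstrates the mechanism with a stand-in object rather than the real driver; fakeTerm and updateName are hypothetical names.

```javascript
// Stand-in for a ReQL term: an object whose `then` forwards to run(),
// which is how Term is described above. This is not rethinkdbdash code.
function fakeTerm(value) {
  return {
    run: function() { return Promise.resolve(value); },
    then: function(onFulfilled, onRejected) {
      return this.run().then(onFulfilled, onRejected);
    }
  };
}

async function updateName() {
  // With the real driver this line would be, for example:
  // var result = await r.table('users').get('orphee@gmail.com').update({name: 'Michel'});
  var result = await fakeTerm({ errors: 0, replaced: 1 });
  return result.errors;
}

updateName().then(function(errors) {
  console.log(errors); // 0
});
```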

Global default values

You can set the maximum nesting level and maximum array length on all your queries with:

r.setNestingLevel(<number>)
r.setArrayLimit(<number>)

Undefined values

Rethinkdbdash will ignore the keys/values where the value is undefined instead of throwing an error like the official driver.
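
The effect can be pictured with the following illustration. It is not the driver's code: stripUndefined is a hypothetical function, and recursing into nested plain objects and arrays is an assumption of this sketch.

```javascript
// Returns a copy of `value` in which keys whose value is undefined are
// dropped from plain objects. Other values are returned as they are.
function stripUndefined(value) {
  if (Array.isArray(value)) {
    return value.map(stripUndefined);
  }
  if (Object.prototype.toString.call(value) === '[object Object]') {
    var result = {};
    Object.keys(value).forEach(function(key) {
      if (value[key] !== undefined) {
        result[key] = stripUndefined(value[key]);
      }
    });
    return result;
  }
  return value;
}

console.log(JSON.stringify(stripUndefined({ name: 'Michel', age: undefined })));
// {"name":"Michel"}
```

In practice this means a document such as {name: 'Michel', age: undefined} is sent as if the age key were not there.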

Better errors

Backtraces

If your query fails, the driver will return an error with a backtrace; your query will be printed and the broken part will be highlighted.

Backtraces in rethinkdbdash are tested and properly formatted. Typically, long backtraces are split on multiple lines and if the driver cannot serialize the query, it will provide a better location of the error.

Arity errors

The server may return confusing error messages when the wrong number of arguments is provided (See rethinkdb/rethinkdb#2463 to track progress). Rethinkdbdash tries to make up for it by catching errors before sending the query to the server if possible.
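
A client-side arity check of this kind can be sketched as follows. This is an illustration only; the function name, its parameters and the message format are assumptions and do not reproduce rethinkdbdash's actual errors.

```javascript
// Throws before anything is sent if the number of arguments is outside
// the range [min, max].
function checkArity(method, args, min, max) {
  if (args.length < min || args.length > max) {
    var expected = min === max ? String(min) : 'between ' + min + ' and ' + max;
    throw new Error('`' + method + '` takes ' + expected + ' argument(s), ' +
      args.length + ' provided.');
  }
}

try {
  checkArity('get', [], 1, 1); // as if get were called without a key
} catch (err) {
  console.log(err.message); // `get` takes 1 argument(s), 0 provided.
}
```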

Performance

The tree representation of the query is built step by step and stored, which avoids recomputing it if the query is re-run.

The code was partially optimized for V8 and is written in pure JavaScript, which avoids errors like issue #2839.

Run tests

Update test/config.js if your RethinkDB instance doesn't run on the default parameters.

Make sure you run a version of Node that supports generators and run:

npm test

Longer tests for the pool:

mocha long_test/discovery.js -t 50000
mocha long_test/static.js -t 50000

Tests are also run on Wercker.

FAQ

Browserify

To build the browser version of rethinkdbdash, run:

node browserify.js