jasonmk / datastax_rails

A Ruby-on-Rails interface to Datastax Enterprise. Replaces the majority of ActiveRecord functionality.
MIT License
23 stars 9 forks source link

= datastax_rails

A Ruby-on-Rails interface to Datastax Enterprise (specifically DSE Search nodes). Replaces the majority of ActiveRecord functionality.

This gem is based heavily on the excellent CassandraObject gem (https://github.com/data-axle/cassandra_object) as well as some work I initially did in the form of SolandraObject (https://github.com/jasonmk/solandra_object). We made the decision to move away from Solandra and to Datastax Enterprise, thus datastax_rails was born.

Significant changes from SolandraObject:

As of 2.4.0, DSE 4.7 is supported and live indexing is enabled by default. If you are using any version prior to DSE 4.7 or do not wish to use live indexing, you will want to override the following setting:

DatastaxRails::Base.live_indexing = false

=== Usage Note

Before using this gem, you should probably take a strong look at the type of problem you are trying to solve. Cassandra is primarily designed as a solution to Big Data problems. This gem is not. You will notice that it still carries a lot of relational logic with it. We are using DSE to solve a replication problem more so than a Big Data problem. That's not to say that this gem won't work for Big Data problems, it just might not be ideal. You've been warned...

=== Getting started

First add it to your Gemfile:

gem 'datastax_rails'

Configure the config/datastax.yml file:

development: servers: ["127.0.0.1"] port: 9042 ssl: true keyspace: "_development" strategy_class: "org.apache.cassandra.locator.NetworkTopologyStrategy" strategy_options: {"DC1": "1"} connection_options: timeout: 10 solr: port: 8983 path: /solr

The above is configured to use NetworkTopologyStrategy. If you go with this, you'll need to configure Datastax to use the NetworkTopologySnitch and set up the cassandra-topology.properties file. See the Datastax documentation for more information.

For a more simple, single datacenter setup, something like this should probably work:

development: servers: ["127.0.0.1"] port: 9042 ssl: false keyspace: "datastax_rails_development" strategy_class: "org.apache.cassandra.locator.SimpleStrategy" strategy_options: {"replication_factor": "1"} connection_options: timeout: 10 solr: port: 8983 path: /solr

See DatastaxRails::Connection::ClassMethods for a description of what options are available.

Create your keyspace:

rake ds:create rake ds:create RAILS_ENV=test

Once you've created some models, the following will upload the solr schemas and create the column families:

rake ds:migrate rake ds:migrate RAILS_ENV=test

It is safe to run ds:migrate over and over. In fact, it is necessary to re-run it any time you change the attributes on any model. DSR will only upload schema files if they have changed.

Create a sample Model. See Base documentation for more details:

class Person < DatastaxRails::Base uuid :id string :first_name string :user_name text :bio date :birthdate boolean :active timestamps end

=== Different types of models

In addition to the simple model above, there are several special case models that take advantage of special features of Cassandra. For examples of using those, see the following classes:

=== Known issues

When doing a grouped query, the total groups reported is only the total groups included in your query, not the total matching groups (if you're paginating).
This is due to the way solr sharding works with grouped queries.

=== More information

The documentation for {DatastaxRails::Base} and {DatastaxRails::SearchMethods} will give you quite a few examples of things you can do. You can find a copy of the latest documentation on line at http://rdoc.info/github/jasonmk/datastax_rails/master/frames.