
(Old/dead) An ActiveModel-compliant persistence framework for Ruby that uses Git for versioning and remote syncing.
MIT License

GitModel: distributed, versioned NoSQL for Ruby

http://github.com/pauldowman/gitmodel

GitModel is an ActiveModel-compliant persistence framework for Ruby that uses Git for versioning and remote syncing.

GitModel persists Ruby objects using Git as a data storage engine. It's an ActiveModel implementation so it works stand-alone or in Rails 3 as a drop-in replacement for ActiveRecord or DataMapper.

Because the database is a Git repository, it can be synced across multiple machines, manipulated with standard Git client tools, and branched and merged, and of course it keeps the history of all changes.

Why it's awesome

Status

It is not yet production ready but I'm working on it. Please feel free to contribute tests and/or code to help!

I will attempt to follow Semantic Versioning, so 1.0.0 will be considered the first stable release. Until then the API may change at any time.

See the "To do" section below for details, but the main thing that needs finishing is support for querying. Right now you can find an instance by its id, but there is incomplete support (90% complete) for querying, e.g.:

Post.find(:category => 'ruby', :date => lambda { |d| d > 1.month.ago }, :order_by => :date, :order => :asc, :limit => 5)

This includes support for indexing all attributes so that queries don't need to load every object.
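
To illustrate the indexing idea, here is a minimal sketch in plain Ruby of an inverted index that maps attribute values to object ids, so that an equality query only touches the index instead of loading every object. This is not GitModel's implementation; `AttributeIndex` and its methods are hypothetical names made up for this example.

```ruby
require 'set'

# Hypothetical toy index (not part of GitModel):
# attribute name -> attribute value -> set of object ids.
class AttributeIndex
  def initialize
    @index = Hash.new do |by_attr, attr|
      by_attr[attr] = Hash.new { |by_value, value| by_value[value] = Set.new }
    end
  end

  # Record every attribute value of one object under its id.
  def add(id, attributes)
    attributes.each { |attr, value| @index[attr][value] << id }
  end

  # Return the sorted ids of objects matching all equality conditions.
  def lookup(conditions)
    sets = conditions.map do |attr, value|
      @index.fetch(attr, {}).fetch(value, Set.new)
    end
    return [] if sets.empty?
    sets.reduce(:&).to_a.sort
  end
end

index = AttributeIndex.new
index.add('lessons-learned', { :category => 'ruby', :author => 'paul' })
index.add('hotdog-eating-contest', { :category => 'food', :author => 'paul' })
index.add('running-with-scissors', { :category => 'ruby', :author => 'paul' })

ruby_ids = index.lookup({ :category => 'ruby' })
puts ruby_ids
```

Only the objects whose ids come back from the index would then need to be loaded from the repository.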

Installation

It's available as a RubyGem:

> gem install gitmodel

Usage

require 'gitmodel'

# Choose where the database lives, then create it
GitModel.db_root = '/tmp/gitmodel-data'
GitModel.create_db!

class Post
  include GitModel::Persistable

  attribute :title
  attribute :body
  attribute :categories, :default => []
  attribute :allow_comments, :default => true

  blob :image
end

# Build a post, attach a blob, and save it
# (some_binary_data stands in for any binary string, e.g. image file contents)
p1 = Post.new(:id => 'lessons-learned', :title => 'Lessons learned', :body => '...')
p1.image = some_binary_data
p1.save!

# Load a post by its id
p = Post.find('lessons-learned')

# Attributes can be set after construction; extra blobs are keyed by file name
p2 = Post.new(:id => 'hotdog-eating-contest', :title => 'I won!')
p2.body = 'This weekend I won a hotdog eating contest!'
p2.image = some_binary_data
p2.blobs['hotdogs.jpg'] = some_binary_data
p2.blobs['the-aftermath.jpg'] = some_binary_data
p2.save!

# create! builds and saves in one step
p3 = Post.create!(:id => 'running-with-scissors', :title => 'Running with scissors', :body => '...')

p4 = Post.find('running-with-scissors')

class Comment
  include GitModel::Persistable
  attribute :text
end

c1 = Comment.create!(:id => '2010-01-03-328', :text => '...')
c2 = Comment.create!(:id => '2010-05-29-742', :text => '...')

An example of a project that uses GitModel is Balisong, a blogging app for coders. So far it only reads from the data store and never saves objects, on the assumption that posts will be edited with a text editor.

Database file structure

The database is stored in a human-editable format. Simply do "git checkout -f" and you'll see directories and files.

Each type of object is stored in a top-level directory (this is analogous to ActiveRecord tables), and each object is stored in a subdirectory which is named using the object's id (i.e. the primary key). Attributes that are Ruby types (strings, numbers, hashes, arrays, whatever) are stored in a file named attributes.json and binary attributes ("blobs") are stored in their own files.

For example, the database for the example above would have a directory structure that looks like this:
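
A rough sketch of that layout, built with plain Ruby file operations rather than GitModel itself. The top-level directory names `posts` and `comments` and the helper `write_object` are assumptions made for illustration. The one-directory-per-object structure, the attributes.json file and the per-blob files follow the description above.

```ruby
require 'fileutils'
require 'json'
require 'tmpdir'

# Hypothetical helper (not GitModel API): writes one object as
# <root>/<type_dir>/<id>/attributes.json plus one file per blob.
def write_object(root, type_dir, id, attributes, blobs = {})
  dir = File.join(root, type_dir, id)
  FileUtils.mkdir_p(dir)
  File.write(File.join(dir, 'attributes.json'), JSON.pretty_generate(attributes))
  blobs.each { |name, data| File.binwrite(File.join(dir, name), data) }
  dir
end

binary = "\x89PNG".b # stand-in for real image data

# Build the tree in a temporary directory and collect the relative paths.
layout = Dir.mktmpdir('gitmodel-layout') do |root|
  write_object(root, 'posts', 'lessons-learned',
               { 'title' => 'Lessons learned', 'body' => '...' },
               { 'image' => binary })
  write_object(root, 'posts', 'hotdog-eating-contest',
               { 'title' => 'I won!', 'body' => 'This weekend I won a hotdog eating contest!' },
               { 'image' => binary, 'hotdogs.jpg' => binary, 'the-aftermath.jpg' => binary })
  write_object(root, 'posts', 'running-with-scissors',
               { 'title' => 'Running with scissors', 'body' => '...' })
  write_object(root, 'comments', '2010-01-03-328', { 'text' => '...' })
  write_object(root, 'comments', '2010-05-29-742', { 'text' => '...' })

  Dir.glob('**/*', base: root).sort
end

puts layout
```

Printing `layout` lists each type directory, each object directory named by id, and the attributes.json and blob files inside it.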

Performance

GitModel supports memcached for query results. This is off by default, but can be configured like this:

GitModel.memcache_servers(['server_1', 'server_2', ...])
GitModel.memcache_namespace('optional_namespace')

The namespace is optional, and usually not necessary because GitModel will prepend the last segment of GitModel.db_root anyway.

A Git SHA is also prepended to every key, so that outdated versions will not be retrieved from the cache. This is the SHA of the latest commit, so every commit invalidates the whole cache and caching is unfortunately only useful when commits are infrequent. (This is obviously not ideal and I'm sure it can be improved upon.)
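
As a sketch of why every commit invalidates the cache, here is how such a key might be composed. The helper `cache_key`, the order of the parts and the "-" separator are all assumptions made for illustration, not GitModel's actual key format.

```ruby
# Hypothetical helper (not GitModel API): builds a cache key from the latest
# commit SHA, the optional namespace, the last segment of db_root, and the
# query-specific key. Ordering and separator are assumed.
def cache_key(head_sha, db_root, key, namespace = nil)
  parts = [head_sha, namespace, File.basename(db_root), key]
  parts.compact.join('-')
end

# The same query before and after a commit yields two different keys,
# so the entry cached under the old SHA is simply never read again.
old_key = cache_key('a1b2c3d', '/tmp/gitmodel-data', 'Post-find-ruby')
new_key = cache_key('e4f5a6b', '/tmp/gitmodel-data', 'Post-find-ruby')

puts old_key
puts new_key
```

Because the SHA is part of the key, no explicit cache flush is needed. Stale entries just stop being requested and eventually get evicted by memcached.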

There is still a lot of work to be done to make it faster. First, some analysis is required, but some guesses about things that would help are:

Contributing

Do you have an improvement to make? Please submit a pull request on GitHub or a patch, including a test written with RSpec. To run all tests simply run bundle exec autotest.

The main author is Paul Dowman (@pauldowman).

Thanks to everyone who has contributed so far:

To do

Bugs