mirage / irmin

Irmin is a distributed database that follows the same design principles as Git
https://irmin.org
ISC License
A Distributed Database Built on the Same Principles as Git

[![OCaml-CI Build Status](https://img.shields.io/endpoint?url=https%3A%2F%2Fci.ocamllabs.io%2Fbadge%2Fmirage%2Firmin%2Fmain&logo=ocaml&style=flat-square)](https://ci.ocamllabs.io/github/mirage/irmin) [![codecov](https://codecov.io/gh/mirage/irmin/branch/main/graph/badge.svg?token=n4mWfgURqT)](https://codecov.io/gh/mirage/irmin) [![GitHub release (latest by date)](https://img.shields.io/github/v/release/mirage/irmin?style=flat-square&color=09aa89)](https://github.com/mirage/irmin/releases/latest) [![docs](https://img.shields.io/badge/doc-online-blue.svg?style=flat-square)](https://mirage.github.io/irmin/)

Irmin is an OCaml library for building mergeable, branchable distributed data stores.

Irmin is based on distributed version-control systems (DVCSs), which are used extensively in software development to track data provenance and show modifications to source code. Irmin applies DVCS principles to large-scale distributed data and provides operations similar to Git's (clone, push, pull, branch, rebase). The Git workflow was initially designed for humans to manage changes within source code. Irmin scales this to handle automated programs performing a very high number of operations per second, with fully automated conflict handling.
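To make the idea of automated conflict handling concrete, here is a minimal sketch of a Git-style three-way merge over a string-keyed map. It uses only the OCaml standard library. The names `merge_value` and `merge` are hypothetical, chosen for this illustration, and this is not Irmin's actual implementation.

```ocaml
module M = Map.Make (String)

(* Three-way merge of a single optional value.
   [old] is the value in the common ancestor; [None] means "absent". *)
let merge_value ~old a b =
  if a = b then Ok a (* both sides agree *)
  else if a = old then Ok b (* only [b] changed *)
  else if b = old then Ok a (* only [a] changed *)
  else Error "both sides changed"

(* Merge two maps key by key against their common ancestor [old]. *)
let merge ~old a b =
  let keys m = List.map fst (M.bindings m) in
  let all_keys = List.sort_uniq String.compare (keys old @ keys a @ keys b) in
  List.fold_left
    (fun acc k ->
      match acc with
      | Error _ -> acc
      | Ok m -> (
          let find = M.find_opt k in
          match merge_value ~old:(find old) (find a) (find b) with
          | Ok None -> Ok m (* deleted or never present *)
          | Ok (Some v) -> Ok (M.add k v m)
          | Error msg -> Error (k ^ ": " ^ msg)))
    (Ok M.empty) all_keys

let () =
  let of_list l = M.of_seq (List.to_seq l) in
  let old = of_list [ ("x", 1) ] in
  let a = of_list [ ("x", 1); ("y", 2) ] (* branch A adds y *)
  and b = of_list [ ("x", 3) ] (* branch B updates x *) in
  match merge ~old a b with
  | Ok m -> M.iter (Printf.printf "%s = %d\n") m
  | Error e -> prerr_endline ("conflict on " ^ e)
```

Because the two branches touch different keys, the merge succeeds without intervention. A conflict is reported only when both branches change the same key to different values.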

Irmin is highly customisable. Users can define their own types to store application-specific values. They can also define custom storage layers (in memory, on disk, in a remote Redis database, in the browser, etc.). Finally, Irmin contains an event-driven API to define programmable dynamic behaviours and to program distributed dataflow pipelines.
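As an illustration of an application-specific value type, the sketch below defines a mergeable counter. It assumes the `Irmin.Contents.S` signature and the `Irmin.Merge` combinators of the Irmin 3.x series, and the module name `Counter` is chosen for this example only. Check the API documentation for the version you have installed.

```ocaml
(* A counter stored as an int64. Concurrent updates are combined by
   applying both sides' changes relative to the common ancestor. *)
module Counter : Irmin.Contents.S with type t = int64 = struct
  type t = int64

  (* Runtime type representation, used by Irmin for serialisation *)
  let t = Irmin.Type.int64

  let merge ~old a b =
    let open Irmin.Merge.Infix in
    old () >|=* fun old ->
    (* A missing ancestor is treated as zero *)
    let old = match old with None -> 0L | Some o -> o in
    Int64.(sub (add a b) old)

  (* Lift the merge function to optional values (absent keys) *)
  let merge = Irmin.Merge.(option (v t merge))
end
```

Passing a module of this shape to a store functor, in place of `Irmin.Contents.String` in the Example section, should give a store whose values merge automatically instead of raising conflicts.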

Irmin was created at the University of Cambridge in 2013 to be the default storage layer for MirageOS applications (both to store and orchestrate unikernel binaries and the data that these unikernels are using). As such, Irmin is not, strictly speaking, a complete database engine. Instead, similarly to other MirageOS components, it is a collection of libraries designed to solve different flavours of the challenges raised by the CAP Theorem. Each application can select the right combination of libraries to solve its particular distributed problem.

Irmin is built on a core of well-defined, low-level data structures that dictate how data should be persisted and shared across nodes. It defines algorithms for efficient synchronisation of those distributed low-level constructs. It also builds a collection of higher-level data structures that developers can use without knowing precisely how Irmin works underneath. Some of these components even have formal semantics, including Conflict-free Replicated Data-Types (CRDT). Since it's a part of MirageOS, Irmin does not make strong assumptions about the OS environment, which makes the system very portable. It works well for in-memory databases and slower persistent serialisation, such as SSDs, hard drives, web browser local storage, or even the Git file format.

Irmin is primarily developed and maintained by Tarides, with involvement by contributors from various organisations. External maintainers and contributors are welcome.

* [Features](#Features)
* [Documentation](#Documentation)
* [Installation](#Installation)
  * [Prerequisites](#Prerequisites)
  * [Development Version](#Development-Version)
* [Usage](#Usage)
  * [Example](#Example)
  * [Command Line](#Commandline)
* [Context](#Context)
  * [Irmin as a portable and efficient structured key-value store](#Irmin-as-a-portable-and-efficient-structured-keyvalue-store)
  * [Irmin as a distributed store](#Irmin-as-a-distributed-store)
  * [Irmin as a dataflow scheduler](#Irmin-as-a-dataflow-scheduler)
* [Issues](#Issues)
* [License](#License)
* [Acknowledgements](#Acknowledgements)

Features

Documentation

API documentation can be found online at https://mirage.github.io/irmin

Installation

Prerequisites

Please ensure that the minimum required versions of opam and OCaml are installed. Find the latest versions and installation instructions on ocaml.org.

To install Irmin with the command-line tool and all Unix backends using opam:

  opam install irmin-cli

A minimal installation containing the reference in-memory backend can be installed by running:

  opam install irmin

The following packages are available on opam:

To install a specific package, simply run:

  opam install <package-name>

Development Version

To install the development version of Irmin in your current opam switch, clone this repository and opam install the packages inside:

  git clone https://github.com/mirage/irmin
  cd irmin/
  opam install .

Usage

Example

Below is a simple example of setting a key and getting the value out of a Git-based, filesystem-backed store.

open Lwt.Syntax

(* Irmin store with string contents *)
module Store = Irmin_git_unix.FS.KV (Irmin.Contents.String)

(* Database configuration *)
let config = Irmin_git.config ~bare:true "/tmp/irmin/test"

(* Commit author *)
let author = "Example <example@example.com>"

(* Commit information *)
let info fmt = Irmin_git_unix.info ~author fmt

let main =
  (* Open the repo *)
  let* repo = Store.Repo.v config in

  (* Load the main branch *)
  let* t = Store.main repo in

  (* Set key "foo/bar" to "testing 123" *)
  let* () =
    Store.set_exn t ~info:(info "Updating foo/bar") [ "foo"; "bar" ]
      "testing 123"
  in

  (* Get key "foo/bar" and print it to stdout *)
  let+ x = Store.get t [ "foo"; "bar" ] in
  Printf.printf "foo/bar => '%s'\n" x

(* Run the program *)
let () = Lwt_main.run main

The example is contained in examples/readme.ml. It can be compiled and executed with Dune:

$ dune build examples/readme.exe
$ dune exec examples/readme.exe
foo/bar => 'testing 123'

The examples directory also contains more advanced examples, which can be executed in the same way.
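The Git-like branching workflow can be sketched in the same style. The snippet below reuses the store and commit-info helpers from the example above and assumes that `Store.clone`, `Store.merge_into` and `Store.find` behave as documented for the Irmin 3.x series. The repository path and the branch name `wip` are hypothetical, chosen for this sketch only.

```ocaml
open Lwt.Syntax

(* Same kind of store and commit information as in the example above *)
module Store = Irmin_git_unix.FS.KV (Irmin.Contents.String)

(* Hypothetical repository path, used only by this sketch *)
let config = Irmin_git.config ~bare:true "/tmp/irmin/branch-example"
let info fmt = Irmin_git_unix.info ~author:"Example <example@example.com>" fmt

let run () =
  let* repo = Store.Repo.v config in
  let* main = Store.main repo in
  let* () = Store.set_exn main ~info:(info "Add a") [ "a" ] "from main" in

  (* Fork a branch named "wip" from the current head of main *)
  let* wip = Store.clone ~src:main ~dst:"wip" in

  (* The two branches now diverge on different keys *)
  let* () = Store.set_exn wip ~info:(info "Add b") [ "b" ] "from wip" in
  let* () = Store.set_exn main ~info:(info "Add c") [ "c" ] "from main" in

  (* Merge wip back into main and read a key that only wip had set *)
  let* result = Store.merge_into ~info:(info "Merge wip") wip ~into:main in
  match result with
  | Error (`Conflict msg) -> failwith ("merge conflict: " ^ msg)
  | Ok () -> Store.find main [ "b" ]

let () =
  match Lwt_main.run (run ()) with
  | Some v -> Printf.printf "b => '%s'\n" v
  | None -> print_endline "b is not set"
```

Since the branches modify disjoint keys, the merge is expected to succeed without a conflict, and the key written on `wip` becomes visible on the main branch.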

Command Line

The same thing can also be accomplished using irmin, the command-line application installed with irmin-cli, by running:

$ echo "root: ." > irmin.yml
$ irmin init
$ irmin set foo/bar "testing 123"
$ irmin get foo/bar
testing 123

irmin.yml allows for irmin flags to be set on a per-directory basis. You can also set flags globally using $HOME/.irmin/config.yml. Run irmin help irmin.yml for further details.

Also see irmin --help for a list of all commands and either irmin <command> --help or irmin help <command> for more help with a specific command.

Context

Irmin's initial design is directly inspired by XenStore, with:

In 2014, the first release of Irmin was announced as part of the MirageOS 2.0 release. Since then, several projects started using and improving Irmin. These can roughly be split into three categories:

  1. Use Irmin as a portable, structured key-value store (with expressive, mergeable types)
  2. Use Irmin as a distributed database (with customisable consistency semantics)
  3. Use Irmin as an event-driven dataflow engine.

Irmin as a portable and efficient structured key-value store

Irmin as a distributed store

Irmin as a dataflow scheduler

Issues

Feel free to report any issues using the GitHub bug tracker.

License

See the LICENSE file.

Acknowledgements

Development of Irmin was supported in part by the EU FP7 User-Centric Networking project, Grant No. 611001.