UBOdin / mimir

Data-ish exploration through SQL+Uncertainty
http://mimirdb.info
Apache License 2.0

The Mimir Data-Ish Exploration Tool

One of the biggest costs in analytics is data wrangling: getting your messy, mis-labeled, disorganized data into a format that you can actually ask questions about. Unfortunately, most tools for data wrangling force you to do all of this work upfront — before you actually know what you even want to do with the data.

Mimir is about getting you to your analysis as fast as possible. It lets you harness the raw power of SQL, but also provides a ton of powerful language extensions.

Unlike most other SQL-based systems, Mimir lets you make decisions during and after data exploration. All of Mimir's functionality is based on three ideas: (1) Mimir provides sensible best-guess defaults, (2) Mimir warns you when one of its guesses is going to affect what it's telling you, and (3) Mimir lets you easily inspect what it's doing to your data with the ANALYZE query command.

Better still, you don't need any new infrastructure. Mimir attaches to ordinary relational databases through JDBC (we currently support SQLite, with SparkSQL and Oracle support in progress). If you don't care which database is used, Mimir just puts everything in a super portable SQLite database by default.

Quick-Start

Install with Homebrew

$> brew tap UBOdin/ubodin
$> brew install mimir
$> mimir --help

Manually download the JAR

Download the latest version of Mimir:

This is a self-contained jar. Run it with:

$> java -jar Mimir.jar
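If the jar fails to start, the usual culprits are a missing Java runtime or a wrong path to the download. The following is a minimal sketch of a wrapper that checks both before launching, assuming a POSIX shell. The function name `run_mimir` and its error messages are hypothetical and not part of Mimir; only the final `java -jar` invocation comes from the instructions above.

```shell
# Hypothetical helper (not shipped with Mimir): verify prerequisites,
# then launch the self-contained jar.
run_mimir() {
  # Path to the downloaded jar; defaults to Mimir.jar in the current directory.
  jar="${1:-Mimir.jar}"

  # The jar needs a Java runtime on PATH.
  if ! command -v java >/dev/null 2>&1; then
    echo "java not found on PATH" >&2
    return 1
  fi

  # Fail early with a clear message if the jar is not where we expect it.
  if [ ! -f "$jar" ]; then
    echo "jar not found: $jar" >&2
    return 1
  fi

  java -jar "$jar"
}
```

After defining the function in your shell, `run_mimir` launches `Mimir.jar` from the current directory, and `run_mimir /path/to/Mimir.jar` launches a jar stored elsewhere.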

Run with Docker

Install Docker and run the Docker image:

$> docker run -i -t docker.mimirdb.info/mimir-core
...

Link with SBT (or Maven)

Add the following to your build.sbt

resolvers += "MimirDB" at "http://maven.mimirdb.info/"
libraryDependencies += "info.mimirdb" %% "mimir" % "0.2-SNAPSHOT"

User Guides

Mimir adds some useful language features to SQL. See the MimirSQL Docs for more details, as well as the Lens and Adaptive Schema Docs for more information about Mimir's data cleaning components.

Compiling Mimir

To compile from source, check out Mimir and use one of the following to compile and run it.

$> git clone https://github.com/UBOdin/mimir.git
...
$> cd mimir
$> sbt run

OR

$> sbt assembly
...
$> ./bin/mimir

OR install Docker and use the Docker image:

$> docker run -i -t docker.mimirdb.info/mimir-core
...

Hacking on Mimir

Credit

Development of Mimir has been sponsored by