alexanderjfink / miner

Script for downloading, unpacking, and converting public and open data into your preferred local format
http://alexanderjfink.github.io/miner
GNU General Public License v2.0

miner

Script for downloading, unpacking, and converting online and public datafiles (with other kinds to be added soon) to open formats

The goal of this app is to develop a bash interface, modeled after Homebrew, that gives anyone free and open access to data available on the web. The hope is to make three things easy:

  1. Search sources of data available online
  2. Download data and load it into a database of your choice in a common format
  3. Liberate public data by making it open data on your computer, so you can use it in any format rather than a proprietary one (e.g. the US Census, which uses Access)
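As an illustration of the "unpack and convert" part of these goals, here is a minimal sketch using only the Python standard library. It is not miner's actual code: the function names, the file name `population.csv`, the table layout, and the sample values are all hypothetical placeholders, and the download step is left out so the sketch runs offline.

```python
import csv
import io
import sqlite3
import zipfile


def load_csv_from_zip(archive, member, conn, table):
    """Unpack one CSV member from a zip archive and load it into a SQLite table.

    Hypothetical helper for illustration only; not part of miner.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as raw:
            reader = csv.reader(io.TextIOWrapper(raw, encoding="utf-8", newline=""))
            header = next(reader)  # first row holds the column names
            rows = list(reader)
    # Quote identifiers so odd column names do not break the SQL.
    columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)
    quoted_table = '"{}"'.format(table.replace('"', '""'))
    conn.execute("CREATE TABLE {} ({})".format(quoted_table, columns))
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quoted_table, placeholders), rows
    )
    conn.commit()
    return len(rows)


def make_demo_archive():
    """Build a tiny in-memory zip with placeholder data (not real census figures)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("population.csv", "state,population\nmn,100\nnd,200\n")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    count = load_csv_from_zip(make_demo_archive(), "population.csv", conn, "population")
    print("loaded", count, "rows")
```

SQLite is used here only because it ships with Python; the same rows could be handed to any other database driver.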

installation

If you have pip installed:

$pip install miner

Eventually I will add support for Homebrew and the R package manager.

usage

Please note: miner is in pre-alpha, and these commands are meant only as a preview of future functionality (chances are good they are currently buggy).

Searching

$miner search (or dig) <dataset name> [OPTIONAL: subset name]

Example

$miner search (or dig) uscensus2010
$miner search (or dig) minnesotapublicschools

Describing (a dataset)

$miner describe (or assay) <dataset name> [OPTIONAL: subset name]

Example

$miner describe (or assay) uscensus2010
$miner describe (or assay) uscensus2010 nd

Installing

$miner install (or extract) <dataset name> [OPTIONAL: subset name]

Example

$miner install (or extract) uscensus2010
$miner install (or extract) uscensus2010 mn
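Each command above has a mining-themed alias (search/dig, describe/assay, install/extract) and takes a dataset name plus an optional subset. One way such an interface could be wired up is with `argparse` subcommand aliases. This is a sketch under that assumption, not miner's actual implementation; `build_parser` and the `canonical` attribute are hypothetical names.

```python
import argparse

# Command names and aliases as listed in the usage section.
COMMANDS = (("search", "dig"), ("describe", "assay"), ("install", "extract"))


def build_parser():
    """Build a parser in which each command also answers to its alias."""
    parser = argparse.ArgumentParser(prog="miner")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.required = True
    for name, alias in COMMANDS:
        sub = subparsers.add_parser(name, aliases=[alias])
        sub.add_argument("dataset", help="dataset name, e.g. uscensus2010")
        sub.add_argument("subset", nargs="?", default=None,
                         help="optional subset name, e.g. mn")
        # Record the canonical command so aliases dispatch the same way.
        sub.set_defaults(canonical=name)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.canonical, args.dataset, args.subset)
```

With this layout, `dig uscensus2010` and `search uscensus2010` both resolve to the canonical command `search`, so the alias never has to be handled separately downstream.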

development

Testing: $nosetests

miner & dat

What is the difference between miner and dat? We don't see miner as a competitor to dat. Rather, miner is a parallel and complementary project. Here is what we see as the differences:

One way to look at it is that miner exists for today's world of datasets that follow no common standard, carry mixed licenses, and are hosted by individuals and organizations. dat could be seen as the forerunner of the open knowledge / open data world.