Notebook Pipeline Sandbox

Data analysis usually starts with exploring some data. To emulate this, we want to give the user a notebook along with some generated sample data.

Usage

Ideally, the user should be able to run something like the following at any time:

$ lernspark-play

This should open a Jupyter notebook pre-populated with some cells. In addition, lernspark-play should generate a small subset of data. This data lives locally on the machine and does not come from the cloud.
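As a rough illustration of what lernspark-play might do behind the scenes, here is a minimal Python sketch that writes a small local CSV file and a starter notebook using only the standard library. The column names, the file name data/sample.csv, and the helper names write_sample_data and build_notebook are assumptions made for this example; none of them come from the project itself.

```python
import csv
import json
import random
from pathlib import Path


def write_sample_data(path, rows=100, seed=0):
    """Write a small, reproducible CSV of made-up rows to a local file."""
    rng = random.Random(seed)  # fixed seed so every run yields the same data
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        # Hypothetical columns, chosen only for illustration.
        writer.writerow(["id", "category", "value"])
        for i in range(rows):
            writer.writerow(
                [i, rng.choice(["a", "b", "c"]), round(rng.uniform(0, 100), 2)]
            )
    return path


def build_notebook(data_path):
    """Return a minimal nbformat 4.4 notebook as a dict with starter cells."""

    def markdown_cell(source):
        return {"cell_type": "markdown", "metadata": {}, "source": source}

    def code_cell(source):
        return {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": source,
        }

    load_source = (
        "import csv\n"
        f"with open({str(data_path)!r}, newline='') as f:\n"
        "    rows = list(csv.DictReader(f))\n"
        "rows[:5]"
    )
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            markdown_cell("# Sandbox\nExplore the generated sample data below."),
            code_cell(load_source),
            code_cell("# Your experiments go here"),
        ],
    }


def main():
    data_path = write_sample_data("data/sample.csv")
    notebook = build_notebook(data_path.as_posix())
    Path("sandbox.ipynb").write_text(json.dumps(notebook, indent=1))


if __name__ == "__main__":
    main()
```

After running the script, the notebook could be opened with `jupyter notebook sandbox.ipynb`, assuming Jupyter is installed locally. A fixed random seed keeps the sample data identical between sessions, which may make experiments easier to compare.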

After the user has experimented for a while, the idea is that they save the pipeline.sql file from their session. The entire Apache Spark container application will be based on this pipeline.sql file.
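How pipeline.sql gets saved from a session is not specified here. One possible approach, sketched below in Python, is to collect every code cell that starts with a marker line and write the remaining cell text to pipeline.sql. The `%%sql` marker and the helper names extract_sql and save_pipeline are hypothetical conventions chosen for this example, not part of the project.

```python
import json
from pathlib import Path

# Hypothetical convention: a code cell whose first line is this marker
# is treated as part of the pipeline.
SQL_MARKER = "%%sql"


def extract_sql(notebook):
    """Collect the bodies of marked code cells into one SQL string."""
    statements = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # Notebook files may store cell source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        lines = source.splitlines()
        if lines and lines[0].strip() == SQL_MARKER:
            body = "\n".join(lines[1:]).strip()
            if body:
                statements.append(body)
    if not statements:
        return ""
    return "\n\n".join(statements) + "\n"


def save_pipeline(notebook_path, out_path="pipeline.sql"):
    """Read a saved notebook file and write its marked SQL cells to out_path."""
    notebook = json.loads(Path(notebook_path).read_text())
    out_path = Path(out_path)
    out_path.write_text(extract_sql(notebook))
    return out_path
```

For example, `save_pipeline("sandbox.ipynb")` would write pipeline.sql into the current directory. Cells are emitted in notebook order, so the resulting file reflects the order in which the user arranged their queries rather than the order in which they were executed.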

Requirements

lernspark-play.sh
sandbox.ipynb
data/[main|sql|model].rs

smohler / lernspark

Sandbox Notebook Pipeline Designer #12