datopian / data-cli

data - command line tool for working with data, Data Packages and the DataHub
http://datahub.io/docs/features/data-cli

[epic] MVP in JS #1

Closed by rufuspollock 7 years ago

rufuspollock commented 7 years ago

Minimal Node.js-based datahub-cli

Command Line

The command line is our MVP. It is also the main tool for our primary audience of power data wranglers: for those familiar with it, the command line offers power, speed, and simplicity.

Approach: mechanisms over policies => push and get work on local files (or even stdin and stdout), with no need to package the data locally first.
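To make the "mechanisms over policies" idea concrete, here is a minimal Node.js sketch of how a push command might treat a local file and stdin interchangeably. The function names and the convention that a missing path or `-` means stdin are assumptions for illustration, not part of the actual CLI.

```javascript
const fs = require('fs');

// Decide where pushed data comes from.
// Assumed convention (hypothetical): no path, or "-", means "read from stdin".
function resolveSource(pathArg) {
  if (pathArg === undefined || pathArg === '-') {
    return { kind: 'stdin' };
  }
  return { kind: 'file', path: pathArg };
}

// Open the chosen source as a readable stream, so the upload code
// never needs to know whether the data came from a file or a pipe.
function openSource(source) {
  return source.kind === 'stdin'
    ? process.stdin
    : fs.createReadStream(source.path);
}
```

With this shape, the same upload code could serve both `cat xyz.csv | data push` and a push from a named file, since it only ever sees a stream.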

Concepts

Sketch of Command Line interface


# push current directory (must have datapackage.json)
data push

# retrieve to current directory
data get publisher/package

data config

## ======================
# extras

# ls my packages (for my default publisher)
data ls

data info {pkgid}

# user and default publisher
data whoami

## ======================
## data wrangling stuff (later)

# see datapipes
data pipe cut {file} 
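
The command sketch above could be wired up in Node.js roughly as follows. This is only a sketch: the handler bodies are hypothetical stubs that return a description of what they would do, and the `run` function is an assumed name, not the real implementation.

```javascript
// Hypothetical stub handlers, one per command in the sketch.
// Each returns a description instead of doing real work.
const commands = {
  push: () => 'push current directory',
  get: ([pkgId]) => `get ${pkgId}`,
  config: () => 'configure',
  ls: () => 'list my packages',
  info: ([pkgId]) => `info ${pkgId}`,
  whoami: () => 'show user and default publisher',
};

// Look up the first argument as a command name and pass the rest on.
function run(argv) {
  const [name, ...rest] = argv;
  // Use hasOwnProperty so inherited keys like "toString" are not
  // mistaken for commands.
  if (!Object.prototype.hasOwnProperty.call(commands, name)) {
    throw new Error(`unknown command: ${name}`);
  }
  return commands[name](rest);
}
```

A real entry point would call something like `run(process.argv.slice(2))` and print the result or the error message.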

Acceptance Criteria

Tasks

Prep and Planning

Implementation

Nice to have:


Appendix

JS vs Python - SCQA

What's missing in the current command line

Situation: we have a Python-based CLI that works.

Complication: the Python CLI does not cover these new features and has certain drawbacks, such as being hard to install reliably across platforms.

Question: should we extend the Python CLI or build a new CLI in JS?

Why use JS:

Old Analysis

## push ==============

# stdin - to your scratchpad data package
cat xyz.csv | datahub push

# stdin to data package
cat xyz.csv | datahub push my-data-package

# explicit path (less important!)
datahub push xyz.csv my-data-package

## get  ==============
# by default simply outputs to stdout
datahub get {publisher}/{dataset}/{name}[.extension]

# can also download to the current directory or a given path (or even to stdout)
datahub get {publisher}/{dataset}/{name}[.extension] {dir or path}
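
As an illustration of the argument forms in this old analysis, the following Node.js sketch interprets the three `push` variants and splits a `get` target of the form `{publisher}/{dataset}/{name}[.extension]`. The function names and the returned object shapes are assumptions made for this example, and `null` is used here to stand for "stdin" or "scratchpad data package".

```javascript
// Interpret the arguments that follow "datahub push".
//   []              -> stdin to the scratchpad data package
//   [pkg]           -> stdin to the named data package
//   [file, pkg]     -> explicit file to the named data package
function parsePushArgs(args) {
  if (args.length === 0) return { file: null, dataPackage: null };
  if (args.length === 1) return { file: null, dataPackage: args[0] };
  if (args.length === 2) return { file: args[0], dataPackage: args[1] };
  throw new Error('usage: datahub push [file] [data-package]');
}

// Split "publisher/dataset/name[.extension]" into its parts.
function parseGetTarget(target) {
  const parts = target.split('/');
  if (parts.length !== 3 || parts.some((p) => p === '')) {
    throw new Error(
      `expected publisher/dataset/name[.extension], got "${target}"`
    );
  }
  const [publisher, dataset, file] = parts;
  const dot = file.lastIndexOf('.');
  // A leading dot (index 0) is treated as part of the name, not an extension.
  const hasExt = dot > 0;
  return {
    publisher,
    dataset,
    name: hasExt ? file.slice(0, dot) : file,
    extension: hasExt ? file.slice(dot + 1) : null,
  };
}
```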

Comments:

Config wanted:

username
token
server
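
One possible way to handle these three settings is a small JSON config that is validated on load. The sketch below assumes the config is stored as JSON text with exactly the keys listed above; the storage format and the `parseConfig` name are assumptions for illustration, not something this issue specifies.

```javascript
// The three settings listed as wanted in this issue.
const REQUIRED_KEYS = ['username', 'token', 'server'];

// Parse JSON config text and check that every wanted key is a
// non-empty string. Unknown extra keys are dropped.
function parseConfig(jsonText) {
  const parsed = JSON.parse(jsonText);
  if (parsed === null || typeof parsed !== 'object') {
    throw new Error('config must be a JSON object');
  }
  const missing = REQUIRED_KEYS.filter(
    (key) => typeof parsed[key] !== 'string' || parsed[key] === ''
  );
  if (missing.length > 0) {
    throw new Error(`missing config values: ${missing.join(', ')}`);
  }
  const { username, token, server } = parsed;
  return { username, token, server };
}
```

Failing early with the list of missing keys would let a command such as `data config` tell the user exactly what still needs to be set.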

rufuspollock commented 7 years ago

FIXED. There is always more to do, but I think this can be considered done as of the last sprint, since all core commands are implemented.