sfbrigade / datasci-sba

Solving problems with the Small Business Administration

Running pipeline locally #40

Closed VincentLa closed 7 years ago

VincentLa commented 7 years ago

This PR does two main things:

  1. Cleans up the parser code so that it uses relative paths to the raw data files instead of absolute paths. This makes the pipeline easier to run on any machine (instead of just my own).
  2. Adds much more documentation on how to run the pipeline locally and how to set up your local DB with production data.
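
As a rough illustration of the first point, here is a minimal sketch of resolving a raw data file relative to the source file rather than through a machine-specific absolute path. The function name `raw_data_path`, the `data/raw` directory, and the two-levels-up layout are assumptions made for this example, not the repository's actual structure.

```python
from pathlib import Path


def raw_data_path(filename, anchor_file, levels_up=2):
    """Build a path to a raw data file relative to a source file.

    Hypothetical layout: the anchor file sits `levels_up` directories
    below the repository root, and raw data lives in <root>/data/raw/.
    """
    # Resolve the anchor so the result does not depend on the current
    # working directory, then walk up to the assumed repository root.
    repo_root = Path(anchor_file).resolve().parents[levels_up]
    return repo_root / "data" / "raw" / filename
```

A parser module would typically call this as `raw_data_path("some_file.csv", __file__)`, so the same code works on any checkout regardless of where the repository was cloned.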

VincentLa commented 7 years ago

@avdonovan I added much more documentation here on getting set up locally and running pipeline_runner.py.

VincentLa commented 7 years ago

OK, I think this is ready to go now.

makfan64 commented 7 years ago

Looks good to me. Any comments from @avdonovan? EDIT: oops, when I refreshed, there they were!

VincentLa14 commented 7 years ago

@avdonovan I reviewed your changes, and they look good to me. In addition, I added a run_parse argument to the command-line interface for pipeline_runner. I think this addresses your concerns about ad hoc commenting out of files and such.
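
For readers unfamiliar with the pattern, a command-line switch like this might look roughly like the sketch below. Only the name `run_parse` comes from this thread; the `--run_parse` flag spelling, the `store_true` behavior, and the step names are assumptions for illustration and may differ from the actual pipeline_runner.py.

```python
import argparse


def build_parser():
    """Create the CLI parser (illustrative sketch, not the real runner)."""
    parser = argparse.ArgumentParser(description="Run the data pipeline.")
    # Assumed form: an opt-in boolean flag, off by default.
    parser.add_argument(
        "--run_parse",
        action="store_true",
        help="Also run the raw-data parsing step before the rest of the pipeline.",
    )
    return parser


def plan_steps(argv=None):
    """Return the ordered list of (hypothetical) steps to execute."""
    args = build_parser().parse_args(argv)
    steps = []
    if args.run_parse:
        steps.append("parse")  # hypothetical step name
    steps.append("queries")  # hypothetical step that always runs
    return steps
```

With a flag like this, skipping the parse step is a matter of omitting `--run_parse` on the command line instead of commenting code in and out.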

makfan64 commented 7 years ago

I like the change to pipeline_runner.

VincentLa14 commented 7 years ago

Thanks all!