ttimbers / data_analysis_pipeline_eg-archive

Building a Data Analysis pipeline tutorial

adapted from Software Carpentry

This example data analysis project counts the occurrences of every word in four novels and reports the 10 most frequent words in each book.
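To make the core step concrete, here is a minimal Python sketch of a per-book word count. It is an illustration only, not the project's actual scripts: the function name `top_words` and the tokenization rule (lowercased runs of letters and apostrophes) are assumptions made for this example.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent words in text as (word, count) pairs.

    Hypothetical helper for illustration; the tokenization rule below
    is an assumption, not the rule used by the project's scripts.
    """
    # Lowercase the text and treat runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


sample = "the whale and the sea and the ship"
print(top_words(sample, n=2))  # [('the', 3), ('and', 2)]
```

Applied to the full text of one novel with the default `n=10`, the same function would produce the kind of top-10 list the report summarizes.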

Usage:

There are two suggested ways to run this analysis:

1. Using Docker

Note: the instructions in this section assume a Unix shell. If you are using Windows Command Prompt, replace $(pwd) with PATH_ON_YOUR_COMPUTER.

  1. Install Docker
  2. Download/clone this repository
  3. Use the command line to navigate to the root of this downloaded/cloned repo
  4. Type the following:
docker run --rm -v $(pwd):/home/rstudio/data_analysis_eg ttimbers/data_analysis_pipeline_eg make -C /home/rstudio/data_analysis_eg all
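If the `$(pwd)` substitution is not available in your shell, one possible workaround is to build the same command in Python, where the current directory is resolved by the standard library. This is a sketch under assumptions: the helper name `docker_command` is hypothetical, while the image name, mount point, and make target are copied from the command above.

```python
import shlex
from pathlib import Path

# Values taken from the docker command in this README.
IMAGE = "ttimbers/data_analysis_pipeline_eg"
CONTAINER_DIR = "/home/rstudio/data_analysis_eg"


def docker_command(repo_root=None):
    """Build the docker run argument list (hypothetical helper).

    repo_root defaults to the current working directory, which plays
    the role of $(pwd) in the shell version of the command.
    """
    host_dir = Path(repo_root) if repo_root is not None else Path.cwd()
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir.resolve()}:{CONTAINER_DIR}",
        IMAGE,
        "make", "-C", CONTAINER_DIR, "all",
    ]


# Print the command instead of running it, so it can be inspected first.
print(shlex.join(docker_command()))
```

Passing the returned list to `subprocess.run(..., check=True)` from the root of the repository would then run the analysis, assuming Docker is installed and on your PATH.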

2. After installing all dependencies (does not depend on Docker)

  1. Clone this repo, and using the command line, navigate to the root of this project.
  2. To run the analysis, type the following command:
make all
  3. To reset/undo the analysis, type the following command:
make clean

Dependencies

The tutorials for this example can be found here: