UBC-MDS / DSCI_522_Group304

DSCI 522 Group 304 Project - Are There Differences in FSA Scores Between Subgroups?
MIT License

Milestone 2: Script 2 (Data Cleaning / Pre-processing etc) #30

Closed annychih closed 4 years ago

annychih commented 4 years ago

From Milestone 2:

  1. A second script that reads the data from the first script and performs any data cleaning/pre-processing, transforming, and/or partitioning that needs to happen before exploratory data analysis or modeling takes place. This should take at least two arguments:
    • a path/filename pointing to the data to be read in
    • a path/filename pointing to where the cleaned/processed/transformed/partitioned data should live

annychih commented 4 years ago

The Milestone 2 instructions say we need to use both languages, so at least one script has to be in Python instead of R. Since we have to edit the data cleaning process a bit anyway, I think this is the script we should write in Python.

annychih commented 4 years ago

This script is done and lives in the src folder as clean_data.py. It satisfies all the requirements noted above.
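
For anyone reading this issue later, here is a minimal sketch of the two-argument interface the milestone asks for. It is not the actual contents of clean_data.py: the use of argparse, the argument names input_path and output_path, the assumption that the data is a rectangular CSV, and the cleaning rule itself (trim whitespace, drop rows with an empty cell) are all hypothetical and only there to illustrate the shape of such a script.

```python
import argparse
import csv


def clean_rows(rows):
    """Hypothetical cleaning rule: trim each cell, drop rows with an empty cell."""
    cleaned = []
    for row in rows:
        # csv.DictReader fills missing trailing cells with None, so treat None as "".
        stripped = {key: (value or "").strip() for key, value in row.items()}
        if all(stripped.values()):
            cleaned.append(stripped)
    return cleaned


def main(argv=None):
    # The two required arguments from the milestone: where to read, where to write.
    parser = argparse.ArgumentParser(description="Clean a raw CSV file.")
    parser.add_argument("input_path", help="path/filename of the data to read in")
    parser.add_argument("output_path", help="path/filename for the cleaned data")
    args = parser.parse_args(argv)

    # Read the raw data (assumed: a CSV with a header row and no over-long rows).
    with open(args.input_path, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        rows = clean_rows(reader)

    # Write the cleaned data with the same columns, in the same order.
    with open(args.output_path, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

To run something like this from the command line, the file would end with the usual `if __name__ == "__main__": main()` guard and be called with the input path first and the output path second. Taking argv as a parameter also lets main be called directly from a test with a list of two paths.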