Per our discussion during today's lab, please see below the preliminary division of work for this milestone:
Please see Tiffany's requirements here. We need to have five scripts in total:
(1) A first script that downloads some data from the internet and saves it locally. We already have it from the last milestone.
(2) A second script that reads the data from the first script and performs any data cleaning/pre-processing, transforming, and/or partitioning that needs to happen before exploratory data analysis or modeling takes place.
(3) A third script which creates exploratory data visualization(s) and table(s) that are useful to help the reader/consumer understand that dataset.
(4) A fourth script that reads the data from the second script, performs some statistical or machine learning analysis, and summarizes the results as figure(s) and table(s).
(5) A fifth script: an .Rmd or .ipynb file that presents the key useful (not all!) exploratory data analysis as well as the statistical summaries and figures in a little report.
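For whoever wants a starting point for the partitioning part of (2), here is a rough sketch in Python using only the standard library. Everything in it is an assumption for illustration: the function names `split_rows` and `partition_csv`, the 25% test fraction, the seed, and the comma delimiter are not taken from the requirements, so adjust them to whatever we agree on.

```python
import csv
import random


def split_rows(rows, test_fraction=0.25, seed=123):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def partition_csv(in_path, train_path, test_path, test_fraction=0.25, seed=123):
    """Read a CSV with a header row and write train/test CSVs with the same header."""
    with open(in_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        # Skip completely empty rows as a minimal cleaning step.
        rows = [row for row in reader if any(cell.strip() for cell in row)]
    train, test = split_rows(rows, test_fraction, seed)
    for path, part in ((train_path, train), (test_path, test)):
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(part)
    return len(train), len(test)
```

A call would look like `partition_csv(raw_path, train_path, test_path)`, where the three paths are placeholders for wherever the first script saves the data and wherever we decide to keep the train and test files.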
Division of work:
Mark: (1), (2) and (3)
Jiacheng: (4)
Josh and Elina: (5)
Mark will finish the drafts for (1), (2) and (3) by Thursday morning Vancouver time, so that the other team members can carry on with their parts.
Hi everyone, I have created a pull request. Please review and accept it. You should be able to run your analyses on the new winequality-train.csv file in the data folder.
Hello everyone,
Please reply with what you think.