Jennica416 / STAT545-hw-nichols-jennica

Stats 545 Homework

Hw02 - Peer Review #4

Closed ghost closed 7 years ago

ghost commented 7 years ago

Hi @Jennica416,

This is my peer review of your Hw02. First, I noticed that your README.md file doesn't link to the md and Rmd files. As for the assignment tasks, you answered the questions in the Smell Test the Data section properly. There's another way to get the number of rows and columns of a data frame: dim() returns both at once as a vector of two elements (number of rows, then number of columns). I liked that you gave a concise description of how str() and summary() work; I should do that more often, thanks!
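As a minimal sketch of the dim() suggestion, assuming the gapminder package is installed (the variable names here are only illustrative):

```r
library(gapminder)

# dim() returns an integer vector: number of rows, then number of columns
dims <- dim(gapminder)

# nrow() and ncol() return the same two values one at a time
n_rows <- nrow(gapminder)
n_cols <- ncol(gapminder)

dims
```

Either approach works; dim() is just the more compact of the two.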

For the Explore Individual Variables section, summary() on the categorical variable continent gives the number of records per continent (one per country per recorded year), which is pretty useful. You can also get the number of distinct countries per continent with the following chunk of code:

gapminder %>% group_by(continent) %>% summarize(n = n_distinct(country))

This information can give you useful hints when checking the spread of your data by continent. I also liked that you provided concise interpretations of your plots in this section, which is always good practice.
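To see the two counts side by side, here is a small sketch that puts the record count and the distinct-country count in one table. It assumes the gapminder and dplyr packages; the column names n_records and n_countries are made up for illustration:

```r
library(gapminder)
library(dplyr)

continent_counts <- gapminder %>%
  group_by(continent) %>%
  summarize(
    n_records   = n(),                 # rows per continent, as summary() reports
    n_countries = n_distinct(country)  # distinct countries per continent
  )

continent_counts
```

Since every country in gapminder has the same number of yearly records, n_records is a fixed multiple of n_countries for each continent.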

For the Explore Various Plot Types section, I recommend using side-by-side boxplots instead of scatterplots when one of the variables is categorical (as in year versus lifeExp, where year can be treated as a category). This gives you a better idea of the spread of the data, and the observations for each continent won't overlap. If you want to use only points and lines, you could plot yearly averages per continent.
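Both suggestions could look roughly like this. This is only a sketch, assuming gapminder, dplyr (1.0 or later, for the .groups argument) and ggplot2; the object names and the choice to facet by continent are mine, not taken from the homework:

```r
library(gapminder)
library(dplyr)
library(ggplot2)

# Side-by-side boxplots: factor(year) makes ggplot2 treat year as a category,
# and one panel per continent keeps the continents from overlapping
box_plot <- ggplot(gapminder, aes(x = factor(year), y = lifeExp)) +
  geom_boxplot() +
  facet_wrap(~ continent) +
  labs(x = "year", y = "lifeExp")

# Points and lines only: average lifeExp per continent and year
yearly_means <- gapminder %>%
  group_by(continent, year) %>%
  summarize(mean_lifeExp = mean(lifeExp), .groups = "drop")

mean_plot <- ggplot(yearly_means,
                    aes(x = year, y = mean_lifeExp, colour = continent)) +
  geom_line() +
  geom_point()
```

Printing box_plot or mean_plot draws the figure. The boxplots keep the spread visible, while the averaged version trades that detail for a cleaner trend line.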

In terms of faceting, you could have included some interpretations for the scatterplots of gdpPercap versus lifeExp. By the way, that plot looks super cool!

Good job!

Cheers,

Alexi

mynamedaike commented 7 years ago

Hello @Jennica416,

  1. Smell test the data: You answered all the questions and found the number of variables and observations in more than one way. But I am confused by your answer to Question 2: your code is inconsistent with your conclusion. How did you know that gapminder's class is "tbl_df"?

  2. Explore individual variables: You explored one categorical variable and one quantitative variable. You used summary() to reach your conclusions and visualized them with figures.

  3. Explore various plot types: You used different plot types to explore two quantitative variables (a scatterplot), a single quantitative variable, and one quantitative variable against one categorical variable.

  4. Use filter(), select() and %>%: You used filter() and %>%, but did not use select().

  5. Bonus: You used more of the dplyr functions, such as group_by(), summarize(), and arrange().

  6. Report your process: You did not report your process.
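For points 1 and 4, something along these lines would show the class directly and combine filter() with select(). This is a sketch assuming gapminder and dplyr; the country and columns are picked arbitrarily for illustration:

```r
library(gapminder)
library(dplyr)

# Point 1: class() lists every class of the object, so "tbl_df" can be read
# off the output instead of being asserted
gap_class <- class(gapminder)
gap_class

# inherits() checks for one specific class
is_tibble <- inherits(gapminder, "tbl_df")

# Point 4: filter() keeps rows, select() keeps columns, %>% chains them
canada_lifeExp <- gapminder %>%
  filter(country == "Canada") %>%
  select(year, lifeExp)

canada_lifeExp
```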

Overall, you did a good job. Your repository is well organized. The markdown file is easy to find. You explored several plot types. You used more dplyr functions such as group_by(), summarize(), and arrange().