Hello @Jennica416,
Smell test the data
You answered all the questions and got the number of variables and observations in more than one way. But I am confused about your answer to Question 2: your code is inconsistent with your conclusion. How did you know that gapminder's class is "tbl_df"?
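For reference, here is a minimal sketch of how the class and the dimensions could be checked directly. It assumes the gapminder package is installed and is not meant to reproduce the code in the submission.

```r
library(gapminder)

# class() returns the full class vector; gapminder is a tibble,
# so "tbl_df" is listed together with "tbl" and "data.frame"
class(gapminder)

# inherits() answers the yes/no question for a single class name
is_tibble_class <- inherits(gapminder, "tbl_df")

# Observations and variables, obtained separately and in one call
n_obs <- nrow(gapminder)
n_vars <- ncol(gapminder)
dims <- dim(gapminder)  # c(number of rows, number of columns)
```

Showing the output of class() next to the written answer would make the conclusion for Question 2 easy to verify.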
Explore individual variables
You explored one categorical variable and one quantitative variable. You used summary() to reach your conclusions and visualized them with figures.
Explore various plot types
You explored a scatterplot of two quantitative variables, a plot of one quantitative variable, and a plot of one quantitative and one categorical variable, using different plot types.
Use filter(), select() and %>%
You used filter() and %>%, but did not use select().
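A minimal sketch of how select() could be combined with filter() and %>%, assuming the gapminder and dplyr packages are installed. The country and the columns are arbitrary choices for illustration, not taken from the submission.

```r
library(gapminder)
library(dplyr)

# filter() keeps the rows for one country (illustrative choice),
# select() then keeps only three of the six columns
canada <- gapminder %>%
  filter(country == "Canada") %>%
  select(year, lifeExp, gdpPercap)

canada
```

Any other subset would work the same way; the point is only that select() narrows the columns after filter() narrows the rows.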
Bonus
You used more of the dplyr functions, such as group_by(), summarize(), and arrange().
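As an illustration of how these three functions fit together, here is a hedged sketch (not the code from the submission), assuming gapminder and dplyr are installed. The summary columns n_countries and mean_lifeExp are names made up for this example.

```r
library(gapminder)
library(dplyr)

# One row per continent: number of distinct countries and mean life
# expectancy, sorted from highest to lowest mean
continent_summary <- gapminder %>%
  group_by(continent) %>%
  summarize(
    n_countries = n_distinct(country),
    mean_lifeExp = mean(lifeExp)
  ) %>%
  arrange(desc(mean_lifeExp))

continent_summary
```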
Report your process
You did not report your process.
Overall, you did a good job. Your repository is well organized. The markdown file is easy to find. You explored several plot types. You used more dplyr functions, such as group_by(), summarize(), and arrange().
Hi @Jennica416,
This is my peer review for your Hw02. First, I noticed that your README.md file doesn't provide the respective links to the md and Rmd files. In terms of the assignment tasks, you properly answered the questions posed in section Smell Test the Data. There's another way of getting the number of rows and columns in a given data frame (function dim() does the trick all at once, by providing a vector of two elements: number of rows and number of columns). I liked that you provided a concise description of how functions str() and summary() work; I have to do that more often, thanks!

For section Explore Individual Variables, summary() for the categorical variable continent provides the number of yearly records per continent, which is pretty useful. You can also count the distinct countries in each continent by using the following chunk of code:
gapminder %>% group_by(continent) %>% summarize(n = n_distinct(country))
This information could give you useful hints when checking the spread in your data by continent. I certainly liked that you provided concise interpretations for your plots in this section, which is always advisable.

For section Explore Various Plot Types, I recommend using side-by-side boxplots instead of scatterplots when one of the variables is categorical (as in year versus lifeExp); this will give you a better idea of the data spread, and the observations per continent won't overlap. If you want to use only points and lines, you could opt for yearly averages per continent.

In terms of faceting, you could have included some interpretations for the scatterplots of gdpPercap versus lifeExp. By the way, that plot looks super cool!

Good job!
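To make the plotting suggestions concrete, here is a rough sketch under a few assumptions: gapminder, dplyr (version 1.0 or later, for the .groups argument) and ggplot2 are installed, and year is treated as a factor so that each year gets its own box. The aesthetic mappings are my guess at a reasonable layout, not a reconstruction of your plots.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

# Side-by-side boxplots: one box per year, one panel per continent,
# so the observations no longer sit on top of each other
p_box <- ggplot(gapminder, aes(x = factor(year), y = lifeExp)) +
  geom_boxplot() +
  facet_wrap(~ continent)

# Points-and-lines alternative: mean life expectancy per continent and year
yearly_avg <- gapminder %>%
  group_by(continent, year) %>%
  summarize(mean_lifeExp = mean(lifeExp), .groups = "drop")

p_avg <- ggplot(yearly_avg,
                aes(x = year, y = mean_lifeExp, colour = continent)) +
  geom_point() +
  geom_line()
```

Calling print(p_box) or print(p_avg) draws the figures; in an Rmd chunk, typing the object name on its own line is enough.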
Cheers,
Alexi