Open CassKon opened 6 years ago
Peer Review:

Report:
`geom_boxplot()`
`ddply` seems to be a very convenient function. It would be nicer if you call `summaryStats` and show the output in the report.

Scripts and makefile:
`gapminderData<-read.delim("gapminder.tsv")` in the 01_filter_reorder script is more intuitive than putting it in the download-data script, just to make things more coherent.

Hello @CassKon, here are some comments on your hw07. Hope it helps!
You clearly understood the homework topic: how to automate a data-analysis pipeline.
In your `01_filter_reorder.R` file, you could use `geom_boxplot()` instead of `stat_summary()`. `geom_boxplot()` shows the median, the first and third quartiles, and whiskers that reach the most extreme values within 1.5 × IQR of the box, with anything beyond that drawn as outlier points; it does not show the mean. I guess you wanted to show the mean values because you reordered continent by the mean of lifeExp and might have wanted to compare this with the mean values of gdpPercap.
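To make the difference concrete, here is a minimal sketch that overlays the mean on a boxplot, so both summaries are visible at once. The data frame `toy` and its values are invented for illustration only (the column names simply mirror Gapminder), and this is not necessarily how the original script draws its plot.

```r
library(ggplot2)

# Toy data with made-up values; only the column names mirror Gapminder.
toy <- data.frame(
  continent = rep(c("A", "B"), each = 5),
  lifeExp   = c(40, 45, 50, 55, 90, 60, 62, 64, 66, 68)
)

# geom_boxplot() draws the median and quartiles. In group A the value 90
# lies more than 1.5 * IQR above the upper quartile, so it is drawn as an
# outlier point and the upper whisker stops short of the maximum.
# stat_summary() then adds the group mean as a red diamond on top.
p <- ggplot(toy, aes(x = continent, y = lifeExp)) +
  geom_boxplot() +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 4, colour = "red")

print(p)
```

In group A the mean sits above the median because of the single large value, which is exactly the kind of gap a boxplot alone would hide.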
I like the way you analyzed the data using life expectancy scaled by population. Summary statistics, some description, or a title would supplement the plots.
In your `02_aggregate.R` file, I didn't know the function `ddply()`. It was good to know! You combined this with a customized function to print out all the regression results for each country.
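For readers who have not met `ddply()` either, the pattern described above might look roughly like the sketch below. The helper name `fit_country` and the toy values are hypothetical and are not taken from `02_aggregate.R`; the real script would read its rows from the Gapminder file.

```r
library(plyr)

# Toy data with made-up values, standing in for the Gapminder rows.
toy <- data.frame(
  country = rep(c("X", "Y"), each = 3),
  year    = rep(c(2000, 2005, 2010), times = 2),
  lifeExp = c(50, 55, 60, 70, 71, 72)
)

# Hypothetical helper: fit lifeExp ~ year for one country's rows and
# return the two coefficients as a one-row data frame.
fit_country <- function(df) {
  fit <- lm(lifeExp ~ year, data = df)
  data.frame(
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2])
  )
}

# ddply() splits the data frame by country, applies fit_country() to each
# piece, and row-binds the results into a single data frame.
fits <- ddply(toy, "country", fit_country)
print(fits)
```

The result has one row per country, with the grouping column kept alongside whatever columns the helper returns.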
It seems like your `Best`, `Worst`, and `Top` tables, arranged by `slope` value, are all the same. I think you probably forgot to delete `Top` and `desc()`.
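One way tables like these can end up identical is when each of them sorts in the same direction. The sketch below shows the two directions side by side using `arrange()` and `desc()` from plyr; the data frame `fits` and its slope values are invented for illustration, and the actual script may be organized differently.

```r
library(plyr)

# Made-up slopes, standing in for the per-country regression results.
fits <- data.frame(
  country = c("W", "X", "Y", "Z"),
  slope   = c(0.1, 0.9, 0.4, 0.6)
)

# Largest slopes first: wrap the column in desc().
best <- head(arrange(fits, desc(slope)), 2)

# Smallest slopes first: plain arrange() sorts in ascending order.
worst <- head(arrange(fits, slope), 2)

print(best)
print(worst)
```

If `desc()` is left in both calls, `best` and `worst` contain the same rows, which would match the symptom described above.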
You did a good job with making customized functions!
I think it would be better if you could facet by fewer countries, or just by continent.
Your `Makefile.R` works well for automating the data-analysis pipeline. It would be good if you could also try an alternative way, using a `Makefile` to run the pipeline.
It would be a little better if you could include the `Rmd` file we usually use for assignments and see how the pipeline renders the RMarkdown report.
Overall, I think you understood how to automate the pipeline and provided some analysis. You did a good job! Hope this comment helps.
Here is my homework 7!