I just wanted to expand on why you lost a mark for writing "log(GDPpercap)" in the axis titles of your plots. I only took 1 mark off, even though it appears in both plots, because I think it is an easy mistake to make.
Writing "log(GDPpercap)" as the axis title suggests that the tick labels are logged values, for example that a tick labelled $1000 means log(GDPpercap) = 1000, which isn't the case. The scale itself is logged, but the tick labels show the original, non-logged values. A more accurate title would be just "GDP per capita ....", with a note somewhere in the text that the scale is logged.
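To make the distinction concrete, here is a minimal sketch, assuming your plot used something like `scale_x_log10()`. The data frame `toy` and its column names are hypothetical stand-ins, not your actual data.

```r
library(ggplot2)

# Hypothetical toy data standing in for the assignment's data set.
toy <- data.frame(
  gdpPercap = c(500, 1000, 5000, 10000, 40000),
  lifeExp   = c(45, 52, 63, 71, 80)
)

# Logged scale: positions are log-transformed, but the tick labels stay in
# the original units, so the title should name the original variable.
p <- ggplot(toy, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  scale_x_log10() +
  labs(x = "GDP per capita (log scale)", y = "Life expectancy (years)")

# Logged values: the data themselves are transformed (here they span roughly
# 2.7 to 4.6), so a title of "log10(GDP per capita)" would be accurate.
p_logged <- ggplot(toy, aes(x = log10(gdpPercap), y = lifeExp)) +
  geom_point() +
  labs(x = "log10(GDP per capita)", y = "Life expectancy (years)")
```

Printing `p` and `p_logged` side by side should show the same point pattern with different tick labels, which is why the two cases need different axis titles.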
You wrote: "most data points are concentrated at the lower 75% quantile for Africa,"
By definition of a quantile (and of a boxplot), about 75% of the data points lie at or below the 75% quantile for every continent, so this is not something specific to Africa.
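A quick way to check this for yourself is the sketch below, which uses simulated values rather than your data; the vectors `skewed` and `symmetric` and the helper `share_below_q3` are made up for illustration.

```r
set.seed(1)  # reproducible toy data

# Hypothetical values for two groups with very different shapes.
skewed    <- rexp(1000, rate = 1 / 50)        # long right tail
symmetric <- rnorm(1000, mean = 60, sd = 10)  # roughly bell-shaped

# Share of observations at or below the group's own 75% quantile.
share_below_q3 <- function(x) mean(x <= quantile(x, 0.75))

share_below_q3(skewed)
share_below_q3(symmetric)
```

Both calls should return 0.75 regardless of how the values are distributed, so the share of points below the 75% quantile cannot distinguish one continent from another. What does differ between groups is where that quantile sits and how tightly the points cluster below it.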
Your ggplot2 code has some redundancy:
# too verbose
ggplot(aes(x = year, y = mean.lifeexp, group = continent)) +
  geom_line(aes(color = continent)) +
  geom_point(aes(color = continent))

# aesthetics set in ggplot() are inherited by the layers added to it
ggplot(aes(x = year, y = mean.lifeexp, color = continent)) +
  geom_line() +
  geom_point()
Hi Jingyiran,
Great work!