KateJohnson / STAT545-hw-Johnson-Kate


Homework 004 is ready for grading #3

Open KateJohnson opened 6 years ago

mylinhthibodeau commented 6 years ago

Dear @KateJohnson,

Excellent homework 4!

Peer review

I really liked your homework, as it was very tidy. If I could make one small recommendation, it would be to include some of the "failed attempts" you made. Showing clearly what has already been tried to resolve a challenge can help generate hypotheses about what to try next.

Additional information

  1. I had the same problem as you when trying to label groups on the spread version of my data, and I concluded that the most efficient way was to use the long version of the table.

For example, in your case, this code would label the country appropriately in a different colour and the legend would accompany your graph:

ggplot(gap.long, aes(x = year, y = lifeExp, colour = country)) +
  geom_point() +
  xlab("Year") + ylab("Life Expectancy") + theme_bw()

If you found out how to label by "columns" (country) using your gap.spread table format, please do let me know !!
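In case it is useful, here is a minimal sketch of going from a spread table to the long table before plotting. It assumes that gap.spread has one row per year and one column per country; the small data frame below is a made-up stand-in with placeholder values, not the real data. It uses `pivot_longer()` from a recent tidyr; `gather()` would be the older equivalent.

```r
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for gap.spread: one row per year, one column per country.
# The values are placeholders, not real life expectancies.
gap.spread <- data.frame(
  year   = c(1952, 1957, 1962),
  Canada = c(60, 62, 64),
  Mexico = c(50, 52, 54)
)

# Collect every column except year into country/lifeExp pairs.
gap.long <- pivot_longer(
  gap.spread,
  cols      = -year,
  names_to  = "country",
  values_to = "lifeExp"
)

# With the long table, colour = country gives one colour per country and a legend.
p <- ggplot(gap.long, aes(x = year, y = lifeExp, colour = country)) +
  geom_point() +
  xlab("Year") + ylab("Life Expectancy") + theme_bw()
```

Printing `p` draws the plot. The reshaping step is what makes the labelling work: ggplot2 maps a single column to colour, so the country names have to live in one column rather than in the column headers.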

  2. I tried to look into how to match columns according to a substring, but did not succeed.

I was able to solve the "Congo" problem using the fuzzyjoin package, as exemplified here, because "Congo" (conco dataset) is a substring of "Congo, Dem. Rep." (gap.le dataset)

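library(dplyr)      # provides %>%
library(fuzzyjoin)
# Keep every row of gap.le; match where gap.le$country contains conco$country as a regex
conco_gap.le <- gap.le %>% regex_left_join(conco, by = "country")
View(conco_gap.le)  # inspect the joined table interactively
dim(conco_gap.le)   # more rows than gap.le means some countries matched twice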

However, it created a new problem: "United States of America" (conco dataset) is not a substring of "United States" (gap.le dataset), so the United States was not matched :( Also, "Guinea" is a substring of "Equatorial Guinea", so both names matched, each with a different iso.3c, and this duplicated the data entries for Guinea in the gap.le table.
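One possible workaround, sketched here under assumptions rather than tested on the real tables, is to rename the handful of mismatched countries first and then do an ordinary exact join, which cannot match "Guinea" to "Equatorial Guinea". The two small data frames and the `name_fixes` lookup below are hypothetical stand-ins; the real conco and gap.le tables would need their own list of fixes.

```r
library(dplyr)

# Hypothetical miniature versions of the two tables (lifeExp values are placeholders).
gap.le <- data.frame(
  country = c("United States", "Guinea", "Equatorial Guinea"),
  lifeExp = c(1, 2, 3),
  stringsAsFactors = FALSE
)
conco <- data.frame(
  country = c("United States of America", "Guinea", "Equatorial Guinea"),
  iso.3c  = c("USA", "GIN", "GNQ"),
  stringsAsFactors = FALSE
)

# Hand-written lookup: name used in conco = name used in gap.le.
name_fixes <- c("United States of America" = "United States")

# Rename the mismatched countries; names not listed are left unchanged.
conco_fixed <- conco %>%
  mutate(country = recode(country, !!!name_fixes))

# Exact join: each gap.le row matches at most one conco row.
conco_gap.le <- left_join(gap.le, conco_fixed, by = "country")

dim(conco_gap.le)
```

The trade-off is that the lookup has to be maintained by hand, but the join stays exact, so the row count of gap.le is preserved and any country still missing a code shows up as an NA in iso.3c.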

Once again, if you do find the solution for this, please pretty please let me know, because I would really like to know how to solve this problem.

Warm regards, My Linh Thibodeau

abishekarun commented 6 years ago

Peer Review:

Hi, @KateJohnson! You did excellent work on this homework, and you went further by exploring a different dataset (country codes).

Data reshaping

Data joining

Some suggestions

The fact that you had mentioned the struggles that you had experienced is very helpful and useful for other students. It was also nice to see you mention the resources that helped you.

Overall, I think your homework was really well done and hope you can keep it up!

Regards, Arun Rajendran

derekcho commented 6 years ago

Hi @KateJohnson, here are some comments about your hw04:

Your grade will be emailed to you at a later date.