STAT545-UBC / Discussion

group_by() and summarize() delete countries from figure #382

Open theresap opened 7 years ago

theresap commented 7 years ago

Hi,

my aim for hw05 was to select only European countries from the Gapminder data set, then reorder them by their minimum life expectancy, and make a figure of the rearranged data.

I used the following code to first inspect the unordered countries and their minimum life expectancy:

library(gapminder)
library(tidyverse)
library(forcats)  
gapE <- gapminder %>% 
  filter(continent == "Europe")
levels(gapE$continent) <- gapE$continent %>% 
  fct_drop()
levels(gapE$country) <- gapE$country %>% 
  fct_drop()
levels(gapE$country)
str(gapE$country)

gapE <- gapE %>% 
  group_by(country) %>% 
  summarize(min_lifeExp = min(lifeExp)) %>% 
  ungroup()

ggplot(gapE, aes(x = min_lifeExp, y = country)) +
  geom_point()

However, the new dataframe (gapE) somehow only returns 11 out of the 30 European countries. The same thing happens with the Americas - then it only returns 11 out of 25 American countries.

I would be grateful for any advice on what is causing this trouble :-).

Thanks, Theresa

theresap commented 7 years ago

Update: I have since figured out that the fct_drop() calls were somehow responsible for this. I still can't figure out why, though.

ghost commented 7 years ago

Hi,

Try:

gapE$continent <- gapE$continent %>%
  fct_drop()
gapE$country <- gapE$country %>%
  fct_drop()

theresap commented 7 years ago

Yay, this works! Thank you.
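
For reference, the original pipeline with this fix applied might look roughly like the sketch below. It assumes the gapminder, dplyr, forcats and ggplot2 packages are installed. The name gap_min is made up here so that the summary does not overwrite gapE. The mutate() calls are a swapped-in but equivalent way of writing the gapE$country <- ... assignments above, and the fct_reorder() step is one possible way to get the reordering by minimum life expectancy that the original post was aiming for.

```r
library(gapminder)
library(dplyr)
library(forcats)
library(ggplot2)

# Keep only Europe, then drop unused levels on the factors themselves
gapE <- gapminder %>%
  filter(continent == "Europe") %>%
  mutate(
    continent = fct_drop(continent),
    country = fct_drop(country)
  )

# One row per country with its minimum life expectancy
# (gap_min is a hypothetical name, not from the original post)
gap_min <- gapE %>%
  group_by(country) %>%
  summarize(min_lifeExp = min(lifeExp))

# Order the countries by their minimum life expectancy for plotting
p <- ggplot(gap_min, aes(x = min_lifeExp, y = fct_reorder(country, min_lifeExp))) +
  geom_point() +
  labs(y = "country")

print(p)
```

With the levels dropped this way, the summary should have one row for each of the 30 European countries.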

jennybc commented 7 years ago

@philstraforelli Yes, thanks! I was about to say ... use fct_drop() to operate on the factor itself, not on its levels.
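
As for why the countries disappeared, here is a small sketch on a toy factor (made-up data, not gapminder; the names f, sub, bad and good are purely illustrative). In base R, levels(x) <- value relabels the existing levels by position: level i gets the label value[i], and levels that end up with the same label are merged. Handing it the whole dropped factor therefore uses the first few row values as the new labels, which is presumably how several gapminder countries ended up collapsed into one.

```r
library(forcats)

# Toy factor with 4 levels; "b" never occurs
f <- factor(c("a", "a", "c", "c", "d", "d"), levels = c("a", "b", "c", "d"))

# Subsetting keeps all 4 levels, just like filter() does
sub <- f[f != "a"]
levels(sub)
# "a" "b" "c" "d"

# Wrong: assign the dropped factor to the levels.
# The new labels are taken by position from c("c", "c", "d", "d"),
# so old level 3 ("c") and old level 4 ("d") are both relabelled "d".
bad <- sub
levels(bad) <- fct_drop(sub)
as.character(bad)
# "d" "d" "d" "d"

# Right: replace the factor itself with the dropped version
good <- fct_drop(sub)
as.character(good)
# "c" "c" "d" "d"
levels(good)
# "c" "d"
```

In the toy example the two remaining values collapse into one; in the original post the same mechanism would explain 30 countries collapsing into 11.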