HertieDataScience / SyllabusAndLectures

Hertie School of Governance Introduction to Collaborative Social Science Data Analysis
MIT License

Kable Question #64

jasmincantzler closed this issue 8 years ago

jasmincantzler commented 8 years ago

Dear all,

Since some of you are far more familiar with R than I am, I am posting this here in the hope that someone can restore my sanity. I am trying to generate a basic summary statistics table. As a first step, I want to calculate the mean of a variable grouped by a factor variable, for example the mean age per country.

If someone could help me out I would appreciate it greatly. R and I are not yet friends :(

(just in case you want to have a look: https://github.com/jasmincantzler/TaxMorale/blob/master/Assignment%203/Data%20Analysis.R)

mcallaghan commented 8 years ago

You want to do something like

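library(dplyr)

# Group the data frame by country, then compute the mean age per group
summary_df <- df %>%
  group_by(country) %>%
  summarise(
    mean = mean(age, na.rm = TRUE)  # na.rm = TRUE drops missing ages
  )
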
jasmincantzler commented 8 years ago

It says I cannot group by "country" because it's a factor:

Error in UseMethod("group_by_") : no applicable method for 'group_by_' applied to an object of class "factor"

mcallaghan commented 8 years ago

Oh, weird. What about group_by(as.character(country))?

jasmincantzler commented 8 years ago

Nope :( group_by and as.character don't want to go together (group_by(as.character(mydata$Country)))

Error in UseMethod("group_by_") : no applicable method for 'group_by_' applied to an object of class "character"

mcallaghan commented 8 years ago

Ah, I have already supplied the data frame as the first argument to dplyr's group_by function by piping it in (%>%), so you don't need mydata$Country, just Country.
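
To make the difference concrete, here is a minimal sketch on made-up data. The data frame `mydata` and its columns `Country` and `Age` are assumptions standing in for the real data set, and `bad`, `summary_df` and `same_df` are just illustrative names. It suggests that the factor itself is probably not the problem: the error appears when a column vector, rather than a data frame, ends up as the first argument of group_by.

```r
library(dplyr)

# Hypothetical toy data; the real data set will have different values.
mydata <- data.frame(
  Country = factor(c("DE", "DE", "FR", "FR")),
  Age     = c(30, 40, 50, NA)
)

# Fails: group_by() receives a factor vector instead of a data frame.
# tryCatch() captures the error message so the script keeps running.
bad <- tryCatch(
  group_by(mydata$Country),
  error = function(e) conditionMessage(e)
)

# Works: the pipe passes mydata as the first argument,
# so the grouping column is referred to by its bare name.
summary_df <- mydata %>%
  group_by(Country) %>%
  summarise(mean_age = mean(Age, na.rm = TRUE))

# The same call written without the pipe.
same_df <- summarise(group_by(mydata, Country), mean_age = mean(Age, na.rm = TRUE))
```

With this toy data, `summary_df` has one row per country, and grouping by the factor column works without converting it with as.character.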

jasmincantzler commented 8 years ago

THANK YOU!