duttashi / learnr


How to group factor levels? #63

Closed duttashi closed 4 years ago

duttashi commented 4 years ago

This question was originally asked on Stack Overflow. I'm reproducing it here for reference:

Suppose a dataset has a factor column with values like these:

> mydata                    
   question id           value
1         1  1      not likely
2         2  1      not likely
3         3  1      not likely
4         4  1      not likely
5         5  1 slightly likely
6         1  2     very likely
7         2  2 slightly likely
8         3  2 slightly likely
9         4  2      not likely
10        5  2     very likely

So how do I group the factor levels of the variable value into, say, two levels?

duttashi commented 4 years ago

Solution

Assign a named list to levels(): each name is a new level, and each element holds the old levels to merge into it.

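set.seed(42)
# toy data: 5 questions x 5 ids, with a randomly sampled 4-level factor
mydata <- transform(
  expand.grid(question = 1:5, id = 1:5),
  # levels = 1:4 keeps all four labels valid even if a value is never drawn
  value = factor(sample(1:4, 25, replace = TRUE), levels = 1:4,
                 labels = c("not likely", "slightly likely",
                            "likely", "very likely"))
)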
> str(mydata)
'data.frame':   25 obs. of  3 variables:
 $ question: int  1 2 3 4 5 1 2 3 4 5 ...
 $ id      : int  1 1 1 1 1 2 2 2 2 2 ...
 $ value   : Factor w/ 4 levels "not likely","slightly likely",..: 1 1 1 1 2 4 2 2 1 4 ...
> levels(mydata$value)
[1] "not likely"      "slightly likely" "likely"          "very likely" 
# group the old levels under new names by assigning a named list
levels(mydata$value) <- list("unlikely"=c("not likely", "slightly likely"),
                             "likely"=c("likely", "very likely"))
> levels(mydata$value)
[1] "unlikely" "likely"
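
To check that the regrouping did what was intended, it can help to regroup a copy and cross-tabulate it against the original. The following is a minimal sketch rather than part of the original answer: the vectors value, grouped and partial are hypothetical names, and the data is written out explicitly so that nothing depends on random sampling. It also illustrates a caveat that, as far as I know, applies to the list form: a level that is left out of the list is turned into NA instead of being kept.

```r
# Hypothetical toy factor with explicit values (no random sampling)
value <- factor(
  c("not likely", "slightly likely", "likely", "very likely", "not likely"),
  levels = c("not likely", "slightly likely", "likely", "very likely")
)

# Regroup a copy so the original stays available for comparison
grouped <- value
levels(grouped) <- list(
  unlikely = c("not likely", "slightly likely"),
  likely   = c("likely", "very likely")
)

# Cross-tabulate old against new levels to confirm the mapping
print(table(original = value, grouped = grouped))

# Caveat: old levels that are not mentioned in the list become NA
partial <- value
levels(partial) <- list(unlikely = c("not likely", "slightly likely"))
print(sum(is.na(partial)))
```

Working on a copy is a design choice for the check only; assigning straight to mydata$value, as in the solution above, works the same way once the mapping is known to be complete.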