JeffreyRacine / R-Package-np

R package np (Nonparametric Kernel Smoothing Methods for Mixed Data Types)
https://socialsciences.mcmaster.ca/people/racinej

uocquantile for subset of factor #7

Open AntonioFidalgo opened 9 years ago

AntonioFidalgo commented 9 years ago

Hi,

The uocquantile function seems to return the wrong value when applied to a subset of a factor that excludes the mode of the whole factor. The following example illustrates the issue.

library(np)
fruit.l <- as.factor(c("a","n","a","n","a","s"))
uocquantile(fruit.l,.5)

[1] a

Levels: a n s

uocquantile(fruit.l[fruit.l!="a"],.5)

[1] s

Levels: a n s

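The second call should return n, the mode of the subset (n, n, s), but it returns s. This seems to happen because uocquantile first uses table(x) and then unique(x), and these can have different lengths. table(x) counts every level of the whole factor, including levels that no longer occur after subsetting (with a count of zero), while unique(x) returns only the values actually present.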
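The mismatch can be reproduced without np. This is a minimal sketch that assumes uocquantile takes the position of the largest count in table(x) and uses it to index sort(unique(x)). That is my reading of the behaviour, not something checked against the package source. The names sub, tq, ux and j are just for illustration.

```r
fruit.l <- as.factor(c("a", "n", "a", "n", "a", "s"))
sub <- fruit.l[fruit.l != "a"]   # values n, n, s; levels are still a, n, s

tq <- unclass(table(sub))        # one count per level, zeros included
ux <- sort(unique(sub))          # only the values that are present

length(tq)                       # 3
length(ux)                       # 2

# Position of the largest count, relative to levels(sub)
j <- which(tq == max(tq))[1]

# Assumed behaviour: the same position applied to the shorter vector
as.character(ux[j])              # "s", the wrong answer
as.character(levels(sub)[j])     # "n", the actual mode
```

The index j refers to a position among the three levels, so applying it to the two-element ux picks the wrong element.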

The following alternative mode functions for factor x avoid this issue.

mode1 <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

mode2 <- function(x) {
  tq <- unclass(table(x))
  j <- which(tq == max(tq))[1]
  factor(sort(unique(levels(x)))[j], levels=levels(x))
}
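Another possible variant, offered only as a sketch, indexes levels(x) directly. table(x) reports its counts in the order of levels(x), so the two stay aligned even when the levels are not in sorted order. factor_mode is a hypothetical helper name and is not part of np.

```r
# Hypothetical helper (not part of np): mode of an unordered factor.
factor_mode <- function(x) {
  stopifnot(is.factor(x))
  tq <- table(x)        # counts in the order of levels(x), zeros included
  j <- which.max(tq)    # position of the first most frequent level
  factor(levels(x)[j], levels = levels(x))
}

fruit.l <- as.factor(c("a", "n", "a", "n", "a", "s"))
factor_mode(fruit.l)                   # mode of the full factor
factor_mode(fruit.l[fruit.l != "a"])   # mode of the subset without "a"
```

With the example above, the first call should give a and the second n, both carrying the original levels a n s.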

Thanks!