dsy109 / mixtools

Tools for Analyzing Finite Mixture Models

Predicting components for new data? #1

Open mschubert opened 3 years ago

mschubert commented 3 years ago

Hi,

Thank you for this useful package!

I'm using mixtools to fit a Normal mixture model similar to the vignette example:

library(mixtools)
data(faithful)
attach(faithful)

wait1 = normalmixEM(waiting, mu=c(50, 70))

head(wait1$posterior, 2)
#              comp.1       comp.2
#   [1,] 1.023874e-04 9.998976e-01
#   [2,] 9.999089e-01 9.109251e-05

Is there a way to predict the posterior component probabilities for new data (analogous to, e.g., predict on an lm model)?

predict(wait1, newdata=data.frame(waiting=c(42,44,61)))
# Error in UseMethod("predict") :
#   no applicable method for 'predict' applied to an object of class "mixEM"

mod = lm(eruptions ~ waiting, data=faithful)
predict(mod, newdata=data.frame(waiting=c(45,77)))

dsy109 commented 3 years ago

I am happy to hear that mixtools has been useful for your research, @mschubert!

Unfortunately, we have not yet written an S3 method for prediction on an object of class mixEM. With that said, the following will accomplish what you are asking, at least for the example that you provided:

library(mixtools)
data(faithful)
attach(faithful)

set.seed(1)

wait1 <- normalmixEM(waiting, mu = c(50, 70))

# Assumes newdata is a numeric vector (not a data frame)
pred.fn <- function(EMout, newdata){
  # Weighted component densities: one row per new observation, one column per component
  out <- t(sapply(seq_along(newdata), function(i) EMout$lambda*dnorm(newdata[i], mean = EMout$mu, sd = EMout$sigma)))
  # Normalize each row so that the posterior probabilities sum to 1
  out <- out/rowSums(out)
  rownames(out) <- newdata
  colnames(out) <- paste("comp", ".", seq_along(EMout$lambda), sep = "")
  return(out)
}

pred.fn(wait1, newdata = c(42, 44, 61))

#      comp.1       comp.2
#42 1.0000000 1.259483e-08
#44 0.9999999 5.536425e-08
#61 0.9841608 1.583920e-02

Note in the above that I have assumed that the input for newdata is a numeric vector and not a data frame.
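
For reference, the quantity that pred.fn computes can be written out as Bayes' rule applied to the fitted mixture. The notation here is an assumption introduced for illustration and does not come from the thread or the package documentation: k is the number of components, \hat\lambda_j, \hat\mu_j, and \hat\sigma_j are the fitted mixing proportions, means, and standard deviations (EMout$lambda, EMout$mu, EMout$sigma), and \phi is the normal density (dnorm).

```latex
\[
  \hat{p}_j(x)
  = \frac{\hat\lambda_j \,\phi(x;\, \hat\mu_j, \hat\sigma_j)}
         {\sum_{l=1}^{k} \hat\lambda_l \,\phi(x;\, \hat\mu_l, \hat\sigma_l)},
  \qquad j = 1, \dots, k .
\]
```

The numerator corresponds to the sapply step in the code and the denominator to the row-wise normalization, so each row of the returned matrix sums to 1.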

mschubert commented 3 years ago

Thank you very much for your quick answer! Your code example helps me to solve my immediate task.

For myself (and probably others) it would be great if you could also implement a predict function in mixtools eventually.

drh20drh20 commented 3 years ago

Thanks @mschubert for the question and suggestion. @dsy109, maybe it would be good to add some S3 methods to the package as suggested. Something to keep in mind!

dsy109 commented 3 years ago

Glad it worked for your immediate task, @mschubert.

@drh20drh20, it will be good to add such an S3 method for predict. I will add it to my list. Note that there are currently some S3 methods, namely for density, print, and summary.
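
As a rough illustration of what such a method could look like, here is a minimal sketch. It is hypothetical and not part of mixtools: the name predict.mixEM, its arguments, and the handling of a one-column data frame are all assumptions. It covers only univariate normal mixtures with lambda, mu, and sigma fields; other fits of class mixEM would need their own handling. The fit object below is a made-up stand-in so that the sketch runs without mixtools installed.

```r
# Hypothetical S3 method; NOT part of mixtools.
# Assumes `object` has lambda, mu, and sigma fields like a normalmixEM() fit.
predict.mixEM <- function(object, newdata, ...) {
  # Accept a one-column data frame as well as a numeric vector
  if (is.data.frame(newdata)) {
    stopifnot(ncol(newdata) == 1)
    newdata <- newdata[[1]]
  }
  stopifnot(is.numeric(newdata))

  k <- length(object$lambda)
  # Recycle sigma in case a single common standard deviation is stored
  sigma <- rep_len(object$sigma, k)

  # Weighted component densities: lambda_j * dnorm(x_i, mu_j, sigma_j)
  dens <- vapply(
    seq_len(k),
    function(j) object$lambda[j] * dnorm(newdata, mean = object$mu[j], sd = sigma[j]),
    numeric(length(newdata))
  )
  # Force an n x k matrix even when newdata has a single element
  dens <- matrix(dens, nrow = length(newdata), ncol = k)

  # Normalize each row so the posterior probabilities sum to 1
  post <- dens / rowSums(dens)
  dimnames(post) <- list(as.character(newdata), paste0("comp.", seq_len(k)))
  post
}

# Stand-in for a fitted model; the parameter values are made up for illustration
fit <- structure(
  list(lambda = c(0.4, 0.6), mu = c(55, 80), sigma = c(6, 6)),
  class = "mixEM"
)

post <- predict(fit, newdata = data.frame(waiting = c(42, 44, 61)))
print(post)
```

With a real fit, the call would presumably be predict(wait1, newdata = c(42, 44, 61)), which dispatches on the mixEM class in the same way as the existing density, print, and summary methods.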

drh20drh20 commented 3 years ago

@dsy109 I may have access to an undergraduate researcher. Do you want me to put them on it? I'm also happy to defer to you; your call.

dsy109 commented 3 years ago

@drh20drh20 I am fine with that arrangement. @Kedai-Cheng is working on a laundry list of items for mixtools updates, but he is in the midst of a major overhaul of the graphics. So bringing in an undergraduate researcher for this would be helpful.

drh20drh20 commented 3 years ago

Wonderful. I will try to get the ball rolling on it and let you know if I have any issues.

drh20drh20 commented 3 years ago

Status update: I have an undergraduate working on this; we don't have an estimated completion date.