jacobbien / hierNet

A Lasso for Hierarchical Interactions
4 stars 3 forks source link

Reconstruction gives different results than predict() #3

Closed marastadler closed 3 years ago

marastadler commented 3 years ago

Hi Jacob,

This package is super helpful :-)!

I'm using hiernet(x = model.matrix( ~ -1 + ., as.data.frame(X)), y = y, diagonal = FALSE, strong = TRUE) without intercept to analyse my proteomics data. When trying to reconstruct the output y on my own after fitting the model, I'm getting slightly different results than with predict.hierNet(object = fit, newx = X, newzz = compute.interactions.c(X, diagonal = F)). What I'm doing is the following: theta <- (fit$th + t(fit$th))/2 interact_p <- theta[lower.tri(theta)] coef_p <- c(fit$bp + (-fit$bn), interact_p)

Y_hat <- cbind(X, compute.interaction.c(X, diagonal = F)) %*% coef_p

Do you have any idea what the difference is? (The compute_yhat_zz function did not help me figure out the difference.)

Thanks a lot! Best, Mara

jacobbien commented 3 years ago

Hi Mara, glad to hear the package is useful for you!

In practice, I'd definitely recommend using hierNet's predict function, as in predict(fit, newx = X). However, I understand that you still might like to try it yourself to make sure it's doing what you think it is. Below is a modification of your code that gives an exact match to the output of hierNet's predict function. The key is that whatever centering/scaling happened at the time of training also has to happen at the time of prediction:

library(hierNet)
set.seed(123)
n <- 100
p <- 10
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
diag <- FALSE
fit <- hierNet(x = X, y = y, lam = 2, diagonal = diag, strong = TRUE) 
yhat <- predict(fit, newx = X)

# standardize the new X in the same way that it was in training:
stand_X <- scale(X, center = fit$mx, scale = fit$sx)

# use the scaled X to form the interactions, and then center the interactions
# in the same way they were in training:
zz <- scale(compute.interactions.c(stand_X, diagonal = diag),
            center = fit$mzz, 
            scale = FALSE)

# this part is what you had:
theta <- (fit$th + t(fit$th))/2
interact_p <- theta[lower.tri(theta, diag = diag)]
coef_p <- c(fit$bp - fit$bn, interact_p)

# add fit$my to predictions to account for the internal centering:
Y_hat <- fit$my + cbind(stand_X, zz) %*% coef_p

range(yhat - Y_hat)

One more thing I should clarify: Your model.matrix( ~ -1 + ., as.data.frame(X)) code is not preventing hierNet from using an intercept. In fact, hierNet always uses an intercept.

marastadler commented 3 years ago

Hi Jacob, Many thanks for the detailed answer and the code example!