amrei-stammann / alpaca

An R-package for fitting glm's with high-dimensional k-way fixed effects

predicted values for bias-corrected model parameters #14

Open gbiele opened 3 years ago

gbiele commented 3 years ago

Hi, thanks for making this package available!

I found, by looking at your code, that if one applies a bias correction and then uses predict for the de-biased feglm model, one still gets the predictions from the biased coefficients. It's not hard to get predictions from the de-biased model coefficients. Is there a reason that predict returns predictions inconsistent with de-biased model coefficients?

Cheers - Guido

PS: I'm sure you can write better code than this, but here is code to obtain predictions that match de-biased coefficients:

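pred.feglm = 
  function(object, new_data = NULL, type = "link") {
  # Use new_data if supplied, otherwise the data stored in the fitted object
  if (!is.null(new_data)) {
    data = new_data
  } else {
    data = object$data
  }

  family = object[["family"]]
  # Inverse link for logit or probit; plogis() is the base R inverse logit
  lnk = if (family[["link"]] == "logit") plogis else pnorm
  FEs = getFEs(object)

  # Assumes the fitted object stores its formula as object[["formula"]]
  formula = object[["formula"]]
  X = model.matrix(formula, data, rhs = 1L)[, - 1L, drop = FALSE]
  # Elementwise product: only correct with a single predictor
  eta = coef(object) * X
  for (k in names(FEs)) {
    # Look up each observation's fixed effect by its level name
    eta = eta + FEs[[k]][as.character(data[[k]])]
  }

  if (type == "response") {
    return(lnk(eta))
  } else {
    return(eta)
  }
}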
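
To make the structure of such a prediction concrete, here is a minimal base R sketch with made-up numbers. Nothing in it comes from alpaca: `beta`, `FEs`, and `data` are hypothetical toy inputs, and `FEs` only mimics a named list of named vectors. It shows the linear predictor as the regressor part plus one fixed effect per category, looked up by level name.

```r
# Hypothetical toy inputs (not alpaca output)
beta = c(x1 = 0.8, x2 = -0.3)
FEs = list(
  id   = c(a = 0.2, b = -0.1, c = 0.4),
  year = c("2001" = 0.0, "2002" = 0.3)
)
data = data.frame(
  x1 = c(1, 0, 2),
  x2 = c(0, 1, 1),
  id = factor(c("c", "a", "c")),
  year = c(2002, 2001, 2001)
)

# Regressor part of the linear predictor
X = as.matrix(data[, names(beta)])
eta = as.vector(X %*% beta)

# Add one fixed effect per category. as.character() matters here:
# indexing a named vector with a factor would use the integer codes,
# not the level names.
for (k in names(FEs)) {
  eta = eta + unname(FEs[[k]][as.character(data[[k]])])
}

# Response scale for a logit model
prob = plogis(eta)
```

Levels that appear in new data but not in the estimated fixed effects would give `NA` with this lookup, so a real implementation would need to decide how to handle them.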

amrei-stammann commented 3 years ago

Hi Guido,

Predictions are not based on the de-biased estimator because so far I have not thought about whether this is theoretically justified.

Your code is fine but you should use matrix multiplication to compute your eta.

eta = as.vector(X %*% coef(object))
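
A small base R sketch of why this matters, using a made-up design matrix and coefficient vector (`X`, `beta`, and the other names are illustrative only): with more than one regressor the elementwise product recycles the coefficients down the columns and returns a matrix, whereas the matrix product returns one linear predictor per observation.

```r
# Toy design matrix (3 observations, 2 regressors) and coefficients
X = matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)
beta = c(0.5, -1)

# Matrix product: one linear predictor per row of X
eta_matmul = as.vector(X %*% beta)

# Elementwise product: beta is recycled in column-major order,
# so the result is a 3 x 2 matrix, not a vector of linear predictors
eta_elementwise = beta * X

# With a single regressor the two approaches coincide
X1 = X[, 1, drop = FALSE]
beta1 = beta[1]
same_with_one = all.equal(as.vector(beta1 * X1), as.vector(X1 %*% beta1))
```

Here `eta_matmul` has length 3, `eta_elementwise` has dimensions 3 x 2, and `same_with_one` is `TRUE`.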

Best wishes,

Amrei

gbiele commented 3 years ago

Hi Amrei,

I cannot speak to whether it is theoretically justified to make predictions based on bias-corrected parameters (I am not 100% sure what this means in this context :-)). On the other hand, in the current state, the predictions and parameter estimates in a bias-corrected model object are not consistent, which also does not seem ideal.
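
The kind of inconsistency meant here can be illustrated with a rough base R analogy. This sketch does not use alpaca: it fits an ordinary `glm`, and the coefficient shift is a hypothetical stand-in for a bias correction, chosen arbitrarily. The point is only that changing the coefficients of a fitted object does not change the linear predictor stored in it.

```r
set.seed(1)
n = 200
x = rnorm(n)
y = rbinom(n, 1, plogis(0.5 * x))
fit = glm(y ~ x, family = binomial("logit"))

# Hypothetical stand-in for a bias correction: adjust the slope and
# leave everything else in the fitted object untouched
fit_bc = fit
fit_bc$coefficients = coef(fit) - c(0, 0.1)

X = model.matrix(fit)
eta_stored = unname(fit_bc$linear.predictors)  # still from the original fit
eta_manual = as.vector(X %*% coef(fit_bc))     # matches adjusted coefficients

# Largest discrepancy between the two sets of linear predictors
gap = max(abs(eta_stored - eta_manual))
```

`gap` is strictly positive, so anything computed from the stored linear predictor no longer agrees with the coefficients the object reports.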

You are of course right that I should have used matrix multiplication. (The reason I did not was that I just did a quick test with only one predictor, in which case it did not matter much.)

Cheers - Guido

PS: I don't mind if you close the issue.