benbhansen-stats / propertee

Prognostic Regression Offsets with Propagation of ERrors, for Treatment Effect Estimation (IES R305D210029).
https://benbhansen-stats.github.io/propertee/
Other
2 stars 0 forks source link

sharpen cov_adj()'s "partial residuals not implemented" warning #184

Closed benthestatistician closed 1 week ago

benthestatistician commented 2 months ago

cov_adj() has a warning for the circumstance that the user gives a covariance model involving the treatment variable, but as recorded in the Design it's not the sort of treatment variable that the function is equipped to partial out. E.g.,

Warning message: In propertee::cov_adj(cmod, design = des, newdata = as.data.frame(Q_df), : The treatment variable is in the covariance adjustment model, and is neither logical or numeric; for now, partial residuals only implemented for logical or numeric treatments

Problem is, the trigger conditions for this warning don't currently include a check for whether this variable actually contributes to the covariance model; see cov_adj.R#L64-79. I hit it in a case with a covariance model without the treatment variable, and was quite confused.

  1. Let's add such a check. Or perhaps adjust the warning, if adjusting the conditions turns out to be unduly tricky.
  2. From cov_adj()'s perspective, it seems that we prefer that treatments be encoded in Design's as factor rather than character variables. (Of course logicals are better yet, but when there are more than 2 categories, with factors we get to follow the convention of taking factor level 1 to be the control condition, whereas with characters we don't have anything to identify that condition with.) So let's adjust our warning to indicate to the user that for cov_adj()'s purposes it's safer to encode their character treatment variables as factors.
benthestatistician commented 2 months ago

Could you take a look, @josherrickson? At your convenience. I've tentatively assigned you. Part 2 relates to #78, and we can discuss it before moving ahead with it, if you think that's appropriate.

josherrickson commented 1 week ago

I think it's just the text of the warning that's problematic? The point of https://github.com/benbhansen-stats/propertee/blob/08fb12d1fcf77559ad9fd45bb686933850b638ea/R/cov_adj.R#L64-L79 is to set the treatment variable to it's reference level before going into the PreSandwichLayer. If the treatment variable is something we don't know how to deal with, we want to let users know before moving forward, correct?

If so, then we can just simplify the warning to:

The treatment variable is in the covariance adjustment model, and is neither logical or numeric; for now, partial residuals only implemented for logical or numeric treatments

benthestatistician commented 1 week ago

I think that's a good economy-class solution if we add something to the warning to note that it doesn't matter if the treatment variable isn't an independent variable of the covariance model. Something like:

If treatment is an independent variable of the covariance model, predictions may (incorrectly) include treatment contributions. For now, setting these contributions to 0 is implemented only for logical or numeric treatments, but this treatment is a factor.