Closed by josherrickson 1 year ago
Subgrouping variables are the main thing we're focused on, but I don't know that we should exclude other types of moderating variables, so to speak. (If we eventually decide that we should, I might sooner enforce it by erroring on numeric right-hand-side variables, rather than converting them.)
Can we readily describe what gets done with a numeric right-hand-side variable? I'd guess that as things now stand, for numeric x, lmitt(y ~ x, design = des) gets you the result of:
x and the intercept into the assignment variable, giving $\tilde{z}$
y on $\tilde{z}$, with no intercept
Sound right? If so, then one use of this sort of thing would be in estimating interactions of the treatment effect with a Peters-Belson prediction.
Since we don't have that residualization implemented yet (#59), lmitt(y ~ x, ...) currently just fits lm(y ~ assigned() + assigned():x, ...), which then reports a single assigned():x coefficient.
Once that residualization is implemented, we can choose how we want it to work of course, but if we didn't make special cases, what you described would basically work, albeit with perhaps a different/unclear name in step 3.
Users have to deal with a similar thing when they input a numeric variable to lm that they actually mean to be categorical. When they have a binary subgroup, using a numeric variable works fine and the summary and coefficients of their model are what they expect. When their numeric subgroup variable is non-binary, though, and they see their model only has one coefficient, they'll realize they needed to call as.factor. I think our package aligning with that typical process makes sense, rather than potentially creating other issues by forcing someone to make a binary numeric variable a factor.
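The lm behaviour described in this comment can be reproduced with a small sketch. The data frame and the column names sbgrp and y are made up for illustration; only base R's lm and as.factor are used, nothing from the package under discussion.

```r
# Toy data: a three-level subgroup stored as a number (1, 2, 3).
set.seed(1)
dat <- data.frame(
  sbgrp = rep(1:3, each = 20),
  y     = rnorm(60)
)

# Numeric coding: lm() treats sbgrp as continuous and fits one slope.
fit_num <- lm(y ~ sbgrp, data = dat)
names(coef(fit_num))
# "(Intercept)" "sbgrp"

# Factor coding: lm() fits one contrast per non-reference level.
fit_fac <- lm(y ~ as.factor(sbgrp), data = dat)
names(coef(fit_fac))
# "(Intercept)" "as.factor(sbgrp)2" "as.factor(sbgrp)3"
```

With a 0/1 subgroup, both codings yield a single subgroup coefficient, which is why the numeric version only surprises users once there are more than two levels.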
Per discussion: If a user passes a continuous sbgrp, return coefficients on both assigned() and assigned():sbgrp.
I implemented continuous "subgroup" variables returning a main effect + interaction in a branch; if no one wakes up in a cold sweat because this is the wrong approach, I'll merge it into main in a few days.
Bumping this for myself - I lost track of this branch. This is coming up in reference to #128 and an offline discussion Ben and I had.
Continuous moderators are now supported. @jwasserman2 @xinhew0708 Note that the model changes. With a categorical moderator x, we fit:
y ~ 1 + x + x:treatment
With a continuous moderator x, we fit:
y ~ 1 + x + treatment + x:treatment
I'm unsure if this will have any impact on your calculations/code. If it'd be useful to track in absorbed_moderator whether it's categorical or continuous, let me know.
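To make the difference between the two model forms concrete, here is a rough sketch using plain lm. It assumes a numeric 0/1 treatment column as a stand-in for the package's assignment term, and the names dat, x_cat and x_num are invented for the example; it does not reproduce what lmitt does internally.

```r
# Simulated data with a 0/1 treatment, a two-level factor, and a numeric moderator.
set.seed(2)
n <- 80
dat <- data.frame(
  treatment = rep(0:1, times = n / 2),
  x_cat     = factor(rep(c("a", "b"), each = n / 2)),
  x_num     = rnorm(n),
  y         = rnorm(n)
)

# Categorical moderator: no treatment main effect, so the interaction
# expands to one treatment coefficient per level of x_cat.
fit_cat <- lm(y ~ 1 + x_cat + x_cat:treatment, data = dat)
names(coef(fit_cat))
# "(Intercept)" "x_catb" "x_cata:treatment" "x_catb:treatment"

# Continuous moderator: treatment main effect plus a single interaction slope.
fit_num <- lm(y ~ 1 + x_num + treatment + x_num:treatment, data = dat)
names(coef(fit_num))
# "(Intercept)" "x_num" "treatment" "x_num:treatment"
```

In the categorical fit each interaction coefficient is a subgroup-specific treatment effect, whereas in the continuous fit the treatment coefficient is the effect at x_num = 0 and the interaction is the change in that effect per unit of x_num.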
Given the trouble we went through the last few months trying to understand residualization, I feel like there should be tests for the coefficients and their standard errors before this gets pushed to main. I think we should also make a package version prior to this commit, and this commit merits a version increment.
Making a release and incrementing version seems prudent.
I'm not seeing the risk to putting this in main - it has no impact on models without continuous moderators and passes all existing tests. While I agree it may not be ready for use in actual analysis yet, I don't see how this could negatively affect current analyses.
If a user calls something like
we want to estimate a treatment effect for each level of sbgrp via interaction. This interaction is currently carried out without issue, but we neither check nor force sbgrp to be categorical.
1) Do we want to convert sbgrp to categorical ourselves?
2) If yes to 1., do we want to leave it in the model as as.factor(sbgrp) or do some renaming so that it's actually sbgrp? Just sbgrp would look nicer, but be harder to do. Additionally, leaving it as as.factor(sbgrp) is a nice reminder to users to do that conversion a priori.
3) If yes to 1., do we want to put any restrictions on the variable before we're willing to convert it? E.g. maximum number of unique groups? Something like "Looks like sbgrp is continuous, so no automatic conversion to factor."
4) If no to 1., do we want an error or warning on a continuous variable being passed?
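As a way to weigh questions 1), 3) and 4), one possible check could be prototyped as below. The helper maybe_factor and its max_levels cutoff of 10 are hypothetical, chosen only for illustration; nothing like it is confirmed to exist in the package.

```r
# Hypothetical helper, not part of the package: converts a variable to a
# factor unless it is numeric with more than `max_levels` distinct values.
maybe_factor <- function(v, max_levels = 10) {
  # Non-numeric input (character, logical, factor) is always treated as categorical.
  if (!is.numeric(v)) {
    return(as.factor(v))
  }
  if (length(unique(v)) > max_levels) {
    warning("Looks like the variable is continuous, so no automatic conversion to factor.")
    return(v)
  }
  as.factor(v)
}

# Few distinct values: converted to a factor.
is.factor(maybe_factor(c(1, 2, 3, 1, 2, 3)))

# Many distinct values: left numeric, with a warning (suppressed here).
is.numeric(suppressWarnings(maybe_factor(seq(0, 1, length.out = 50))))
```

Swapping warning() for stop() would give the erroring variant raised in question 4).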