Open mattansb opened 4 years ago
I think the idea behind centering at mean is that this removes the correlation between higher and lower level predictors in a mixed models context. So for "demeaning", the centering at the mean value would be appropriate. Nonetheless, we could add further options.
How I was taught it, person-centering is done to split two effects of X on Y: the stable "trait" of X and the unstable "situational" part of X. Using the mean as a measure of the "trait" part also has the benefit of uncorrelating these parts, but the actual centrality index depends on the researcher's question (e.g., "How do differences in the starting values of x predict y, and how do changes from the starting point predict y?" would use `x[time==0]` for centering, etc.).
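To make the "trait" versus "situational" split concrete, here is a minimal sketch in Python (used purely for illustration; it is not the package's R implementation). The helpers `demean` and `covariance` and the toy data are hypothetical. It splits `x` into a between part (each person's mean) and a within part (the deviation from that mean), then checks that the two parts have zero sample covariance when the mean is the centering value.

```python
from statistics import mean


def demean(x, group):
    """Split x into a between part (group mean) and a within part (deviation)."""
    pools = {}
    for value, g in zip(x, group):
        pools.setdefault(g, []).append(value)
    centers = {g: mean(values) for g, values in pools.items()}
    between = [centers[g] for g in group]
    within = [value - centers[g] for value, g in zip(x, group)]
    return between, within


def covariance(a, b):
    """Sample covariance of two equally long sequences."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


# Toy data (made up): two persons, three measurements each.
x = [1.0, 2.0, 3.0, 10.0, 12.0, 14.0]
person = ["a", "a", "a", "b", "b", "b"]

between, within = demean(x, person)
print(between)                      # the "trait" part, constant within person
print(within)                       # the "situational" part, sums to zero per person
print(covariance(between, within))  # zero when centering at the group mean
```

The zero covariance follows from the within part summing to zero inside every group. Centering at another value, such as the first measurement, does not guarantee this.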
I thought the crucial part is the correlation of level1 and level2 predictors, which violates model assumptions? I agree that centering at other sensible values might be better on certain occasions, but here it was a purely statistical / mathematical reason to choose the mean? Anyway, we can enhance this method.
> I thought the crucial part is the correlation of level1 and level2 predictors, which violates model assumptions?
Hmmm, which assumption? Plain old multicollinearity? If it's bad, it will hurt the interpretability of the coefficients - so there is a possible trade-off here between interpretability and interpretability 😅 But afaik a little multicollinearity never hurt anyone 😎
Some ideas for expanding the `demean` function into a more general `degroup` (or `decenter`, or ??) function (listed by ease of implementation as I perceive it):

1. `median()`, `min()`, `max()`, also `Mode()` is popular for categorical predictors.
2. Allow for more than 1 grouping var. Order of operations would be: split `y` by `G1`, then split `y_between` by `G2`, etc... (Would need a better naming scheme?)
3. Center around an indexed value. For example, center `y` around `y[time==0]`, or `y[condition=="a"]`. Can be mixed with (1): `max(y[time==0])`, etc.
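The three ideas above could be sketched roughly as follows, again in Python for illustration only. The function name `degroup`, its `center` and `select` parameters, and the toy data are assumptions made for this sketch, not an existing API. `center` swaps in another centrality function (idea 1), `select` restricts which observations define each group's reference value (idea 3), and applying the function a second time to the between part handles a second grouping variable (idea 2, under the reading that `G2` is the higher level).

```python
from statistics import mean, median


def degroup(y, group, center=mean, select=None):
    """Hypothetical generalisation of demeaning.

    center: function that reduces a list of values to one reference value.
    select: optional list of booleans; when given, only the flagged
            observations are used to compute each group's reference value.
            A group with no flagged observation makes `center` raise.
    """
    if select is None:
        select = [True] * len(y)
    pools = {}
    for value, g, keep in zip(y, group, select):
        pools.setdefault(g, [])
        if keep:
            pools[g].append(value)
    refs = {g: center(values) for g, values in pools.items()}
    between = [refs[g] for g in group]
    within = [value - refs[g] for value, g in zip(y, group)]
    return between, within


# Toy data (made up): four persons in two schools, measured at time 0 and 1.
y = [1.0, 3.0, 5.0, 7.0, 10.0, 14.0, 20.0, 22.0]
time = [0, 1, 0, 1, 0, 1, 0, 1]
person = ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4"]
school = ["s1", "s1", "s1", "s1", "s2", "s2", "s2", "s2"]

# Idea 1: another centrality function.
median_between, median_within = degroup(y, person, center=median)

# Idea 3 mixed with idea 1: center around max(y[time == 0]) per person.
baseline = [t == 0 for t in time]
base_between, base_within = degroup(y, person, center=max, select=baseline)

# Idea 2: split y by person, then split the person-level part by school.
person_between, person_within = degroup(y, person)
school_between, person_in_school = degroup(person_between, school)
```

In the nested case the three parts add back up to `y`, that is `school_between + person_in_school + person_within`, which is one way to check that the order of operations is applied consistently. Categorical predictors with a mode are left out here, since subtraction is not defined for them.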