summ() and standardization in longitudinal models

jacob-long / jtools

Tools for summarizing/visualizing regressions and other helpful stuff

GNU General Public License v3.0

I've been using your amazing summ() function. The documentation notes that it supports merMod objects and that it can rescale using Gelman’s 2-SD standardization method (n.sd = 2).

I have two sets of questions:

1) I was wondering how summ() does the 2-SD rescaling. How does summ() treat time-invariant variables? What about binary variables (both those that are time-varying and those that are time-invariant)? My understanding (which could be wrong) is that binary variables shouldn't be rescaled. Apologies if you already mention this in documentation. I didn't find it.

2) I came across cautions about standardization in longitudinal models (link below). Standardization changes the distances between observations, and the multivariate distributions of cross-sectional and longitudinal data, in often undesirable ways. The article below recommends monotonic scale transformations to put items with different response scales on the same metric: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4569815/ So using summ() with 2-SD rescaling on an lmer() model looks problematic. My question: if you agree this is an issue, are there plans to add monotonic scale transformations to summ()?

I'm relatively new to longitudinal modeling, so I hope nothing I'm asking here is a bad question.

Thanks for your great packages!

Sincerely, Sam

Hi Sam, I'd say the best way to summarize the scaling is that it is completely ignorant of the longitudinal format. It figures out what data you gave to the model-fitting function and then goes column by column, calculating the mean and SD while treating every observation equally. In the terms of the linked article, as best I can tell, this package is doing "standardization across individuals across time points." I agree with the author of the linked article that basically all sorts of scaling are dangerous to the proper interpretation of longitudinal models.
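To make "column by column, treating every observation equally" concrete, here is a minimal sketch in plain Python. It is not the jtools code: the data, the `pooled_rescale` name, and the use of the sample SD (n - 1 denominator) are assumptions for illustration. The point is that the person and wave identifiers never enter the calculation.

```python
from statistics import mean, stdev

# Hypothetical long-format data: one row per person per wave.
rows = [
    {"id": 1, "wave": 1, "score": 10.0},
    {"id": 1, "wave": 2, "score": 14.0},
    {"id": 2, "wave": 1, "score": 20.0},
    {"id": 2, "wave": 2, "score": 28.0},
]


def pooled_rescale(values, n_sd=2):
    """Center on the pooled mean and divide by n_sd pooled SDs.

    Assumption: sample SD (n - 1 denominator). Every observation is
    treated alike, regardless of which person or wave it came from.
    """
    m = mean(values)
    s = stdev(values)
    return [(v - m) / (n_sd * s) for v in values]


# "id" and "wave" are ignored: the column is pooled across people and time.
scores = [r["score"] for r in rows]
scaled = pooled_rescale(scores)
```

After this kind of 2-SD rescaling the pooled column has mean 0 and SD 0.5, but within-person and between-wave differences are all expressed in the same pooled units, which is the behavior the linked article warns about.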

I do have an open issue about POMP scaling (#33) but have not yet implemented it. I'm not sure whether/how I'd add the option to summ() since I've already made the function so very complicated. I can say with some more confidence that it's unlikely I'll add anything to jtools that takes longitudinal/multilevel structure into account for an operation like variable rescaling.
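For reference, POMP ("percent of maximum possible") scaling can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the `pomp` function and its `scale_min`/`scale_max` parameters are hypothetical names, not part of jtools, and the bounds are taken to be the theoretical limits of the response scale rather than the observed minimum and maximum.

```python
def pomp(values, scale_min, scale_max):
    """Rescale values to 0-100 as a percent of the maximum possible score.

    Assumption: scale_min and scale_max are the theoretical bounds of the
    response scale (e.g. 1 and 5 for a 5-point item), not sample statistics.
    """
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    span = scale_max - scale_min
    return [100.0 * (v - scale_min) / span for v in values]


# Two hypothetical items with different response scales end up on one metric.
five_point = pomp([1, 3, 5], scale_min=1, scale_max=5)
eleven_point = pomp([0, 5, 10], scale_min=0, scale_max=10)
```

Because the transformation depends only on fixed scale bounds, it is linear and order-preserving and does not change from wave to wave or sample to sample, unlike division by a sample SD.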

summ() and standardization in longitudinal models #136