easystats / report

:scroll: :tada: Automated reporting of objects in R
https://easystats.github.io/report/

Drop interpretation of std slopes / parameters #97

Open mattansb opened 3 years ago

mattansb commented 3 years ago

From the README:

## We fitted a linear model (estimated using OLS) to predict Sepal.Length with Species (formula =
## Sepal.Length ~ Species). Standardized parameters were obtained by fitting the model on a
## standardized version of the dataset. Effect sizes were labelled following Funder's (2019)
## recommendations.

However, Funder's (2019) guidelines are rules of thumb for r and should not be used for interpreting standardized beta values (see discussion here: https://github.com/easystats/effectsize/issues/127).

strengejacke commented 3 years ago

@DominiqueMakowski That's your issue :-)

strengejacke commented 3 years ago

I'm not sure how psychologists name it, but I'm used to using "b" to refer to unstandardized and "beta" to refer to standardized coefficients.

mattansb commented 3 years ago

> I'm not sure how psychologists name it, but I'm used to using "b" to refer to unstandardized and "beta" to refer to standardized coefficients.

Same (:

strengejacke commented 3 years ago

So we might think of changing:

The effect of (Intercept) is positive and can be considered as very large and significant (beta = 5.01, SE = 0.07, 95% CI [4.86, 5.15], std. beta = -1.01, p < .001).

into

The effect of (Intercept) is positive and can be considered as very large and significant (b = 5.01, SE = 0.07, 95% CI [4.86, 5.15], std. beta = -1.01, p < .001).

?

DominiqueMakowski commented 3 years ago

> but I'm used to using "b" to refer to unstandardized and "beta" to refer to standardized coefficients.

Though that's a very confusing distinction, and I never understood where it came from: "b" is literally the poor man's ASCII way of representing the beta symbol, which stands for "beta", which stands for the coefficient of the equation. And standardized betas are, well, std. beta ☺️

So I'd agree to change beta to b (implicitly being the beta symbol, which we could maybe swap in for latex/md outputs), but then we also use it for std. b so that it's consistent. What do you say?

strengejacke commented 3 years ago

I'm not sure if there's a general agreement on this. Maybe "b" and "std. beta"?

mattansb commented 3 years ago

That seems like a good compromise that leaves no ambiguity

mattansb commented 3 years ago

To reiterate: this is not an issue with Funder's rules, it is a problem with the interpretation of std slopes / parameters in general. There are no rules of thumb because the value of a std beta depends not only on the partial correlation between x_i and y, but also on the multicollinearity between the Xs, the relationship between the other predictors and Y, etc.
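
For the simplest case of two standardized predictors, the dependence described above can be written out explicitly. This is only a sketch using the textbook two-predictor formulas; the symbols r_{y1}, r_{y2} (zero-order correlations of each predictor with y), r_{12} (correlation between the predictors) and r_{y1·2} (partial correlation) are notation introduced here, not taken from the thread.

```latex
\beta_1 = \frac{r_{y1} - r_{y2}\, r_{12}}{1 - r_{12}^2},
\qquad
r_{y1 \cdot 2} = \frac{r_{y1} - r_{y2}\, r_{12}}{\sqrt{(1 - r_{y2}^2)(1 - r_{12}^2)}}
\quad\Longrightarrow\quad
\beta_1 = r_{y1 \cdot 2}\, \sqrt{\frac{1 - r_{y2}^2}{1 - r_{12}^2}}
```

The same partial correlation therefore maps onto different standardized slopes depending on r_{12} and r_{y2}. For instance, with r_{y2} = 0 and r_{12} = 0.8 the scaling factor is 1/0.6 ≈ 1.67, so a threshold calibrated for r does not carry over to the std beta.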

However... if we want we can look at the partial effect sizes, e.g., t_to_eta2, t_to_d or t_to_r, and interpret these.
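
As a rough illustration of what such conversions compute, the relations below are the standard ones for a single-df t statistic, and to my understanding they match what effectsize documents for these functions. Here df_error denotes the residual degrees of freedom, and the d conversion is an approximation that assumes two equally sized groups.

```latex
r_{\text{partial}} = \frac{t}{\sqrt{t^2 + df_{\text{error}}}},
\qquad
d \approx \frac{2\,t}{\sqrt{df_{\text{error}}}},
\qquad
\eta_p^2 = \frac{t^2}{t^2 + df_{\text{error}}}
```

Because these quantities are built from the test statistic rather than from the slope itself, they are partial effect sizes on a common scale, which is what makes them candidates for rule-of-thumb interpretation.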

DominiqueMakowski commented 3 years ago

> That seems like a good compromise that leaves no ambiguity

Regarding the text, I very strongly vote for either "b" and "std. b" or "beta" and "std. beta". Again, both refer to the beta symbol, which is the convention for regression coefficients. Or we could go with "coefficient" and "std. coefficient", but that's long, and APA suggests replacing these with beta symbols anyway, which is what b or beta would refer to.

Now, regarding the automatic interpretation of these coefs... Well, it's been my secret desire and goal for a long time, hence the effectsize "standardization of standardized indices" monstrosity. But I agree that there is currently no clean solution, so we might want to drop the interpretation altogether for now. Or we could add the interpretation only for non-interaction terms (i.e., interpret the coefs either as a partial correlation for continuous predictors or as a standardized difference for differences between factor levels), where it is more straightforward.

As for the conversion to partial effect sizes, it did seem like a good avenue, but what about Bayesian models...

mattansb commented 3 years ago

> As for the conversion to partial effect sizes, it did seem like a good avenue, but what about Bayesian models...

We do have eta_squared_posterior(), which can return partial eta squared for linear (non-mixed) models (for mixed models there is only non-partialled eta squared for now).

bwiernik commented 3 years ago

> Regarding the text, I very strongly vote for either "b" and "std. b" or "beta" and "std. beta". Again, both refer to the beta symbol, which is the convention for regression coefficients. Or we could go with "coefficient" and "std. coefficient", but that's long, and APA suggests replacing these with beta symbols anyway, which is what b or beta would refer to.

My strong personal preference is to use "β" and "std. β" for the coefficients. In my experience, the "b" versus "β" distinction mostly occurs only in Gaussian linear regression (e.g., most presentations of GLMs use β throughout), and many classic regression texts use different notation (e.g., Cohen et al. used "b" and "B"). It's inconsistent enough that folks really shouldn't rely on a "b = unstandardized, β = standardized" heuristic.