easystats / effectsize

:dragon: Compute and work with indices of effect size and standardized parameters
https://easystats.github.io/effectsize/

t_to_eta2 vs. t_to_eta2_partial #17

Closed DominiqueMakowski closed 4 years ago

DominiqueMakowski commented 5 years ago

I am coming back to this because I am still not sure it is the best way to go in terms of clarity.

@mattansb (if I'm understanding you correctly) I understand the importance of stressing that these are partial indices in the sense that they are adjusted for everything else in the model. However, this is very common in all models, i.e., you know that all parameters of a linear regression are "partial" in that sense. Hence it seems unnecessary to add it to all the names, also because the non-partial version does not exist for these functions, so no confusion is possible. In other words, IMO, it is sufficient to mention that partial aspect in the documentation since it applies to all functions.

Another bigger issue is that this "partial" could be confused with Partial eta2, partial Omega 2 as they are defined in the ANOVA framework, i.e. as adjusted for the sample size. Thus, if I'm correct that these two partial refer to different things, we should remove the former.

Another possibility could be to group them all under a function like:

F_to_r2(x, method = c("eta2", "omega2", "epsilon2"))

@strengejacke

mattansb commented 5 years ago

Another bigger issue is that this "partial" could be confused with Partial eta2, partial Omega 2 as they are defined in the ANOVA framework, i.e. as adjusted for the sample size. Thus, if I'm correct that these two partial refer to different things, we should remove the former.

These two partials are actually the same thing: they refer to indices of variance explained when partialling out any other modeled (random or fixed) effects (not adjusting for sample size). So when

F = (SSeffect / dfeffect) / (SSerror / dferror)

the corresponding partial eta squared is

Partial Eta Squared = SSeffect / (SSeffect + SSerror)

Here we only look at the variance of the effect and the variance of that same effect's error, ignoring any other effects (fixed or random).
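The algebra behind that shortcut can be sketched in base R. This is only an illustration: `f_to_eta2p` is a hypothetical helper written for this note, not the effectsize implementation, and the sums of squares are made-up numbers.

```r
# Hypothetical helper for illustration; not the effectsize implementation.
f_to_eta2p <- function(f, df, df_error) {
  # From F = (SS_effect / df) / (SS_error / df_error) it follows that
  # SS_effect / SS_error = f * df / df_error, hence:
  (f * df) / (f * df + df_error)
}

# Made-up sums of squares and degrees of freedom
ss_effect <- 30
ss_error <- 70
df <- 2
df_error <- 20

f <- (ss_effect / df) / (ss_error / df_error)

# Partial eta squared computed directly and recovered from F alone
eta2p_direct <- ss_effect / (ss_effect + ss_error)
eta2p_from_f <- f_to_eta2p(f, df, df_error)
all.equal(eta2p_direct, eta2p_from_f)
```

Only F and its two degrees of freedom are needed, which is why the conversion can recover the partial index but nothing that depends on the total sum of squares.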

The function F_to_partial_eta_squared is just a convenient shortcut for the "real thing". For example, we can compare a "true" partial eta squared (from the afex package) to our "cheat":

library(effectsize)
library(afex)
data(md_12.1)
fit <- aov_ez("id", "rt", md_12.1,
              within = c("angle", "noise"),
              anova_table = list(correction = "none", es = "pes")
)

# Compare...
fit$anova_table$pes # A true effect size
#> [1] 0.8189831 0.7895522 0.8342857
with(fit$anova_table, { # our cheat
  F_to_partial_eta_squared(`F`, `num Df`, `den Df`)
})
#> [1] 0.8189831 0.7895522 0.8342857

Created on 2019-10-29 by the reprex package (v0.3.0)

All this to say.... I think "partial" should be in the function name to distinguish it from other non-partialled effect sizes.

How about this: F_to_partial_r2(f, df, df_error, method = c("eta2", "omega2", "epsilon2", "adj_eta2"))

DominiqueMakowski commented 5 years ago

But is it possible to have F_to_r2() (i.e., a non-partial R2?). I mean the fact that the explained variance is partial is a property of the model that is reflected by the provided F value, the conversion function doesn't really partialize anything in itself does it?

DominiqueMakowski commented 5 years ago

Another bigger issue is that this "partial" could be confused with Partial eta2, partial Omega 2 as they are defined in the ANOVA framework, i.e. as adjusted for the sample size.

Maybe I confused it with "adjusted", although we say @param partial If \code{TRUE}, return partial indices (adjusted for sample size).

mattansb commented 5 years ago

But is it possible to have F_to_r2() (i.e., a non-partial R2?). I mean the fact that the explained variance is partial is a property of the model that is reflected by the provided F value, the conversion function doesn't really partialize anything in itself does it?

Technically yes - if the F is of the full model against the empty model, this will be a non-partialled R2... but how often does that happen? It is waayyyyyyyy more common to have F of comparing nested models, or Fs of testing effects, or contrasts.... If anything, the fact that it can, in one special case, return a non-partial R2 should be in the documentation.
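That special case can be checked with a short base-R sketch. It assumes the built-in `mtcars` data and an arbitrary two-predictor model chosen purely for illustration: the overall F of the full model against the empty model converts back to the ordinary R2.

```r
# Arbitrary example model on a built-in dataset (illustration only)
fit <- lm(mpg ~ wt + hp, data = mtcars)
s <- summary(fit)

# Overall F-test of the full model vs. the intercept-only model:
# a named vector with elements "value", "numdf" and "dendf"
f <- s$fstatistic

# Same conversion as for a partial index: F * df / (F * df + df_error)
r2_from_f <- unname(
  f["value"] * f["numdf"] / (f["value"] * f["numdf"] + f["dendf"])
)

# Agrees with the model's (non-partial) R2
all.equal(r2_from_f, s$r.squared)
```

For any other F (a single term, a contrast, a nested-model comparison) the same arithmetic yields a partial index instead.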

Maybe I confused it with "adjusted", although we say @param partial If \code{TRUE}, return partial indices (adjusted for sample size).

Then this description is wrong 😅 There is a "generalized" eta squared that does do something with the total sample, but not sure what that is + never seen it outside a stats class + no "cheat" way to get it 😁

DominiqueMakowski commented 5 years ago

Technically yes - if the F is of the full model against the empty model, this will be a non-partialled R2...

My point exactly, so since it's always partial (except in the special case where partial = non-partial), and since the fact that the explained variance of a given predictor only reflects that of that predictor is a highly expected and normal thing, I see it as more natural to name it F_to_eta2, and to say in the documentation that it is obviously a partial index of variance explained (aside from particular circumstances where partial and non-partial are the same).

Then this description is wrong 😅

Needs to be changed then 😅

mattansb commented 5 years ago

Ah, I see what you mean.... I worry though that some might think that these are some magic functions that give non-partial indices :/

Is it the long name that bothers you? We can do F_to_eta2p, which is how it appears in pubs anyway, and the p will tell the informed user that it's partial, and will be a red flag for the uninformed prior user.

DominiqueMakowski commented 5 years ago

The most common use case for this function will arguably be to provide the F/t and df_error columns from the parameters of a model (be it an ANOVA, LM or whatever) to get the eta2 for the parameters. If so, and if I understand you, you're suggesting that people might confuse these r²-like values with the r² of the relationship between the parameters and the outcome that would be obtained outside of the model?

mattansb commented 5 years ago

Yeah, basically.

I mean, there is a measure called R2, and eta2, etc... I would like the function name to reflect the fact that this is not that. (As eta2p >= eta2, having users mistake these would be a problem in their interpretation of their effect sizes...)

DominiqueMakowski commented 5 years ago

Mmh I see. Well, I still think that it's a 'hella verbose way to remind users of a fact expected by most :) IMO the documentation is sufficient (I like to think that most users actually do read the documentation ^^), we need a 3rd opinion, 🦇 🚨 turning on the strenge-signal to call for @strengejacke

mattansb commented 5 years ago

Oh no, not him... 😉


strengejacke commented 5 years ago

As you know, I don't use anova and have nothing to do with those effect sizes... So what's actually the point on which you'd like to have a 3rd opinion?

If...

I see it as more natural to name it F_to_eta2

I would say...

image

DominiqueMakowski commented 5 years ago

It's an issue beyond ANOVAs!

-> do you think that the fact that the explained variance returned by these functions is related to that of a parameter adjusted for the other parameters in the model (similarly to any other index related to parameters of a model with more than one parameter) is unexpected or confusing enough to add it in the function name?

mattansb commented 5 years ago

add it in the function name = add the smallest p to the end of it.

strengejacke commented 5 years ago

a parameter adjusted for the other parameters in the model

But isn't this in general the case, for any parameter-related value (p, test-statistic, se, ...)?

DominiqueMakowski commented 5 years ago

That's my perception of it too, hence I am not sure why Master Mattakin is so keen on having it in the name

mattansb commented 5 years ago

Because there is another index called eta2... If a user asks for F_to_eta2, shouldn't he get an eta2??? (in fact, the eta_squared function should have partial set to FALSE by default!)

DominiqueMakowski commented 5 years ago

Because in F_to_eta2, the fact of being "partial" is a property of F (itself a property of the parameter of a model which is by nature partial). Thus, converting F to eta2 is partial because of the provided input, not because of the computation happening in the conversion function

(nobody can say that we do not discuss in depth the naming here in easystats 😁)

strengejacke commented 5 years ago

The difference between eta2 and partial eta2 is just that the denominator for eta2 is the total sum of squares, while for partial eta2 it is the parameter related ss + residual ss. Hence, is it possible to convert F to both eta2 and partial eta2? If so, I would name it F_to_eta2 and add a partial-argument.
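The two denominators can be compared side by side in base R. This is a sketch under stated assumptions: it uses the built-in `mtcars` data, an arbitrary two-factor model, and the sequential (Type I) sums of squares that `anova()` reports for an `lm` fit, so that the effect and residual sums of squares add up to the total.

```r
# Arbitrary two-factor model on a built-in dataset (illustration only)
tab <- anova(lm(mpg ~ factor(cyl) + factor(am), data = mtcars))

# The last row of the table holds the residuals
n_rows <- nrow(tab)
ss_effect <- tab[["Sum Sq"]][-n_rows]
ss_resid <- tab[["Sum Sq"]][n_rows]
ss_total <- sum(tab[["Sum Sq"]])

# eta2: denominator is the total sum of squares
eta2 <- ss_effect / ss_total

# partial eta2: denominator is effect SS + residual SS
eta2_partial <- ss_effect / (ss_effect + ss_resid)

# Converting each term's F with its degrees of freedom
f <- tab[["F value"]][-n_rows]
df <- tab[["Df"]][-n_rows]
df_resid <- tab[["Df"]][n_rows]
eta2_from_f <- f * df / (f * df + df_resid)

data.frame(eta2, eta2_partial, eta2_from_f)
```

In this sketch the value recovered from each F matches the partial column, and the partial index is never smaller than eta2 because its denominator omits the other effects' sums of squares.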

mattansb commented 5 years ago

Because in F_to_eta2, the fact of being "partial" is a property of F

If you assume users to know this, you have better faith than me 😅

is it possible to convert F to both eta2 and partial eta2?

No, a single F can only give partial eta2...

Okay, I get it... I lost this one... 😭

I will change the function names accordingly... F_to_eta2, etc...

DominiqueMakowski commented 5 years ago

*Daniel discreetly stores Dom's 10 dollars bill into his pocket*

strengejacke commented 5 years ago

@mattansb

giphy

mattansb commented 5 years ago

That's just mean 😅

DominiqueMakowski commented 5 years ago

I'd even say it's demean

mattansb commented 5 years ago

image

mattansb commented 4 years ago

@DominiqueMakowski / @strengejacke Recently a colleague was telling me about their usage of effectsize and the F_to_* functions. Unfortunately, they were under the impression that these functions are for eta squared; when I told them they were for partial eta squared, they asked "why isn't that somewhere in the function name", and "but partial eta is its own thing".

I know N=1, but see above for my strong priors of this being a problem.

DominiqueMakowski commented 4 years ago

since I have the opposite strong prior, this N=1 changes nuffin 😈

jk, okay for the bold, but the capitalized is uglyyyy

mattansb commented 4 years ago

How about F_to_eta2 -> F_to_eta2p? 🙏

DominiqueMakowski commented 4 years ago

mattansb commented 4 years ago

https://twitter.com/mattansb/status/1202204182264066048 ....

DominiqueMakowski commented 4 years ago

the fact that it's partial is a property of the parameter and not of the function. In a multiple regression model all the indices are "partial" in the sense that they are only related to a given parameter and "adjusted" for the others. And we wouldn't create a "partial_pvalue" function?

mattansb commented 4 years ago

If there was a non-partial p-value, then yes!

DominiqueMakowski commented 4 years ago

mattansb commented 4 years ago

In Hebrew there is a saying: "Don't block the path of a blind man"... You assume too much of the users here - that they will either know the properties of F, or that they will actually read the documentation (or that they will not forget after using it once what it means when they copy-and-paste to re-use their code)...

strengejacke commented 4 years ago

Fixed:

library(effectsize)
F_to_eta2(16.501, 1, 9)
#> 
#> 
#> CAUTION!!! THIS FUNCTION RETURNS THE *P*A*R*T*I*A*L* ETA-SQUARED!!11!
#> [1] 0.6470727

Created on 2019-12-04 by the reprex package (v0.3.0)

mattansb commented 4 years ago

I think adding a p to the function name would suffice 😂

DominiqueMakowski commented 4 years ago

and for a simple regression model with one parameter this function technically computes... a non-partial eta2. Again, because whether it is partial or not doesn't have anything to do with the function, and the function will return the same output for a given F whether it is a partial F or not. Hence, there is no sense in naming a function after something irrelevant to what it does
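The one-parameter case can be sketched in base R, again assuming the built-in `mtcars` data and an arbitrary predictor. With a single slope, t^2 equals the model's F, and the converted value equals the squared correlation between predictor and outcome.

```r
# Simple regression with a single predictor (illustration only)
fit <- lm(mpg ~ wt, data = mtcars)

# t statistic of the slope and the residual degrees of freedom
t_val <- coef(summary(fit))["wt", "t value"]
df_error <- df.residual(fit)

# t-to-eta2 conversion: t^2 / (t^2 + df_error)
eta2 <- t_val^2 / (t_val^2 + df_error)

# With nothing else in the model to partial out, this is just r^2
all.equal(eta2, cor(mtcars$mpg, mtcars$wt)^2)
```

The conversion itself is unchanged here; the result is non-partial only because the model has no other terms.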

In Hebrew there is a saying: "Don't block the path of a blind man"...

You assume too much of the users here - that they will not read the documentation because they believe they already know. But actually many people do read 😉

mattansb commented 4 years ago

It amazes me that you two, who always want clear, intuitive function names, are so resistant to a simple little p that would add clarity to the function's result...

Is it an easystats guideline that function names and outputs should assume users have read the docs?

Why have distribution_chisquared? Even if there is chi (not-squared) distribution, isn't the "chi" in practice always squared? Why not have distribution_chi? Perhaps it should just be F_to_eta?

DominiqueMakowski commented 4 years ago

Why have distribution_chisquared? Even if there is chi (not-squared) distribution, isn't the "chi" in practice always squared? Why not have distribution_chi? Perhaps it should just be F_to_eta?

😅 😅

even eta is an arbitrary name though, only used because of conventions that we must overcome. I suggest something_to_something

strengejacke commented 4 years ago

We could simply completely remove that function?

DominiqueMakowski commented 4 years ago

and adding this little p unnecessarily (coz it's not about what the function does) breaks the a_to_b syntax. We cannot be the ever-present guardians of good practices; if a user misuses that function and thinks that he magically obtained some index related to something other than its parameter, well too bad!

DominiqueMakowski commented 4 years ago

Daniel about the function:

mattansb commented 4 years ago

We cannot be the ever-present guardians of good practices; if a user misuses that function and thinks that he magically obtained some index related to something other than its parameter, well too bad!

You are technically correct. But it is still my opinion that a little p will do much more good than harm (what harm does it do?)

mattansb commented 4 years ago

wait - @DominiqueMakowski @strengejacke are you so Bayesian and anti p-value that you won't add the p to the function name??

What next?? will parameters become arameters??

DominiqueMakowski commented 4 years ago

what harm does it do?

it breaks the beauty of the a_to_b syntax, and since easystats is beauty, you wouldn't wanna break easystats would you?

DominiqueMakowski commented 4 years ago

What next?? will parameters become arameters??

You'll notice that there is no p in Dominique Makowski, Daniel Lüdecke and Mattan Ben-Sachar...

mattansb commented 4 years ago

it breaks the beauty of the a_to_b syntax

Not sure how F_to_eta2p breaks this?

You'll notice that there is no p in Dominique Makowski, Daniel Lüdecke and Mattan Ben-Sachar...

image

strengejacke commented 4 years ago

F_to_eta2_partial() would be the name according to our naming convention, if I interpret our rules right (and if we want to denote the "partial" property in the function name).

Since I will probably never convert an F into an eta2 or any other greek letter, I'm actually not so keen about this function name...

DominiqueMakowski commented 4 years ago

psss, Daniel, could you furtively add to our conventions "the name of a function should be related to what the function does"

mattansb commented 4 years ago

Does the function not compute a partial eta square? pls explain

strengejacke commented 4 years ago

We could mention an unwritten rule that conversion functions are only allowed to have one pronounceable Greek letter before and after "to"