Expand MLM R2 measures

mattansb commented 4 years ago

Paper 1: https://doi.org/10.1037/met0000184
Paper 2: http://doi.org/10.1080/00273171.2019.1660605

These should expand on the existing marginal and conditional R2, as they break it down by level and source of variation. (That is why this is not for effectsize: these measures aren't split by effect or term, but by fixed/random/type, and so are model-wise effect sizes.)

(As implemented in r2mlm)

TarandeepKang commented 4 years ago

I was thinking of mentioning this to you all, but it appears that you beat me to it! :-) I should say I've never written a custom function before or had much experience with GitHub. But, if you all feel it would be helpful, I would be glad to try to help with the implementation!

TarandeepKang commented 4 years ago

I'm going to take that thumbs up as a "yes"? If so, how would you like me to get started? Any kind of pointers would be great!

I have done a lot of coding for analyses before, but never added any functionality to a package. I am brand-new at this; you have been warned! :-)

DominiqueMakowski commented 3 years ago

Hey @TarandeepKang, sorry, it seems like we forgot to follow up on this; as you know, these last few months have been quite bumpy. If you're still interested, I would suggest starting by taking a look at r2mlm to understand the two-step process (1. get all the necessary ingredients from the model; 2. throw them into r2mlm_manual() and let the magic happen). Then, try rewriting step 1 as a generic function using the power of the insight package, which should make it easier to retrieve all of the ingredients. After that, we can try to understand step 2 and see if there's a need to reimplement or rewrite it to accommodate more models.

DominiqueMakowski commented 3 years ago

I added here a working file in which I decompose the two steps, namely 1) extracting ingredients 2) calculating the indices.

So the first step, I think, is to revise the step 1 function, replacing all the internal functions with something more generalizable using insight.

DominiqueMakowski commented 3 years ago

So extracting the ingredients (step 1) is the part that we can really generalize / simplify so that it works for more models. Then, the output of this "init" function is passed to r2mlm_manual(), which does some heavy arithmetic with it.

I have isolated this "init" step that works for lme4 models:

Call https://github.com/easystats/performance/blob/89512a511fcee57a2960bb9e4c555c5afad4bb0b/WIP/r2mlm_test.R#L1-L13

Result


$Decompositions
                total              within             between
fixed, within   0.0819107586265126 0.142972810913675  NA
fixed, between  0                  NA                 0
slope variation 0.0377525833965782 0.0658960197411068 NA
mean variation  0.42708856248221   NA                 1
sigma2          0.453248095494699  0.791131169345218  NA

$R2s
    total              within             between
f1  0.0819107586265126 0.142972810913675  NA
f2  0                  NA                 0
v   0.0377525833965782 0.0658960197411068 NA
m   0.42708856248221   NA                 1
f   0.0819107586265126 NA                 NA
fv  0.119663342023091  0.208868830654782  NA
fvm 0.546751904505301  NA                 NA
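
A quick way to read this output: judging from the printed numbers, the composite rows look like plain sums of the basic components (f = f1 + f2, fv = f + v, fvm = fv + m), the "total" column sums to 1, and the "within" column looks like the within-cluster components rescaled to sum to 1. The sketch below only checks that arithmetic on the values copied from the result. It is written in Python purely as a calculator, it is not r2mlm code, and the names `combine`, `within_share` and `rescaled` are made up for illustration.

```python
import math

# Values copied from the "total" and "within" columns of the printed result.
total = {
    "f1": 0.0819107586265126,      # fixed, within
    "f2": 0.0,                     # fixed, between
    "v": 0.0377525833965782,       # slope variation
    "m": 0.42708856248221,         # mean variation
    "sigma2": 0.453248095494699,   # residual
}
within = {
    "f1": 0.142972810913675,
    "v": 0.0658960197411068,
    "sigma2": 0.791131169345218,
}


def combine(parts, keys):
    """Sum the named variance proportions (hypothetical helper)."""
    return sum(parts[k] for k in keys)


# Composite total measures, assumed to be simple sums of components.
r2_total = {
    "f": combine(total, ["f1", "f2"]),
    "fv": combine(total, ["f1", "f2", "v"]),
    "fvm": combine(total, ["f1", "f2", "v", "m"]),
}
r2_within_fv = combine(within, ["f1", "v"])

# Assumption: the within column is the total column divided by the share
# of variance that lies within clusters (f1 + v + sigma2).
within_share = combine(total, ["f1", "v", "sigma2"])
rescaled = {k: total[k] / within_share for k in within}

print("sum of total column:", sum(total.values()))
print("total f / fv / fvm:", r2_total)
print("within fv:", r2_within_fv)
print("rescaled within column:", rescaled)
print(
    "rescaling matches printed within column:",
    all(math.isclose(rescaled[k], within[k], rel_tol=1e-9) for k in within),
)
```

If these identities hold, the $R2s block adds no new information beyond $Decompositions; it just reports the same proportions under the short labels plus their cumulative sums.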



For now, it still relies on the tidyverse quite a lot (which we would need to get rid of if we were to implement this robust r2mlm initialization), and it doesn't work for nlme models because model.frame() returns something different. There are also still a few if/else switches between lme4 and nlme that insight didn't manage to remove: for instance, `get_parameters("random")` returns a list for lme4, but a data frame for nlme.

(@strengejacke looking at [init](https://github.com/easystats/performance/blob/master/WIP/r2mlm_init.R) and [utils](https://github.com/easystats/performance/blob/master/WIP/r2mlm_utils.R) I feel like there's quite a lot we can simplify, don't you think?)

TarandeepKang commented 3 years ago

As I said before, I am happy to help, just let me know what exactly you need doing. You suggested some steps for me above, but then it looks like you've already done them? :-)

DominiqueMakowski commented 3 years ago

Haha yeah, I gave it a quick go to prepare the ground. Essentially the objectives would be to 1) replace all the tidyverse functions, 2) fix it for nlme. Once this is done, we can add the step 2 part with the actual calculations.

mkshaw commented 3 years ago

I'd be happy to hop in here, as one of the original r2mlm authors! I'm a bit pressed for time in the next few weeks, but mid/late-April I can implement (1) and (2).

DominiqueMakowski commented 3 years ago

Hey @mkshaw, super glad you hopped in! I was about to tag you here very soon anyway, since yesterday I finally had the chance to take a look at this issue ☺️

As you know, we'd like to offer users access to this feature, and to evaluate the possibility of generalizing it so that it can accommodate other models like glmmTMB, gamm4 and... Bayesian models? (not sure if that's possible though 😬)

As far as I understand, r2mlm separates the process into two steps: preprocessing and computation. My initial idea was to implement a preprocessing pipeline that is compatible with easystats (i.e., it doesn't use tidyverse functions and uses insight to gather the model's info, to simplify and streamline the code). Hopefully, by doing this, support for models like glmmTMB would naturally emerge. Then, I planned to still call your package's r2mlm::r2mlm_manual() to do the magic.

For now this whole thing is WIP and more in the "evaluation" stage (to see whether we can bring something to the table), so feel free to let us know any thoughts, issues or ideas that you might have!

mkshaw commented 3 years ago

@DominiqueMakowski You understand the two-step process in r2mlm correctly! The plan to preprocess with easystats and then call r2mlm::r2mlm_manual() sounds good to me, and I'd love to see support for models like glmmTMB naturally emerge from that. Like I mentioned, I'd be happy to hop in and try to implement the easystats preprocessing in mid/late-April -- I've been looking for ways to familiarize myself with various easystats packages, and this seems like a good one!

easystats / performance

Expand MLM R2 measures #149