[Feature Request] Increase parallels between `tidy.brmsfit` and `tidy.stanreg`?

emstruong commented 1 week ago

It would be a really nice and appreciated feature if the tidy.brmsfit and tidy.stanreg functions had overlapping arguments and functionality.

Some notable differences are:

tidy.brmsfit uses non-robust estimates by default, whereas tidy.stanreg uses the robust estimate
tidy.stanreg doesn't seem to allow the extraction of rhat and ess for each parameter, whereas tidy.brmsfit does
tidy.brmsfit reports the standard errors and confidence intervals for all parameters. Yet for tidy.stanreg, they seem to be missing for random effects for some reason that I haven't been able to figure out.
- I'm going to guess that this is more of a rstanarm problem with what it's storing in ses
tidy.brmsfit's documentation says that you can get "ran_vals", but the default-arguments/documentation of usage implies that it's only fixed or random parameters

Here is a reprex of the differences

library(rstanarm)
library(brms)
library(broom.mixed)
fit <- stan_glmer(
  mpg ~ wt + (1 | cyl) + (1 + wt | gear),
  data = mtcars,
  iter = 500,
  chains = 2
)

tidy(fit)
#> # A tibble: 7 × 4
#>   term                    estimate std.error group   
#>   <chr>                      <dbl>     <dbl> <chr>   
#> 1 (Intercept)               32.7        3.83 <NA>    
#> 2 wt                        -3.98       1.09 <NA>    
#> 3 sd_(Intercept).cyl         3.22      NA    cyl     
#> 4 sd_(Intercept).gear        2.11      NA    gear    
#> 5 sd_wt.gear                 1.06      NA    gear    
#> 6 cor_(Intercept).wt.gear   -0.296     NA    gear    
#> 7 sd_Observation.Residual    2.64      NA    Residual

brm_fit <- brm(
  mpg ~ wt + (1 | cyl) + (1 + wt | gear),
  data = mtcars,
  iter = 500,
  chains = 2,
  silent = 2
)

tidy(brm_fit)
#> # A tibble: 7 × 8
#>   effect   component group    term         estimate std.error conf.low conf.high
#>   <chr>    <chr>     <chr>    <chr>           <dbl>     <dbl>    <dbl>     <dbl>
#> 1 fixed    cond      <NA>     (Intercept)    32.5       4.96   22.1       43.0  
#> 2 fixed    cond      <NA>     wt             -4.20      1.47   -7.56      -1.32 
#> 3 ran_pars cond      cyl      sd__(Interc…    3.53      2.62    0.160     10.1  
#> 4 ran_pars cond      gear     sd__(Interc…    3.94      3.16    0.0728    11.9  
#> 5 ran_pars cond      gear     sd__wt          1.61      1.29    0.0728     4.92 
#> 6 ran_pars cond      gear     cor__(Inter…   -0.460     0.491  -0.994      0.702
#> 7 ran_pars cond      Residual sd__Observa…    2.58      0.362   1.99       3.39

^{Created on 2024-11-15 with reprex v2.1.1}

bbolker commented 1 week ago

This seems like a perfectly reasonable (although breaking backward compatibility). For the places where one platform's tidy-output doesn't obviously dominate the other (e.g. we presumably want to have rhat and ess available for both platforms, not neither ...), which way do you suggest we make them consistent? e.g. should both platforms return robust or non-robust summaries by default?

Any chance of a pull request ... ??

emstruong commented 1 week ago

I would normally agree to do a pull-request, but I am unfortunately far past my max bandwidth and really shouldn't agree to anything. Sorry.

My memory is that rstanarm and brms are not entirely consistent in whether the reported parameter estimates (summary(mod)) are robust or not. Although rstanarm prints the robust estimates (print(mod)), the summary of the rstanarm model seems to be the non-robust estimates. Whereas brms just has the non-robust estimates.

In my opinion, tidy should be consistent with the original package's intent, so then I think we should keep tidy.brms to non-robust. It may be better to ask the rstanarm people what they really want.

Although, if it were up to me, they would both default to the robust estimates. For context, I'm developing a Monte Carlo sim comparing rstanarm/brms to lmer--when used with as many default settings as possible--and what I found is that the non-robust estimates can sometimes be beyond completely absurd.

If I were to request an early christmas gift, we'd also get the bulk and tail ESS of the parameter estimates.

emstruong commented 1 week ago

Though... If I were to make a pull-request should I adjust the rstanarm tidy function to still depend on fit$ses or can I just manually compute all the ses myself? It seems like rstanarm will take a while to implement their fix...

emstruong commented 1 week ago

So I don't know enough about HMC or Bayes to know if this is a problem, but broom.mixed:::tidy.rstanarm() uses rstanarm::VarCorr() to get the estimates of the random parameters. But rstanarm::VarCorr() uses colMeans() of each of the ^Sigma columns. I would've thought that a robust estimate would use the median of the column as opposed to the mean of the column.

EDIT: Given that tidy(fit, robust = TRUE) and tidy(fit, robust = FALSE) can vary for brms models for the random parameters, I'm going to assume that the use of colMeans() versus the median matters. What slightly confuses me is that tidy.brmsfit seems to simply apply the median or mean to the draws, whereas tidy.rstanarm gets the VarCorr.

I assume that this is due to differences in the parameterization of the random parameter between brms and rstanarm?

bbolker commented 1 week ago

I wouldn't necessarily assume that. Haven't dug into this/thought about this carefully, but a lot of the machinery of these two methods may have been contributed by others/stolen from other places, so inconsistencies might be entirely accidental ... FWIW brms:::VarCorr.brmsfit has a robust argument, so tidy.brmsfit could (and should probably) use it instead of working with the draws directly. But that doesn't help with tidy.rstanarm - you could include a wish for a robust option in your existing open rstanarm issue ...

emstruong commented 1 week ago

Oh I think there may be a mis-understanding, maybe, what I'm guessing is happening is that in brms, the stan code is such that you get the draws of the standard deviation of the random effects directly. However, in rstanarm, the draws are some pre-cursor to the standard deviation of the random effects. That's what I mean by differences in parameterization.

Regardless, I understand wanting to work with brms::::VarCorr.brmsfit instead of directly with the draws, but I'm getting the feeling that it may be better to work directly with whatever output the package itself is providing, as opposed to computing it ourselves or even using other nominally-related packages. When I tried getting the tail_ess from the draws using posterior::ess_tail() , it gave me a slightly different number than the tail_ess of the brms summary output.

So I'm confused...

bbolker commented 1 week ago

I absolutely agree that using package-specific accessors wherever possible is the best practice. This is why it might be nice to request a robust option for rstanarm::VarCorr... it would seem easy enough to add by adding the argument and changing one line of code accordingly, i.e. something like

sumfun <- if (robust) median else mean
scols <- grepl("^Sigma\\[", colnames(mat))
Sigma <- apply(mat[,scols, drop = FALSE], 1, sumfun)

There would be a tiny performance cost using apply() rather than colMeans for non-robust estimates - could also do

scols <- grepl("^Sigma\\[", colnames(mat))
if (robust) {
   Sigma <- apply(mat[,scols, drop = FALSE], 1, median)
} else {
   Sigma <- colMeans(mat[,scols, drop = FALSE])
}

if preferred ...

emstruong commented 1 week ago

Hold on now though, but doesn't your one-liner require that we apply it on the draws? I thought you said you were against applying it on the draws. :sweat_smile:

Well either way, the one-liner alone won't make rstanarm produce the standard errors for the random parameters. We'd probably need them to fix things up first.

bbolker commented 1 week ago

The code I showed is based on the rstanarm code (i.e., material for a pull request there), not the broom.mixed code ...

bbolker / broom.mixed

[Feature Request] Increase parallels between `tidy.brmsfit` and `tidy.stanreg`? #156