Weighted rvars

mjskay commented 10 months ago

Summary

This PR aims to address (at least part of) #184 by implementing weighted rvars.

Currently, rvars cannot carry weights; the only way to weight them is to put them in a draws_rvars object that also contains a ".log_weight" rvar holding the weights. This leads to counterintuitive behavior: for example, the default printed output of an rvar (mean and sd) uses the unweighted versions of those statistics.

This PR addresses that issue in the following ways:

- It stores rvar weights as a "log_weights" attribute directly on the rvar, just like the "nchains" attribute is used to store the chain count.
- draws_rvars objects no longer use a ".log_weight" variable to store weights; instead, weights are stored directly on each rvar they contain, and all rvars in a draws_rvars object are required to have the same weights (the same way "nchains" is handled).
- A new log_weights() function for draws and rvars is added, which is a lower-level version of weights(x, log = TRUE, normalize = FALSE) that just returns the log weights stored in the object without transformation. I initially did not have this, but found it greatly eased programming with weights.
- weight_draws(x, NULL) is now allowed as the canonical way to remove weights from a draws object, since remove_variables(x, ".log_weight") does not work on draws_rvars objects anymore.
- All summary functions for rvars have been updated to incorporate weights (with a couple of exceptions I haven't gotten to yet; see TODOs and Questions below).
- Since rvar internals are becoming (even more) complicated, I have added an "rvar Internals" section to ?rvar that hopefully will help in case others need to touch the code ;).
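
To make the distinction between stored log weights and normalized weights concrete, here is a minimal base-R sketch. It does not use posterior at all: the weight values and the helper name `log_sum_exp` are made up for illustration, and the mapping to `log_weights()` and `weights()` in the comments is an assumption based only on the description above.

```r
# Raw (unnormalized) importance weights; values are illustrative only.
w <- c(0.5, 2, 1, 4)

# Unnormalized log weights. Assumption: this is the kind of quantity the
# "log_weights" attribute holds and log_weights() returns untransformed.
lw <- log(w)

# Hypothetical helper: log(sum(exp(v))) computed stably.
log_sum_exp <- function(v) {
  m <- max(v)
  m + log(sum(exp(v - m)))
}

# Normalizing on the log scale, then returning to the natural scale.
lw_norm <- lw - log_sum_exp(lw)
w_norm <- exp(lw_norm)

print(w_norm)       # normalized weights
print(sum(w_norm))  # sums to 1 up to floating point error
```

Keeping the stored values unnormalized and on the log scale means normalization can be deferred to the point where a summary is actually computed.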

Demo

library(posterior)

set.seed(1234)
x = rvar(rnorm(1000))
x
#> rvar<1000>[1] mean ± sd:
#> [1] -0.027 ± 1

w1 = rexp(1000)
x1 = weight_draws(x, w1)
x1
#> weighted rvar<1000>[1] mean ± sd:
#> [1] -0.00087 ± 1

w2 = rexp(1000)
x2 = weight_draws(x, w2)
x2
#> weighted rvar<1000>[1] mean ± sd:
#> [1] -0.003 ± 0.96
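
For readers who want to see what a weighted "mean ± sd" summary involves, the following base-R sketch computes a weighted mean and one common form of weighted standard deviation on plain vectors. The helper names `weighted_mean` and `weighted_sd` are hypothetical, and the variance formula (no small-sample correction) is an assumption; the PR may use a different definition.

```r
set.seed(1234)
draws <- rnorm(1000)
w <- rexp(1000)

# Hypothetical helper: weighted mean, sum(w * x) / sum(w).
weighted_mean <- function(x, w) sum(w * x) / sum(w)

# Hypothetical helper: weighted sd without a small-sample correction.
# This is one of several definitions in use; it is an assumption here.
weighted_sd <- function(x, w) {
  m <- weighted_mean(x, w)
  sqrt(sum(w * (x - m)^2) / sum(w))
}

print(c(unweighted = mean(draws), weighted = weighted_mean(draws, w)))
print(c(unweighted = sd(draws), weighted = weighted_sd(draws, w)))
```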

You can't combine two rvars with different weights:

x1 + x2
#> Error: Random variables have different log weights and cannot be used together:
#> <dbl> 0.794199981930473, -1.61888922585584, 1.02558084358998, -0.657945687118312, 0.132635682996154 ...
#> <dbl> 0.661721670766407, -1.46589074644228, -1.39312536919089, 0.318133129307739, 0.66043235310858 ...

The check for equality of weights is done on the internal weights using identical(), which should be fast, especially in cases where the two weight vectors are actually pointers to the same vector in memory (in which case the comparison is constant time). This does mean the weight vectors must be exactly the same (no tolerance for floating point error), but I suspect in most cases when weighting happens the exact same weight vector is being applied to many objects. In any case, if someone did encounter this issue they could simply assign the log weights from one object to the other.
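
The practical consequence of an exact comparison can be illustrated in base R. This sketch only contrasts identical() with all.equal() on plain numeric vectors; the vectors stand in for log-weight vectors and are not taken from the PR.

```r
# Two vectors that are mathematically equal but differ in the last bit,
# because 0.1 + 0.2 is not exactly 0.3 in floating point.
lw1 <- c(0.1 + 0.2, 1, 2)
lw2 <- c(0.3, 1, 2)

# A copy of the same vector, as when one weight vector is reused.
lw_shared <- lw1

same_exact  <- identical(lw1, lw2)          # exact, no tolerance
same_tol    <- isTRUE(all.equal(lw1, lw2))  # tolerance-based
same_shared <- identical(lw1, lw_shared)

print(c(exact = same_exact, tolerant = same_tol, shared = same_shared))
```

Weights that were computed once and applied to several objects pass the exact check, while weights recomputed along slightly different numerical paths may not, which is the case where copying the log weights from one object to the other helps.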

If one rvar is weighted and another is not, the weights of the weighted rvar are inherited, which I believe covers the use case of (weighted draws from some model) + (unweighted draws, e.g. used to simulate predictions):

x1 + rvar(rnorm(1000, 1))
#> weighted rvar<1000>[1] mean ± sd:
#> [1] 0.96 ± 1.4

If you install the dev version of {ggdist}:

remotes::install_github("mjskay/ggdist")

It supports weighted rvars in all functions (densities, CDFs, quantiles, all interval types and all point summaries):

Without weights:

library(ggplot2)
library(ggdist)

set.seed(1234)
x = rvar(rnorm(10000, c(1,5)))

ggplot() + stat_slabinterval(aes(xdist = x))

With weights:

xw = weight_draws(x, rep(c(1,2), 5000))
ggplot() + stat_slabinterval(aes(xdist = xw))
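
As a rough check on what the weighted plot should show, this base-R sketch reproduces the same draws and weights on plain vectors. Both the means and the weights recycle in step, so draws centred at 5 get twice the weight of draws centred at 1. The variable names are illustrative only.

```r
set.seed(1234)
draws <- rnorm(10000, c(1, 5))  # means recycle: 1, 5, 1, 5, ...
w <- rep(c(1, 2), 5000)         # weights recycle: 1, 2, 1, 2, ...

unweighted_mean <- mean(draws)
weighted_mean <- sum(w * draws) / sum(w)

# Share of the total weight carried by the component centred at 5.
share_high <- sum(w[c(FALSE, TRUE)]) / sum(w)

print(c(unweighted = unweighted_mean, weighted = weighted_mean))
print(share_high)
```

In expectation the unweighted mean is 3 and the weighted mean is (1 * 1 + 2 * 5) / 3, so the weighted density should put about two thirds of its mass on the right-hand mode.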

Weights should work basically everywhere:

ggplot() + 
  stat_slabinterval(
    aes(xdist = xw), 
    point_interval = mode_hdi, 
    density = "histogram", 
    breaks = 50
  )

TODOs and Questions

TODOs:

- [x] I still have to implement density(<rvar>), cdf(<rvar>), and quantile(<rvar>) / quantile2(<rvar>). The first two are straightforward. For weighted quantiles, I have an implementation in {ggdist} that I can port over, but I may want to update it first; some thoughts on weighted quantiles are here and feedback is welcome. (In fact, my thinking has changed a bit since writing that document: I originally thought the approach to weighted quantiles suggested there was an improvement on ggdist's current implementation, but after further investigation I might be leaning back towards how I did it in ggdist originally.)
- [x] Mention weights in vignette("rvar").
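
Since weighted quantiles can be defined in several ways, a small sketch of the simplest definition may help frame the discussion. The function below inverts the weighted empirical CDF; the name `weighted_quantile` is hypothetical, and this is explicitly not the definition used by {ggdist} or by this PR, just a baseline for comparison.

```r
# Hypothetical helper: weighted quantile by inverting the weighted ECDF.
# Returns the smallest draw whose cumulative normalized weight reaches p.
weighted_quantile <- function(x, w, probs) {
  stopifnot(length(x) == length(w), all(w >= 0), sum(w) > 0)
  ord <- order(x)
  x_sorted <- x[ord]
  cum_w <- cumsum(w[ord]) / sum(w)
  cum_w[length(cum_w)] <- 1  # guard against rounding just below 1
  vapply(probs, function(p) x_sorted[which(cum_w >= p)[1]], numeric(1))
}

x <- c(10, 20, 30, 40)
print(weighted_quantile(x, rep(1, 4), c(0.25, 0.5, 1)))
print(weighted_quantile(x, c(1, 1, 1, 5), 0.5))
```

With equal weights this reduces to the inverse-ECDF quantile (type 1 in stats::quantile()); interpolating definitions differ in how they spread each draw's weight between neighbouring draws.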

Questions:

- I don't know if any of the other functions in R/convergence.R should be modified for weighted rvars. @avehtari?
- Have I missed anything else?

Would love for folks to kick the tires. I think once this is in we could also start thinking about what a successor to summarise_draws() might look like that supports weights (and solves the various other open issues on summarise_draws()).

Copyright and Licensing

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

codecov-commenter commented 10 months ago

Codecov Report

Attention: Patch coverage is 98.89299%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 95.80%. Comparing base (c312846) to head (88fa83d). Report is 12 commits behind head on master.

:exclamation: Current head 88fa83d differs from pull request most recent head 1079cef. Consider uploading reports for the commit 1079cef to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| R/rvar-.R | 94.87% | 2 Missing :warning: |
| R/weighted.R | 98.07% | 1 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #331      +/-   ##
==========================================
+ Coverage   95.31%   95.80%   +0.49%
==========================================
  Files          50       51       +1
  Lines        3840     3979     +139
==========================================
+ Hits         3660     3812     +152
+ Misses        180      167      -13
```

github-actions[bot] commented 10 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if bdef35c34867a28c7e956da43881bffde8bda583 is merged into master:

:rocket:as_draws_array: 105ms -> 104ms [-1.63%, -0.4%]
:exclamation::snail:as_draws_df: 33.7ms -> 94ms [+175.58%, +182.58%]
:rocket:as_draws_list: 211ms -> 204ms [-5.15%, -1.39%]
:ballot_box_with_check:as_draws_matrix: 31.8ms -> 31.4ms [-3.23%, +0.95%]
:ballot_box_with_check:as_draws_rvars: 171ms -> 171ms [-1.01%, +1.18%]
:ballot_box_with_check:summarise_draws_100_variables: 734ms -> 729ms [-2.86%, +1.39%]
:ballot_box_with_check:summarise_draws_10_variables: 80.8ms -> 80.7ms [-1.76%, +1.72%]

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 10 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 8c803b8d6eb92d3c521547cd90917d32fd9aa5de is merged into master:

:ballot_box_with_check:as_draws_array: 100ms -> 99.6ms [-1.8%, +0.21%]
:exclamation::snail:as_draws_df: 31.6ms -> 81ms [+152.76%, +160.78%]
:ballot_box_with_check:as_draws_list: 181ms -> 181ms [-1.9%, +2.49%]
:ballot_box_with_check:as_draws_matrix: 28.7ms -> 28.7ms [-1.29%, +1.28%]
:ballot_box_with_check:as_draws_rvars: 157ms -> 154ms [-3.76%, +0.92%]
:ballot_box_with_check:summarise_draws_100_variables: 709ms -> 707ms [-0.9%, +0.31%]
:ballot_box_with_check:summarise_draws_10_variables: 78.2ms -> 78.5ms [-0.32%, +1.02%]

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 10 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if cc2cb23ca0f0065665b2a0b1f2279d020df642cf is merged into master:

:ballot_box_with_check:as_draws_array: 102ms -> 100ms [-5.82%, +1.51%]
:exclamation::snail:as_draws_df: 32ms -> 84.5ms [+153.07%, +174.62%]
:ballot_box_with_check:as_draws_list: 189ms -> 185ms [-4.44%, +0.36%]
:ballot_box_with_check:as_draws_matrix: 30.3ms -> 29.9ms [-4.52%, +2.36%]
:ballot_box_with_check:as_draws_rvars: 167ms -> 166ms [-3.7%, +2.49%]
:ballot_box_with_check:summarise_draws_100_variables: 722ms -> 728ms [-0.02%, +1.66%]
:ballot_box_with_check:summarise_draws_10_variables: 80.2ms -> 80.1ms [-0.76%, +0.59%]

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 10 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 1c1c4b7645fa4e56332a23089e9741764f8406cb is merged into master:

:ballot_box_with_check:as_draws_array: 102ms -> 101ms [-2.01%, +0.29%]
:exclamation::snail:as_draws_df: 32.5ms -> 88.2ms [+159.27%, +183.41%]
:ballot_box_with_check:as_draws_list: 194ms -> 193ms [-2.99%, +2.57%]
:ballot_box_with_check:as_draws_matrix: 30.2ms -> 30.2ms [-2.71%, +2.5%]
:ballot_box_with_check:as_draws_rvars: 163ms -> 163ms [-1.88%, +2.68%]
:ballot_box_with_check:summarise_draws_100_variables: 725ms -> 718ms [-2.58%, +0.59%]
:ballot_box_with_check:summarise_draws_10_variables: 79.2ms -> 79.5ms [-0.41%, +1.16%]

Further explanation regarding interpretation and methodology can be found in the documentation.

avehtari commented 10 months ago

> I don't know if any of the other functions in R/convergence.R should be modified for weighted rvars.

Currently, everything other than the pareto_ functions assumes non-weighted MCMC. I have so far assumed that MCMC and weighting are independent of each other (there might be some less common algorithms that jointly sample parameter values and weights).

- In the PSIS paper experiments, I computed ESS for MCMC and ESS for PSIS separately and combined them as ESS_MCMC*ESS_PSIS/S, which worked well for getting an MCSE that matched the RMSE (given khat<0.7).
- I have not tried simply replacing the autocorrelation computation in the MCMC ESS with functions that use weighted means and variances, and I have not checked what that would produce, but I guess it would not be what we want.
- It would be nice to have a flag stating that an rvar does not have Markov dependency, in which case ESS and MCSE would be based on the weights alone.
- If there are both (assumed) Markov dependency and weights, we could follow the approach presented in the PSIS paper for the ess_ and mcse_ functions.
- If we assume MCMC and weighting are independent, then rhat_ and rstar could do the MCMC convergence check without weights (until we are aware of an algorithm that has the weighting inside the MCMC already).
- The pareto_ functions check the tail(s) of a given argument, and they have been used to check the tails of raw weights/ratios (r or r(theta) in PSIS paper notation), a function of a variable (h or h(theta)), or the product (hr). With weight support, they could automatically run the diagnostics for r and hr (and, if there are no weights, just h). Here I'm assuming that we almost always use self-normalization, so that we need to check both the normalization (E[r]) and the quantity of interest (E[hr]).
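
To show why both E[r] and E[hr] matter under self-normalization, here is a toy base-R sketch of a self-normalized importance sampling estimate. The Gaussian proposal and target, and all variable names, are assumptions chosen for illustration; nothing here comes from the posterior package or the PSIS paper's code.

```r
set.seed(1)
S <- 5000

# Draws from a proposal g = N(0, 1); the target p = N(0.5, 1) is an
# assumption made purely so the true answer is known (E_p[theta] = 0.5).
theta <- rnorm(S)

# Log ratios log r(theta) = log p(theta) - log g(theta).
log_r <- dnorm(theta, 0.5, 1, log = TRUE) - dnorm(theta, 0, 1, log = TRUE)

# Shift by the max before exponentiating; the constant cancels below.
r <- exp(log_r - max(log_r))

# Quantity of interest h(theta); here simply theta itself.
h <- theta

# Self-normalized estimate: mean(h * r) / mean(r). The numerator
# estimates E[hr] and the denominator E[r], so the tails of both matter.
est <- sum(h * r) / sum(r)
print(est)
```

Because the estimate is a ratio of two Monte Carlo averages, a heavy tail in either r or hr can make it unreliable, which is the motivation given above for diagnosing both.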

Pinging @n-kall , too

n-kall commented 10 months ago

> If there are both (assumed) Markov dependency and weights, we could follow the approach presented in PSIS paper for ess_ and mcse_ functions

For reference: Equations 6 (MCSE) and 7 (ESS) in preprint v6
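
The combination rule mentioned earlier in the thread, ESS_MCMC*ESS_PSIS/S, can be sketched in base R as follows. The importance sampling ESS below uses the generic Kish formula as a stand-in, which is an assumption and not necessarily the paper's Equation 7; the MCMC ESS is a made-up number, and the function names are hypothetical.

```r
# Hypothetical helper: Kish effective sample size, (sum w)^2 / sum(w^2).
# Used here as a generic stand-in for an importance sampling ESS.
ess_weights <- function(w) sum(w)^2 / sum(w^2)

# Hypothetical helper: the combination described in the thread.
combine_ess <- function(ess_mcmc, ess_is, S) ess_mcmc * ess_is / S

S <- 4000
w <- rep(c(1, 3), S / 2)  # illustrative weights, half 1 and half 3

ess_is <- ess_weights(w)  # 8000^2 / 20000 = 3200
ess_mcmc <- 1000          # made-up value standing in for an MCMC ESS
combined <- combine_ess(ess_mcmc, ess_is, S)

print(c(ess_is = ess_is, ess_mcmc = ess_mcmc, combined = combined))
```

With equal weights the Kish ESS equals S, so the combined value reduces to the MCMC ESS; with independent draws (MCMC ESS equal to S) it reduces to the weight-based ESS.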

> pareto_ functions are checking the tail(s) of a given argument, and it has been used to check tails of raw weights/ratios (r or r(theta) in PSIS paper notation), function of a variable (h or h(theta)), or the product (hr). With the weight support, they could automatically make the diagnostics for r and hr (and if no weights then just h). Here I'm assuming that we almost always use self-normalization so that we need to check the normalization (E[r]) and the quantity of interest (E[hr])

Any thoughts on how the two sets of diagnostics should be presented in summarise_draws? Would it make sense to have separate e.g. pareto_khat_quantity, pareto_khat_weights columns?

n-kall commented 10 months ago

I'm currently working on updating the pareto_, ess_ and mcse_ functions for weighted rvars in a fork.

github-actions[bot] commented 10 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if e52a6f93d1dc05bcc37a1826395ca3c85a9e410d is merged into master:

:ballot_box_with_check:as_draws_array: 106ms -> 105ms [-0.92%, +0.11%]
:ballot_box_with_check:as_draws_df: 36.1ms -> 35.9ms [-2.07%, +0.99%]
:ballot_box_with_check:as_draws_list: 163ms -> 164ms [-0.29%, +1.09%]
:ballot_box_with_check:as_draws_matrix: 31.7ms -> 31.5ms [-2.09%, +0.54%]
:ballot_box_with_check:as_draws_rvars: 143ms -> 142ms [-1.41%, +0.47%]
:ballot_box_with_check:summarise_draws_100_variables: 713ms -> 712ms [-0.64%, +0.37%]
:ballot_box_with_check:summarise_draws_10_variables: 79.4ms -> 79ms [-0.96%, +0.04%]

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 10 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 052bb735cac2064426b1a645557cf2f6b29d4155 is merged into master:

:ballot_box_with_check:as_draws_array: 106ms -> 106ms [-0.54%, +1.12%]
:ballot_box_with_check:as_draws_df: 37.1ms -> 37.8ms [-0.96%, +4.32%]
:ballot_box_with_check:as_draws_list: 167ms -> 169ms [-1.59%, +4%]
:ballot_box_with_check:as_draws_matrix: 31.9ms -> 32ms [-0.52%, +1.19%]
:ballot_box_with_check:as_draws_rvars: 147ms -> 146ms [-2.44%, +1.08%]
:rocket:summarise_draws_100_variables: 743ms -> 722ms [-3.51%, -1.92%]
:ballot_box_with_check:summarise_draws_10_variables: 79.9ms -> 81.6ms [-2.64%, +6.86%]

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 10 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if b59df5e21583e55e99a438c8e9d09c6780af23ec is merged into master:

:ballot_box_with_check:as_draws_array: 108ms -> 108ms [-0.55%, +1.6%]
:ballot_box_with_check:as_draws_df: 39.1ms -> 38.9ms [-3.71%, +2.68%]
:ballot_box_with_check:as_draws_list: 176ms -> 175ms [-2.56%, +1.49%]
:ballot_box_with_check:as_draws_matrix: 33.4ms -> 33.7ms [-0.58%, +2.34%]
:ballot_box_with_check:as_draws_rvars: 156ms -> 156ms [-2.72%, +2.45%]
:ballot_box_with_check:summarise_draws_100_variables: 730ms -> 730ms [-0.96%, +0.91%]
:ballot_box_with_check:summarise_draws_10_variables: 80.2ms -> 80.4ms [-0.67%, +1.09%]

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 10 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if facc5864e908be28fc8ddebee86e9f1fd150358f is merged into master:

:ballot_box_with_check:as_draws_array: 108ms -> 108ms [-0.87%, +0.76%]
:ballot_box_with_check:as_draws_df: 37.3ms -> 37.3ms [-1.31%, +1.37%]
:ballot_box_with_check:as_draws_list: 173ms -> 175ms [-1.12%, +3.25%]
:ballot_box_with_check:as_draws_matrix: 32.5ms -> 32.4ms [-2.93%, +2.31%]
:ballot_box_with_check:as_draws_rvars: 150ms -> 151ms [-2.3%, +4.02%]
:ballot_box_with_check:summarise_draws_100_variables: 724ms -> 723ms [-0.76%, +0.5%]
:ballot_box_with_check:summarise_draws_10_variables: 80.7ms -> 80.9ms [-1.09%, +1.62%]

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 9 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if a8ac96e340419a71418ea5303fd4d4a0f97e5f23 is merged into master:

:ballot_box_with_check:as_draws_array: 105ms -> 104ms [-0.65%, +0.29%]
:rocket:as_draws_df: 86.1ms -> 83.7ms [-4.27%, -1.35%]
:ballot_box_with_check:as_draws_list: 174ms -> 173ms [-1.15%, +0.08%]
:ballot_box_with_check:as_draws_matrix: 30.4ms -> 30.4ms [-0.99%, +0.47%]
:exclamation::snail:as_draws_rvars: 84.9ms -> 85.9ms [+0.35%, +1.95%]
:exclamation::snail:summarise_draws_100_variables: 723ms -> 741ms [+2.14%, +2.82%]
:ballot_box_with_check:summarise_draws_10_variables: 80.1ms -> 80.4ms [-0.36%, +1.17%]

Further explanation regarding interpretation and methodology can be found in the documentation.

mjskay commented 9 months ago

Okay, I think this is ready for review, pending two things:

1. Do we want to add an attribute distinguishing MC vs MCMC to this PR? I would suggest we wait and do that as a separate PR to address #239, as it will involve further discussion / thought, and this PR is already large.
2. Do we want to merge this PR to master and then merge @n-kall's branch on weighted diagnostics to master, or do we want to merge @n-kall's branch into this PR and then merge to master?

github-actions[bot] commented 9 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 88fa83dacdb4039da9d18a0be57cced099fd92b2 is merged into master:

:ballot_box_with_check:as_draws_array: 105ms -> 104ms [-0.82%, +0.14%]
:rocket:as_draws_df: 80.2ms -> 78.7ms [-2.84%, -1.06%]
:ballot_box_with_check:as_draws_list: 170ms -> 170ms [-0.66%, +0.56%]
:ballot_box_with_check:as_draws_matrix: 29.7ms -> 29.7ms [-0.59%, +0.86%]
:ballot_box_with_check:as_draws_rvars: 82.8ms -> 83.4ms [-0.77%, +2.16%]
:rocket:summarise_draws_100_variables: 721ms -> 718ms [-0.58%, -0.09%]
:rocket:summarise_draws_10_variables: 79.3ms -> 78.1ms [-1.75%, -1.26%]

Further explanation regarding interpretation and methodology can be found in the documentation.

n-kall commented 9 months ago

Re: option 2, I'll still need some more time to finish up the weighted mcse and ess, so I think it's better to merge without waiting for me.

paul-buerkner commented 9 months ago

I am at a conference and then on vacation for the next two weeks. Can someone else review this PR?

stan-dev / posterior

Weighted rvars #331