TuringLang / MCMCDiagnosticTools.jl

https://turinglang.org/MCMCDiagnosticTools.jl/dev

Support shapes `(draw, [chain[, params...]])` #79

Closed sethaxen closed 1 year ago

sethaxen commented 1 year ago

Following #78, this updates `rstar`, `mcse`, `ess`, `ess_rhat`, and `rhat` to support inputs of shape `(draw, [chain[, params...]])`.

Example

julia> using MCMCDiagnosticTools, DimensionalData

julia> x = DimArray(randn(100, 4, 2, 3), (:draw, :chain, :param1, :param2));

julia> ess_rhat(x)
(ess = [326.4145723262936 397.6331512320726 506.0911142499094; 356.92747547673224 383.59052377484284 418.4186188480045], rhat = [1.0148252570867573 1.000971546621137 0.9980968909966238; 0.9996616822813973 0.9988594415752071 1.0008960208353308])

julia> ess(x)
2×3 DimArray{Float64,2} with dimensions: Dim{:param1}, Dim{:param2}
 326.415  397.633  506.091
 356.927  383.591  418.419

julia> rhat(x)
2×3 DimArray{Float64,2} with dimensions: Dim{:param1}, Dim{:param2}
 1.01483   1.00097   0.998097
 0.999662  0.998859  1.0009

julia> mcse(x)
2×3 DimArray{Float64,2} with dimensions: Dim{:param1}, Dim{:param2}
and reference dimensions: Dim{:draw}, Dim{:chain}
 0.0577637  0.0505631  0.0427255
 0.0487027  0.0511541  0.0472727

julia> mcse(x[:, :, :, 1])
2-element DimArray{Float64,1} with dimensions: Dim{:param1}
and reference dimensions: Dim{:param2}, Dim{:draw}, Dim{:chain}
 1  0.0577637
 2  0.0487027

julia> mcse(x[:, :, 1, 1])
0.05776370534164593

julia> mcse(x[:, 1, 1, 1])
0.12621894071310816

github-actions[bot] commented 1 year ago

Pull Request Test Coverage Report for Build 4693721137


Totals Coverage Status
Change from base Build 4399022657: 0.08%
Covered Lines: 830
Relevant Lines: 859

💛 - Coveralls

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.08 :tada:

Comparison is base (304ecb2) 96.56% compared to head (253c911) 96.64%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #79      +/-   ##
==========================================
+ Coverage   96.56%   96.64%   +0.08%
==========================================
  Files          11       11
  Lines         844      865      +21
==========================================
+ Hits          815      836      +21
  Misses         29       29
```

| [Impacted Files](https://app.codecov.io/gh/TuringLang/MCMCDiagnosticTools.jl/pull/79?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=TuringLang) | Coverage Δ | |
|---|---|---|
| [src/ess\_rhat.jl](https://app.codecov.io/gh/TuringLang/MCMCDiagnosticTools.jl/pull/79?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=TuringLang#diff-c3JjL2Vzc19yaGF0Lmps) | `100.00% <100.00%> (ø)` | |
| [src/mcse.jl](https://app.codecov.io/gh/TuringLang/MCMCDiagnosticTools.jl/pull/79?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=TuringLang#diff-c3JjL21jc2Uuamw=) | `100.00% <100.00%> (ø)` | |
| [src/rstar.jl](https://app.codecov.io/gh/TuringLang/MCMCDiagnosticTools.jl/pull/79?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=TuringLang#diff-c3JjL3JzdGFyLmps) | `100.00% <100.00%> (ø)` | |
| [src/utils.jl](https://app.codecov.io/gh/TuringLang/MCMCDiagnosticTools.jl/pull/79?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=TuringLang#diff-c3JjL3V0aWxzLmps) | `100.00% <100.00%> (ø)` | |


sethaxen commented 1 year ago

I think it's a reasonable and useful generalization; I mainly have some minor concerns regarding the implementation.

A potentially more flexible approach is for our methods to accept a `dims` keyword representing the sample dimension(s), given either as `(draw_dim[, chain_dim])` or as `draw_dim`, with all other dimensions interpreted as param dimensions. If something like `_sample_dims` were used to set the defaults, and we added it to the API, then arbitrary ordering of dimensions could be supported, and packages could even overload this method if the type of the array informs a different set of defaults.

But I think this would be slightly more complicated to support. For example, on Julia versions before v1.9, `_eachparam` would first need to use `PermutedDimsArray` to bring the param dimensions together and then call `eachslice`.
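
A rough sketch of the permute-then-slice idea, using only Base Julia. The name `eachparam_sketch` and its `dims` keyword are hypothetical and are not the package's `_eachparam`; the sketch assumes `dims` is a tuple of sample dimensions and that every other dimension is a param dimension.

```julia
# Hypothetical sketch; `eachparam_sketch` is not the package's `_eachparam`.
function eachparam_sketch(x::AbstractArray; dims=(1, 2))
    sample_dims = Tuple(dims)
    # All remaining dimensions are treated as param dimensions.
    param_dims = Tuple(setdiff(1:ndims(x), sample_dims))
    # Lazily move the sample dimensions to the front (no copy of the data).
    xp = PermutedDimsArray(x, (sample_dims..., param_dims...))
    # Merge the param dimensions into one trailing dimension so that the
    # single-dimension form of `eachslice` is enough.
    nsample = length(sample_dims)
    xr = reshape(xp, size(xp)[1:nsample]..., :)
    return eachslice(xr; dims=nsample + 1)
end

x = randn(2, 3, 100, 4)  # (param1, param2, draw, chain)
slices = collect(eachparam_sketch(x; dims=(3, 4)))
```

Each element of `slices` is a `(draw, chain)` matrix for one parameter, ordered column-major over the param dimensions.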

EDIT: and then it might be jarring that we drop the reduced dims, unlike the estimators in Statistics/StatsBase, which keep them.

EDIT EDIT: but I suppose we could follow `eachslice` in v1.9 and add a `drop=true` keyword to control this.
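
To illustrate the difference, here is a minimal sketch using `Statistics.mean` as a stand-in estimator. The wrapper `mean_over_samples` and its `drop` keyword are hypothetical and only show how such a keyword could switch between the two behaviors; nothing here is the package's API.

```julia
using Statistics

x = reshape(collect(1.0:24.0), 4, 3, 2)  # (draw, chain, param)

# Statistics keeps the reduced dimensions as singletons.
kept = mean(x; dims=(1, 2))

# Hypothetical wrapper: `drop=true` removes the reduced dimensions,
# `drop=false` keeps them as in Statistics.
function mean_over_samples(x; dims=(1, 2), drop=true)
    m = mean(x; dims=dims)
    return drop ? dropdims(m; dims=dims) : m
end

dropped = mean_over_samples(x)
```

Here `kept` has size `(1, 1, 2)` while `dropped` has size `(2,)`.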

devmotion commented 1 year ago

I just ran into an issue in a package where it would be useful to have this functionality (and, e.g., be able to call `mcse` with vectors of draws again). Apart from the last comments, it seems this PR is ready?

sethaxen commented 1 year ago

> Apart from the last comments, it seems this PR is ready?

Just about, I think. Will take a closer look later today.

sethaxen commented 1 year ago

> One suggestion would be to also add an internal function for flattening zero-dimensional arrays to scalars, something like

Done!
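
For readers unfamiliar with the idea, a helper of this kind could look roughly like the sketch below. The name `_unwrap_zero_dim` is hypothetical and is not taken from the suggestion or from the package source; the sketch only assumes that zero-dimensional arrays should be replaced by their single element and everything else passed through.

```julia
# Hypothetical helper name; not taken from the package source.
# A 0-dimensional array is replaced by its only element.
_unwrap_zero_dim(x::AbstractArray{<:Any,0}) = x[]
# Anything else is returned unchanged.
_unwrap_zero_dim(x) = x

scalar = _unwrap_zero_dim(fill(0.5))      # 0.5
vector = _unwrap_zero_dim([0.5, 0.25])    # the same vector, untouched
```

This mirrors the examples at the top of the thread, where `mcse` on a `(draw, chain)` matrix or a vector of draws returns a plain number rather than a zero-dimensional array.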

> Are there other functions that should be generalized?

In principle, all of the older diagnostics could be generalized. But for the reasons given in https://github.com/TuringLang/MCMCDiagnosticTools.jl/pull/82#issuecomment-1556268129, I would consider this lower priority, so it shouldn't hold up this PR. If we choose to do it, it could be its own PR.

coveralls commented 4 months ago

Pull Request Test Coverage Report for Build 4699947591

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details


Totals Coverage Status
Change from base Build 4399022657: 0.08%
Covered Lines: 832
Relevant Lines: 861

💛 - Coveralls