TuringLang / MCMCDiagnosticTools.jl

https://turinglang.org/MCMCDiagnosticTools.jl/dev
Other
19 stars 6 forks source link

Add function: `p_direction()` #90

Closed DominiqueMakowski closed 1 year ago

DominiqueMakowski commented 1 year ago

Following the discussion here https://github.com/TuringLang/MCMCChains.jl/pull/428, adding p_direction here defined for a ::AbstractArray{<:Union{Missing,Real}}

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage has no change and project coverage change: -1.33% :warning:

Comparison is base (3689384) 96.64% compared to head (acc75b7) 95.32%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #90 +/- ## ========================================== - Coverage 96.64% 95.32% -1.33% ========================================== Files 11 12 +1 Lines 865 877 +12 ========================================== Hits 836 836 - Misses 29 41 +12 ``` | [Files Changed](https://app.codecov.io/gh/TuringLang/MCMCDiagnosticTools.jl/pull/90?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=TuringLang) | Coverage Δ | | |---|---|---| | [src/MCMCDiagnosticTools.jl](https://app.codecov.io/gh/TuringLang/MCMCDiagnosticTools.jl/pull/90?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=TuringLang#diff-c3JjL01DTUNEaWFnbm9zdGljVG9vbHMuamw=) | `100.00% <ø> (ø)` | | | [src/p\_direction.jl](https://app.codecov.io/gh/TuringLang/MCMCDiagnosticTools.jl/pull/90?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=TuringLang#diff-c3JjL3BfZGlyZWN0aW9uLmps) | `0.00% <0.00%> (ø)` | |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

sethaxen commented 1 year ago

Thanks for the PR! However, I disagree with @devmotion that this functionality belongs in this package. Within the ArviZ project at least, we draw a distinction between a diagnostic, which tells us something about the quality of inference using the given inference method (here, some MCMC method) and a statistic, which tells us something about the posterior (or something derived from the posterior such as a posterior predictive distribution). So for example, it would not make sense to include HDI, LOO, WAIC, or LOO-PIT in this package, even though most of these are also implemented in this org for Chains inputs, since these are statistics. Thus I would say the same about the p-direction.

I'm splitting out ArviZ's statistics implementations into their own ArviZStats package (currently a submodule here), and this could potentially belong there. But in ArviZ we're careful to only include methods that have been thoroughly compared with other methods so that we can make informed recommendations about how and when to use them. I've personally never used p-directions before, so I would need to read more about them and in particular see how they are used in practice and how they relate to and perform compared to other notions of Bayesian p-values, e.g. those proposed in Section 6 of BDA3.

yebai commented 1 year ago

Maybe we can add an src/contrib to store these less general diagnostics?

devmotion commented 1 year ago

Maybe it would be reasonable to not only split out ArvizStats in a standalone package but also generalize it to generic chains as MCMCDiagnosticTools (maybe MCMCStatistics would be a reasonable name), if that's not the case yet?

sethaxen commented 1 year ago

Maybe we can add an src/contrib to store these less general diagnostics?

For the reasons in https://github.com/TuringLang/MCMCDiagnosticTools.jl/pull/90#issuecomment-1656840393 I don't think this belongs here at all. It's not specific to MCMC, and it is a posterior analysis, not a diagnostic of either an inference method or of a model. This would fall in the category of Bayesian hypothesis testing, which would be a nice package on its own or in a larger package.

Maybe it would be reasonable to not only split out ArvizStats in a standalone package but also generalize it to generic chains as MCMCDiagnosticTools (maybe MCMCStatistics would be a reasonable name), if that's not the case yet?

ArviZStats is already generalized this way. In particular, every function has a method that operates on either arrays of Monte Carlo draws or an output of a previous analysis computed from MC draws. It does have convenience methods for InferenceObjects types, but these will be moved to an InferenceObjects-ArviZStats extension.

Nothing in ArviZStats is MCMC-specific (it depends on this package only to set reff for loo, which for non-MCMC should be estimated to be ~1). It's fine to use it with draws from a variational model or a GP posterior. So a name like "MCMCStats" would be too narrow. I'm certainly open to a more descriptive name though. For me the best reason to use "ArviZStats" as opposed to something more generic like "InferenceStats" or "MonteCarloStats" is that it reduces the expectation we include everything that might fall under that name and makes it easier to restrict our scope to methods widely useful in Bayesian workflows whose properties and pitfalls have been well-established. The corresponding Python package will also be called ArviZ_stats.

DominiqueMakowski commented 1 year ago

I must say that I do agree that this type of indices (also including other ones like region of practical equivalence - ROPE) are neither diagnostic indices nor specific to MCMC (and thus don't really fit here).

A package for Turing-model testing would make sense if the goal is to add more indices (including Bayes Factors), but at this stage if it's just for one function it might be overkill. Perhaps it could be taken step by step, i.e., first add that to say MCMCChains, and once there is need & effort to create that additional package we move the appropriate functions there?

sethaxen commented 1 year ago

Actually, @DominiqueMakowski, some of the ArviZ plots do use ROPE (e.g. plot_posterior), Bayes factors (plot_bf), and the Bayesian "p-values" in BDA3 (plot_bpv), so as we port these to Julia, we will need Julia implementations of these functions. In Python ArviZ, these are not included in the stats API, but we could include them in ArviZStats. In that case p_direction would make sense to add to ArviZStats (or whatever it is named).

devmotion commented 1 year ago

So a name like "MCMCStats" would be too narrow. I'm certainly open to a more descriptive name though. For me the best reason to use "ArviZStats" as opposed to something more generic like "InferenceStats" or "MonteCarloStats" is that it reduces the expectation we include everything that might fall under that name and makes it easier to restrict our scope to methods widely useful in Bayesian workflows whose properties and pitfalls have been well-established.

Maybe something like PosteriorStats would be an alternative that would also fit with e.g. GP posteriors. Generally, I think a descriptive name is fine even if it is a bit general - we could always make it clear in the docs and README what's the intended scope of the package.

sethaxen commented 1 year ago

Maybe something like PosteriorStats would be an alternative that would also fit with e.g. GP posteriors.

Done! https://github.com/arviz-devs/PosteriorStats.jl. ~Will register and then work on moving all InferenceObjects code to an extension.~ Will move the InferenceObjects code to an extension first, and then register.

yebai commented 1 year ago

Closed in favour of https://github.com/arviz-devs/PosteriorStats.jl/issues/10