TuringLang / MCMCChains.jl

Types and utility functions for summarizing Markov chain Monte Carlo simulations
https://turinglang.org/MCMCChains.jl/
Other
266 stars 29 forks source link

DIC example is broken #270

Open rikhuijzer opened 3 years ago

rikhuijzer commented 3 years ago

Currently, MCMCChains contains a broken DIC example, see https://github.com/TuringLang/MCMCChains.jl/issues/267 for more information.

rikhuijzer commented 3 years ago

Upon second read, this issue isn't very clear. It is about the untested usage example at

https://github.com/TuringLang/MCMCChains.jl/blob/b0da9f1800b3b96cdb4f3f8dbb4fef1db7bd35bd/src/modelstats.jl#L17-L31

robertfeldt commented 3 years ago

Hmm, I'm thinking about adding more model selection / IC options to MCMCChains but I'm not fully clear about the current API. Wouldn't it make more sense with an interface like:

dic(model::DynamicPPL.Model, chains::Chains)

and then implement a logpdf method based on pointwise_loglikelihoods that takes a model and a chains value:

import Distributions: logpdf
function logpdf(model::DynamicPPL.Model, chain::Chains)
    logliks_per_input = pointwise_loglikelihoods(model, chain)
    Float64[mean(lls) for (_, lls) in logliks_per_input]
end

?

It seems to me one can then implement waic and maybe even loo with a similar interface. Comments welcome, maybe I'm missing something fundamental so stop me before I go ahead implementing this... ;)

devmotion commented 3 years ago

The fundamental problem is that MCMCChains should not depend on DynamicPPL. But I assume one could/should work around this problem by moving some of the interface in DynamicPPL to AbstractMCMC which (of course) MCMCChains already depends on.

robertfeldt commented 3 years ago

Ok, I see and it kind of makes sense.

So maybe a logpdf function making method makes sense then and this could be added to either Turing.jl, to DynamicPPL, or to a new package ModelSelection.jl?

function make_logpdf_func(model::DynamicPPL.Model)
    function logpdffunc(c::Chains) 
        Float64[mean(lls) for (_, lls) in pointwise_loglikelihoods(model, c)]
    end
    logpdffunc
end

and then it can be used with MCMCChains like:

dic(chains, make_logpdf_func(model))
waic(chains, make_logpdf_func(model))

etc.

goedman commented 3 years ago

For psisloo() and waic() wouldn't you need the pointwise_loglikelihoods(model, c) matrix? Not sure for wdic.

I really like the name ModelSelection.jl, maybe in JuliaStats?

devmotion commented 3 years ago

aic etc. are defined in StatsBase: https://github.com/JuliaStats/StatsBase.jl/blob/2faa6e80b7966b915086d8cd5a4a4d89a2126db5/src/statmodels.jl#L192 I think dic etc. should be defined in a similar way if possible: there's some lightweight functional interface (such as loglikelihood) and if it is implemented for a model then aic etc. work automatically.

devmotion commented 3 years ago

BTW just recently also a pointwise loglikelihood loglikelihood(model, observation) (with loglikelihood(model, :) for all observations) was added: https://github.com/JuliaStats/StatsBase.jl/pull/685

goedman commented 3 years ago

David, is your thinking to have something like an MCMCModel <: StatisticalModel?

devmotion commented 3 years ago

Yes, and then one would just make sure that StatsBase.loglikelihood(::MCMCModel) etc. are implemented. Since e.g. in DynamicPPL we don't want to subtype StatisticalModel (there's already a different supertype), one could maybe define a way to construct a MCMCModel from e.g. DynamicPPL.Model and mainly use it internally (e.g. dic(model::DynamicPPL.Model) = dic(MCMCModel(model)). Maybe it could even just be a wrapper of other models such as DynamicPPL.Model. Maybe even DynamicPPL could define its custom StatisticalModel construct and it would not have to be unified across packages.

ParadaCarleton commented 3 years ago

aic etc. are defined in StatsBase: https://github.com/JuliaStats/StatsBase.jl/blob/2faa6e80b7966b915086d8cd5a4a4d89a2126db5/src/statmodels.jl#L192 I think dic etc. should be defined in a similar way if possible: there's some lightweight functional interface (such as loglikelihood) and if it is implemented for a model then aic etc. work automatically.

DIC or WAIC? I'm of the opinion that DIC probably shouldn't be implemented at all, unless you're using it for pedagogical purposes I guess. It's kind of a hacked-together tool that only took off because it's about a third of the way to being Bayesian and it got added to BUGS. DIC isn't transformation-invariant, it only works for normal distributions, it uses point estimates rather than the full posterior, and it's got even more problems besides that. On the other hand, I think implementing WAIC and WBIC in StatsBase is a good idea and am all for it.

The thing I don't think is quite as good of an idea is implementing PSIS-LOO in StatsBase, for a couple reasons. First, it's quite a heavy -- I'm trying to build an implementation right now and it's definitely taking a while. (Anyone who wants to lend a hand is welcome to help! Message me on the Slack.) If you take a look at the loo package by the Stan team in R, it takes one file to implement WAIC and something like 7 or 8 to implement PSIS-LOO. The opposite holds too -- I don't think people should have to import all of StatsBase just to get PSIS-LOO. The second thing is that PSIS isn't inherently only applicable to cross validation -- it's just a modified version of importance sampling, which you can use anywhere you'd use importance sampling. It would actually make more sense to implement it as a Turing sampling algorithm, I think.

robertfeldt commented 3 years ago

Thanks for all your input. I'll be holding off on PSIS-LOO then, hearing that there is already ongoing work on it. First focus is to explore good/simple ways to add WAIC and possible WBIC to StatsBase and make them useful with MCMCChains + DynamicPPL.Model as discussed above.

ParadaCarleton commented 3 years ago

Thanks for all your input. I'll be holding off on PSIS-LOO then, hearing that there is already ongoing work on it. First focus is to explore good/simple ways to add WAIC and possible WBIC to StatsBase and make them useful with MCMCChains + DynamicPPL.Model as discussed above.

I'd check out the discussion here for implementation tips on avoiding confusion from WBIC, e.g. by referring to it as WSIC and making sure documentation explains the differing goals of WSIC and WAIC (Picking the correct model vs. picking the model with best expected fit).

If you're interested in adding WSIC/WBIC/PSIS-LOO functions to Julia, feel free to reach out to me on Slack!