TuringLang / AbstractMCMC.jl

Abstract types and interfaces for Markov chain Monte Carlo methods
https://turinglang.org/AbstractMCMC.jl
MIT License
87 stars 18 forks source link

Default reduction for parallel sampling #44

Closed devmotion closed 2 years ago

devmotion commented 4 years ago

I just ran into some annoying problem while adding some tests for parallel sampling in https://github.com/TuringLang/EllipticalSliceSampling.jl/pull/8.

In the simple test examples, the sampler just returns a vector of samples (the default for regular sampling) which are either scalars or vectors with one element. Unfortunately, parallel sampling does not work due to https://github.com/TuringLang/AbstractMCMC.jl/blob/f03a17d63fa794413cbf620ecc7ba46fba9b480b/src/sample.jl#L252-L253 (and similar lines) and chaincat being only defined for AbstractChains.

Now what should we do if the sampling does not return an object of type AbstractChains? Or what should we do in general?

Maybe the user could pass a reduction function which would default to AbstractMCMC.DEFAULT_REDUCTION, defined as

DEFAULT_REDUCTION(x::Vector{Any}) = map(identity, x)
DEFAULT_REDUCTION(x) = x

just to make sure that we return a concretely typed result if possible.

But of course that would mean that by default you would always get back a vector of chains (of whatever type), which would be quite breaking. But maybe that's not too bad and actually more intuitive?

cpfiffer commented 4 years ago

I would make chainscat(x::Vector{Any}...) default to a vector-of-vectors. chainscat(x::AbstractChains...) should be required to implement a specific functional form. I don't think we need an additional reduction -- that's the purpose of chainscat.

devmotion commented 4 years ago

But that would mean that we could never avoid splatting, even if we would like to use reduce since we will always have to split the vector of chains when we all chainscat(...). Just defining chainscat on vectors doesn't seem to help either since then we can't distinguish between vectors of arrays and vectors of AbstractChains since in multithreaded sampling the vector is always of type Vector{Any}.

That's why I thought at some point, maybe we should just return the vectors and let users/downstream packages deal with the reduction (or possibly provide some reduction function directly).

devmotion commented 2 years ago

Seems this was addressed in #45.