TuringLang / AbstractMCMC.jl

Abstract types and interfaces for Markov chain Monte Carlo methods
https://turinglang.org/AbstractMCMC.jl
MIT License

Unhelpful docstring for MCMCThreads #105

Open · KronosTheLate opened this issue 1 year ago

KronosTheLate commented 1 year ago

This is the current docstring for MCMCThreads:

help?> MCMCThreads
search: MCMCThreads

  MCMCThreads

  The MCMCThreads algorithm allows users to sample MCMC chains in parallel using
  multiple threads.

It would be extremely helpful if there were a method signature demonstrating how to use it. However, that functionality lives in Turing.jl, so I am not sure whether adding that signature here even makes sense.

I would like it if something like the following could be added to the docstring:

To sample using multiple threads with `Turing.jl`, use Turing's `sample` function with the following signature:

`sample(model, sampler, parallel_type, n, n_chains)`

where `parallel_type` is `MCMCThreads()`. `parallel_type` could also be `MCMCDistributed()` for distributed sampling.

If this is to be done, it should be a coherent effort across similar functions such as MCMCDistributed().
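
For reference, a minimal sketch of what such a call looks like in practice (the toy model and sampler choice here are only illustrative, and Julia must be started with several threads, e.g. `julia --threads 4`):

```
using Turing

@model function demo(x)
    μ ~ Normal(0, 1)
    x ~ Normal(μ, 1)
end

model = demo(1.5)

# Sample 4 chains of 1_000 draws each, one chain per thread.
chains = sample(model, NUTS(), MCMCThreads(), 1_000, 4)
```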

devmotion commented 1 year ago

The docstring of `sample` (which, BTW, is not Turing-specific or owned by Turing) explains how it is used: https://github.com/TuringLang/AbstractMCMC.jl/blob/master/src/sample.jl#L71 Maybe we could just add a link to `sample` in the docstring (i.e., add a "See also [`sample`](@ref)")?
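
Concretely, the suggested change would look roughly like this (the bare struct definition below is only a stand-in for wherever `MCMCThreads` is actually defined):

```
"""
    MCMCThreads

The `MCMCThreads` algorithm allows users to sample MCMC chains in parallel using
multiple threads.

See also [`sample`](@ref).
"""
struct MCMCThreads end
```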

Generally, the interface is described in much more detail in the online documentation of AbstractMCMC, since there are many parts relevant to different functions and types, and to keep the size of the docstrings somewhat manageable.

KronosTheLate commented 1 year ago

Yeah, the documentation in the docstring for `sample` is also not great, as you need to scroll through this wall of text to get to anything about MCMCThreads:

Current docstring for `sample`:

```
help?> Turing.sample

  sample([rng], a, [wv::AbstractWeights])

  Select a single random element of a. Sampling probabilities are proportional to the
  weights given in wv, if provided. Optionally specify a random number generator rng as
  the first argument (defaults to Random.GLOBAL_RNG).

  ────────────────────────────────────────────────────────────────────────────

  sample([rng], a, [wv::AbstractWeights], n::Integer; replace=true, ordered=false)

  Select a random, optionally weighted sample of size n from an array a using a
  polyalgorithm. Sampling probabilities are proportional to the weights given in wv, if
  provided. replace dictates whether sampling is performed with replacement. ordered
  dictates whether an ordered sample (also called a sequential sample, i.e. a sample
  where items appear in the same order as in a) should be taken. Optionally specify a
  random number generator rng as the first argument (defaults to Random.GLOBAL_RNG).

  ────────────────────────────────────────────────────────────────────────────

  sample([rng], a, [wv::AbstractWeights], dims::Dims; replace=true, ordered=false)

  Select a random, optionally weighted sample from an array a specifying the dimensions
  dims of the output array. Sampling probabilities are proportional to the weights given
  in wv, if provided. replace dictates whether sampling is performed with replacement.
  ordered dictates whether an ordered sample (also called a sequential sample, i.e. a
  sample where items appear in the same order as in a) should be taken. Optionally
  specify a random number generator rng as the first argument (defaults to
  Random.GLOBAL_RNG).

  ────────────────────────────────────────────────────────────────────────────

  sample([rng], wv::AbstractWeights)

  Select a single random integer in 1:length(wv) with probabilities proportional to the
  weights given in wv. Optionally specify a random number generator rng as the first
  argument (defaults to Random.GLOBAL_RNG).

  ────────────────────────────────────────────────────────────────────────────

  sample([rng, ]model, sampler, N; kwargs...)

  Return N samples from the model with the Markov chain Monte Carlo sampler.

  ────────────────────────────────────────────────────────────────────────────

  sample([rng, ]model, sampler, isdone; kwargs...)

  Sample from the model with the Markov chain Monte Carlo sampler until a convergence
  criterion isdone returns true, and return the samples. The function isdone has the
  signature isdone(rng, model, sampler, samples, state, iteration; kwargs...) where
  state and iteration are the current state and iteration of the sampler, respectively.
  It should return true when sampling should end, and false otherwise.

  ────────────────────────────────────────────────────────────────────────────

  sample([rng, ]model, sampler, parallel, N, nchains; kwargs...)

  Sample nchains Monte Carlo Markov chains from the model with the sampler in parallel
  using the parallel algorithm, and combine them into a single chain.

  ────────────────────────────────────────────────────────────────────────────

  sample([rng,] chn::Chains, [wv::AbstractWeights,] n; replace=true, ordered=false)

  Sample n samples from the pooled (!) chain chn. The keyword arguments replace and
  ordered determine whether sampling is performed with replacement and whether the
  sample is ordered, respectively. If specified, sampling probabilities are proportional
  to weights wv.

  │ Note
  │
  │ If chn contains multiple chains, they are pooled (i.e., appended) before sampling.
  │ This ensures that even in this case exactly n samples are returned:
  │
  │ julia> chn = Chains(randn(11, 4, 3));
  │
  │ julia> size(sample(chn, 7)) == (7, 4, 1)
  │ true

  ────────────────────────────────────────────────────────────────────────────

  sample(
      rng::AbstractRNG,
      h::Hamiltonian,
      κ::AbstractMCMCKernel,
      θ::AbstractVecOrMat{T},
      n_samples::Int,
      adaptor::AbstractAdaptor=NoAdaptation(),
      n_adapts::Int=min(div(n_samples, 10), 1_000);
      drop_warmup::Bool=false,
      verbose::Bool=true,
      progress::Bool=false
  )

  Sample n_samples samples using the proposal κ under Hamiltonian h.

    • The randomness is controlled by rng.
    • If rng is not provided, GLOBAL_RNG will be used.
    • The initial point is given by θ.
    • The adaptor is set by adaptor, for which the default is no adaptation.
    • It will perform n_adapts steps of adaptation, for which the default is the
      minimum of 1_000 and 10% of n_samples
    • drop_warmup controls to drop the samples during adaptation phase or not
    • verbose controls the verbosity
    • progress controls whether to show the progress meter or not

  ────────────────────────────────────────────────────────────────────────────

  sample(model::AdvancedHMC.DifferentiableDensityModel,
         kernel::AdvancedHMC.AbstractMCMCKernel, metric::AdvancedHMC.AbstractMetric,
         adaptor::AdvancedHMC.Adaptation.AbstractAdaptor, N::Integer; kwargs...) -> Any

  A convenient wrapper around AbstractMCMC.sample avoiding explicit construction of
  HMCSampler.
```

Only the second-to-last entry explains parallel sampling, with the short sentence

```
sample([rng, ]model, sampler, parallel, N, nchains; kwargs...)

Sample nchains Monte Carlo Markov chains from the model with the sampler in parallel
using the parallel algorithm, and combine them into a single chain.
```

It does not mention MCMCThreads, so when I visually scan for "threads" I do not find anything. Also, the signatures do not mention the types of any of the arguments, so reading such a signature requires the user to more or less know what usually goes where, especially as some of the arguments are custom types of the package.

So linking to `sample` would solve a good part of the problem; the other part would be to make the docstring for `sample` more readable. But there are so many signatures that they need some cleaning up. I generally prefer something like the following:

Proposed docstring for `sample`:

```
help?> Turing.sample

  Sampling from an array:

  sample(a, [wv::AbstractWeights])
  sample(a, [wv::AbstractWeights], n::Integer; replace=true, ordered=false)
  sample(a, [wv::AbstractWeights], dims::Dims; replace=true, ordered=false)

  Select a single random element of a. Sampling probabilities are proportional to the
  weights given in wv, if provided. `n` determines the sample size. `replace` dictates
  whether sampling is performed with replacement. `ordered` dictates whether an ordered
  sample (also called a sequential sample, i.e. a sample where items appear in the same
  order as in a) should be taken. `dims` specifies the dimensions of the output array.
  Omitting `a` is equivalent to setting `a` equal to Base.OneTo(length(wv))

  ────────────────────────────────────────────────────────────────────────────

  Sampling from a model:

  sample(model, sampler, N; kwargs...)
  sample(model, sampler, isdone; kwargs...)
  sample(model, sampler, parallel, N, nchains; kwargs...)

  Sample `N` samples based on the model `model`, with the Markov chain Monte Carlo
  sampler. If convergence criterion `isdone` is supplied, sample until it returns true.
  If `parallel` and `nchains` are provided, sample `nchains` separate chains. This is
  done in parallel if `parallel` is `MCMCThreads()`, and in a distributed manner if
  `parallel` is `MCMCDistributed()`

  The function `isdone` should have a signature
  `isdone(rng, model, sampler, samples, state, iteration; kwargs...)` where state and
  iteration are the current state and iteration of the sampler, respectively. It should
  return true when sampling should end, and false otherwise.

  For all signatures, optionally specify a random seed as the first argument (defaults
  to Random.GLOBAL_RNG).

  ────────────────────────────────────────────────────────────────────────────

  sample([rng,] chn::Chains, [wv::AbstractWeights,] n; replace=true, ordered=false)

  Sample n samples from the pooled (!) chain chn. The keyword arguments replace and
  ordered determine whether sampling is performed with replacement and whether the
  sample is ordered, respectively. If specified, sampling probabilities are proportional
  to weights wv.

  │ Note
  │
  │ If chn contains multiple chains, they are pooled (i.e., appended) before sampling.
  │ This ensures that even in this case exactly n samples are returned:
  │
  │ julia> chn = Chains(randn(11, 4, 3));
  │
  │ julia> size(sample(chn, 7)) == (7, 4, 1)
  │ true

  ────────────────────────────────────────────────────────────────────────────

  sample(
      rng::AbstractRNG,
      h::Hamiltonian,
      κ::AbstractMCMCKernel,
      θ::AbstractVecOrMat{T},
      n_samples::Int,
      adaptor::AbstractAdaptor=NoAdaptation(),
      n_adapts::Int=min(div(n_samples, 10), 1_000);
      drop_warmup::Bool=false,
      verbose::Bool=true,
      progress::Bool=false
  )

  Sample n_samples samples using the proposal κ under Hamiltonian h.

    • The randomness is controlled by rng.
    • If rng is not provided, GLOBAL_RNG will be used.
    • The initial point is given by θ.
    • The adaptor is set by adaptor, for which the default is no adaptation.
    • It will perform n_adapts steps of adaptation, for which the default is the
      minimum of 1_000 and 10% of n_samples
    • drop_warmup controls to drop the samples during adaptation phase or not
    • verbose controls the verbosity
    • progress controls whether to show the progress meter or not

  ────────────────────────────────────────────────────────────────────────────

  sample(model::AdvancedHMC.DifferentiableDensityModel,
         kernel::AdvancedHMC.AbstractMCMCKernel, metric::AdvancedHMC.AbstractMetric,
         adaptor::AdvancedHMC.Adaptation.AbstractAdaptor, N::Integer; kwargs...) -> Any

  A convenient wrapper around AbstractMCMC.sample avoiding explicit construction of
  HMCSampler.
```
devmotion commented 1 year ago

Yes, probably it would be useful to add the signatures to the docstrings. We could use DocStringExtensions.jl to ensure that they are correct and simplify updates. Would you like to make a PR?
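
For example, DocStringExtensions.jl can interpolate the signature into the docstring so it cannot go stale. A rough sketch, with a dummy function standing in for the actual method:

```
using DocStringExtensions

"""
$(TYPEDSIGNATURES)

Sample `nchains` chains from `model` with `sampler` in parallel using multiple threads.
"""
function sample_threads(model, sampler, N::Integer, nchains::Integer; kwargs...)
    # Dummy body; only the docstring machinery is being illustrated here.
    return nothing
end
```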

Unfortunately, the REPL output for ?sample will always be messy because it's owned by neither AbstractMCMC nor Turing but is implemented and documented by many different packages. Also, downstream packages such as Turing or AdvancedHMC might want to add additional docstrings for `sample` with their samplers if they allow custom keyword arguments.

You can obtain only the docstring of interest by searching more specifically for a certain function call or signature instead of the generic `sample`. For that reason, I usually think it's better not to merge docstrings (and not in this case either), since it would no longer be possible to retrieve the docstrings for the different signatures separately.
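
For example, in a session where `model` and `sampler` are already defined, querying the specific call instead of the bare name should narrow the help output to the matching signature:

```
help?> sample(model, sampler, MCMCThreads(), 1000, 4)
```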

KronosTheLate commented 1 year ago

> Yes, probably it would be useful to add the signatures to the docstrings. We could use DocStringExtensions.jl to ensure that they are correct and simplify updates. Would you like to make a PR?

No, sorry. I do not feel competent enough, and I do not have time right now to figure out the best way of doing this.

> You can obtain only the docstring of interest by searching more specifically for a certain function call or signature instead of the generic `sample`.

I did not know this! That does change things - having them split apart makes more sense in that case.