TuringLang / AbstractMCMC.jl

Abstract types and interfaces for Markov chain Monte Carlo methods
https://turinglang.org/AbstractMCMC.jl
MIT License
78 stars 18 forks source link

Why `step` initializes the chain? #135

Open Red-Portal opened 8 months ago

Red-Portal commented 8 months ago

Hi,

Under the current API, one implements two specializations of step: one for initializing the chain, and one for actually "stepping". But why didn't we make a separate interface for the first use like initialize_state or something? It seems that, partly due to the naming, what the first call to step actually does seem to differ across implementations. For example, AdvancedHMC does an actual transition in the first call, while others like EllipticalSliceSampling doesn't. For some downstream use-cases of AbstractMCMC where step is called just once at a time, I think this underspecification is not ideal.

Also, is there a reason why init_params became optional?

devmotion commented 8 months ago

Isn't this only a question of naming? Also with something like initialize_state implementations should still be the same, shouldn't they?

Also, is there a reason why init_params became optional?

It is not supported by the default implementation in AbstractMCMC anymore (downstream packages can in principle support arbitrary keyword arguments but if they do swapping samplers and generic implementations in other packages become more difficult). It was renamed to initial_params.

Red-Portal commented 8 months ago

It is not supported by the default implementation in AbstractMCMC anymore (downstream packages can in principle support arbitrary keyword arguments but if they do swapping samplers and generic implementations in other packages become more difficult). It was renamed to initial_params

Oh sorry, I was actually asking about initial_params

Isn't this only a question of naming? Also with something like initialize_state implementations should still be the same, shouldn't they?

Yes, I think the naming is not ideal here because it is not clear what the first call to step should do. Wouldn't it be better to predict the behavior by just looking at the function name? The fact that we already have a discrepancy between implementations is unfortunate. For instance, in my current use-case, I want to perform one transition at a time, which would be implemented by calling step twice, but AdvancedHMC ends up doing two transitions while EllipticalSliceSampling does just one.

devmotion commented 8 months ago

but AdvancedHMC ends up doing two transitions while EllipticalSliceSampling does just one.

IMO this is a bug in AdvancedHMC. I think the expected behaviour is as in EllipticalSliceSampling and e.g. AdvancedMH (https://github.com/TuringLang/AdvancedMH.jl/blob/3749df0075ae59fc8bc4ef576a73ce770b75eec0/src/mh-core.jl#L73-L86).

Red-Portal commented 8 months ago

Wait I'm a little bit confused here. So AdvancedMH and AdvancedHMC both seem to do a transition on the first call to step, but it seems like only EllipticalSliceSampling does not? Is ESS really not the one that is inconsistent here?

devmotion commented 8 months ago

AdvancedMH just wraps the initial sample in a state, it does not perform any "transition".

Red-Portal commented 8 months ago

Oh I got it. I got distracted by the name transition.

Okay then, it seems there isn't anything preventing AdvancedHMC to do it that way. (In fact the name step kindda makes it feel a transition should happen IMO.) So I think having a different name would be better specification, but if that's too big of a break, I think the docs should be more specific about the expected behavior here.