TuringLang / Turing.jl

Bayesian inference with probabilistic programming.
https://turinglang.org
MIT License
2.03k stars 218 forks source link

Implementing `adtype` for closely linked "external" samplers / adding metric warm-up #2248

Closed SamuelBrand1 closed 3 months ago

SamuelBrand1 commented 3 months ago

Hi everyone,

Being able to add external samplers is a really great feature of Turing, however, it seems that its at odds with the shift towards setting the AD backend via selecting a type from ADTypes as a kwarg to the sampler?

For example, in the documentation you give an example of treating the AdvancedHMC.NUTS wrapper to HMC as an external sampler for the purpose of using Pathfinder.jl to pre-heat the metric https://turinglang.org/docs/tutorials/docs-16-using-turing-external-samplers/#going-beyond-the-turing-api

However, this is the same sampler that is used by Turing.NUTS, albeit without the syntactic sugar of having the adtype kwarg, which implements AD backend with the LogDensityProblemsAD.jl interface here (I think...).

Obviously, I could do this with full AdvancedHMC but wouldn't it be preferable to either include adtype as an option for AdvancedHMC.NUTS or include setting the metric (rather than just metric type) in Turing.NUTS. At the moment, its a bit clunky with how you implement whichever options.

SamuelBrand1 commented 3 months ago

Actually I think externalsampler does this already e.g.

https://github.com/TuringLang/Turing.jl/blob/ebc36af633f103e4a25050eb068c08bc14d10de9/src/mcmc/Inference.jl#L126C1-L140C4

Maybe make that clearer in the docs?

torfjelde commented 3 months ago

Actually I think externalsampler does this already e.g.

Aye, this was a recent "fix" so haven't made it's way to the docs yet.

And more generally wrt. the "overarching" question: all the work on externalsampler and other aspects are parts of a larger push to completely remove the Turing.NUTS samplers, etc. in favour of the corresponding packages' own samplers. That way we can just focus on making the "true" sampler constructor nice to work with rather than having to deal with discrepancies like what you describe here. So for now, your best bet is just using AdvancedHMC with externalsampler (as you noticed, you can indeed set the adtype), and sometime in the not-too-far-off-future the clunkiness will be gone:)