cscherrer / Tilde.jl

WIP successor to Soss.jl
MIT License
74 stars 1 forks source link

Updated BPS example #18

Closed mschauer closed 2 years ago

mschauer commented 2 years ago

Should work now after updating your ZZB

mschauer commented 2 years ago

@sethaxen Pathfinder, Tilde and ZigZagBoomerang nicely working as team :-) We think of comparing times with some other things out there

sethaxen commented 2 years ago

@sethaxen Pathfinder, Tilde and ZigZagBoomerang nicely working as team :-) We think of comparing times with some other things out there

Nice! Anything you need on the Pathfinder side?

mschauer commented 2 years ago
struct LWrapper{T}
    Σ::T
end
M = LWrapper(pf_result.fit_distribution.Σ)
x0 = pf_result.fit_distribution.μ;
using Pathfinder.PDMats
function ZigZagBoomerang.reflect!(∇ϕx, x, θ, F::BouncyParticle) # Seth's version
    z = F.L.Σ * ∇ϕx
    θ .-= (2*dot(∇ϕx, θ)/dot(∇ϕx, z)) * z
    θ
end
function ZigZagBoomerang.refresh!(rng, θ, F::BouncyParticle)
    ρ̄ = sqrt(1-F.ρ^2)
    θ .*= F.ρ
    u = ρ̄*PDMats.unwhiten(F.L.Σ, randn(rng, length(θ)))
    θ .+= u
    θ
end

Hot-Patch to allow to use woodbury decomposition given by Pathfinder, https://github.com/sethaxen/Pathfinder.jl/issues/77

sethaxen commented 2 years ago

Hot-Patch to allow to use woodbury decomposition given by Pathfinder, sethaxen/Pathfinder.jl#77

Very nice! You can eek out a little more performance with this implementation:

function ZigZagBoomerang.reflect!(∇ϕx, x, θ, F::BouncyParticle)
    z = F.L.Σ * ∇ϕx
    θ .-= (2*dot(∇ϕx, θ)/dot(∇ϕx, z)) * z
    θ
end
sethaxen commented 2 years ago

@mschauer what is LWrapper meant to represent? Is it $L$, such that $L L^\top = \Sigma^{-1}$?

mschauer commented 2 years ago

Yeah, but I am not sure it’s a very sensible design right now, I have to look how dynamichmc etc deal with metrics

mschauer commented 2 years ago

Are we happy with the design from ZZB side for this use case? Then I'll "freeze" that into a type while adding support for metrics.

cscherrer commented 2 years ago

A few minor concerns, but it might be just misunderstanding ZZB:

I think there was a question about passing the inverse mass instead of the mass matrix, so you don't have to invert it each time. Is this possible?

Second, there are some ignored parameters here, can those be removed? What's the minimal information required from the user - are there reasonable defaults?

Z = BouncyParticle(∅, ∅, # ignored
    2.0, # momentum refreshment rate 
    0.95, # momentum correlation / only gradually change momentum in refreshment/momentum update
    0.0, # ignored
    M # cholesky of momentum precision
) 
mschauer commented 2 years ago

Yes, so I would like to make a version BouncyParticle without these inconsistency, so I would like to know if just fixing these two issues would do

sethaxen commented 2 years ago

Very nice! You can eek out a little more performance with this implementation:

I'm pretty sure we can also implement a non-allocating dot(::AbstractVecOrMat, ::WoodburyPDMat, ::AbstractVecOrMat) method, but it's only worth doing so if F.L.Σ * ∇ϕx is a computational bottleneck.

mschauer commented 2 years ago

Looks like all time is spend is taking derivatives and gradients of the objective, as it should, and it looks like a bit of overhead in

https://github.com/cscherrer/Tilde.jl/blob/c843320d00a50852031a25c9e48b940133332bd9/src/primitives/logdensity.jl#L16

(maybe because of the unstable return?)

cscherrer commented 2 years ago

Thanks @mschauer . Since the transform guarantees we're in the support, I think we should be using unsafe_logdensityof for this case

mschauer commented 2 years ago

Fixed a type instability issue and use unsafe_logdensitityof. Check the timings... Needs newly tagged version of ZZB.

mschauer commented 2 years ago

I am done for now, with introduction of PDMats support, for now I will not refactor the struct BouncyParticle.