Closed: chenwilliam77 closed this issue 2 years ago
Hi William,
The most recently tagged version of DynamicHMC is 2.1, which relies on LogDensityProblems 0.9.*, as do all versions above 2.0. See
https://github.com/tpapp/DynamicHMC.jl/blob/ace360a137adfb9fa3c3cf291910ae21c4f68fda/Project.toml
Perhaps you have an older version pinned? Make sure you upgrade to 2.1. (This DynamicHMCExamples.jl repo is not updated yet, so if you have it installed, perhaps it is holding DynamicHMC back).
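If it is pinned, something along these lines at the Pkg REPL should free it and pull in 2.1 (assuming no other package in your environment is holding it back):
pkg> free DynamicHMC
pkg> update DynamicHMC
pkg> status DynamicHMC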
Using manually computed gradients is of course possible, and was in fact my main motivation for writing DynamicHMC. See https://tamaspapp.eu/LogDensityProblems.jl/dev/#Manually-calculated-derivatives-1
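For reference, the interface described there boils down to a few method definitions. A minimal sketch with a toy one-dimensional standard normal (MyNormalProblem is just an illustrative name):
using LogDensityProblems
# Toy problem with a hand-coded gradient: a standard normal in one dimension.
struct MyNormalProblem end
LogDensityProblems.dimension(::MyNormalProblem) = 1
# Declare that we supply first-order derivatives ourselves.
function LogDensityProblems.capabilities(::Type{<:MyNormalProblem})
    LogDensityProblems.LogDensityOrder{1}()
end
function LogDensityProblems.logdensity_and_gradient(::MyNormalProblem, x)
    -0.5 * x[1]^2, [-x[1]]  # log density (up to a constant) and its gradient
end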
Please let me know if this solves your problem, and feel free to ask more questions any time.
(Are you by any chance solving DSGE models?)
Hi Tamas,
The issue was that I was using Julia 1.0; taking a closer look, DynamicHMC v2.1 requires Julia 1.1+. However, I think I'm still setting the problem up wrong. I've followed the LogDensityProblems documentation for setting up a manually calculated gradient, and I think the error has to do with what I'm passing to mcmc_with_warmup. I've tested the log density and gradient code, and both produce reasonable-looking numbers, so I don't think the issue is there.
This is the code I run after setting up the gradient.
p = EntryGameProblem(zeros(2,2))
params = (1., 1., 1., 1.)
p((beta_1 = params[1], ...))
t = problem_transformation(p)
P = TransformedLogDensity(t, p)
results = mcmc_with_warmup(Random.GLOBAL_RNG, P, 1000)
However, I get the following error:
ERROR: MethodError: no method matching logdensity_and_gradient(
    ::TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:beta_1, :beta_2, :gamma_1, :gamma_2), NTuple{4,TransformVariables.ScaledShiftedLogistic{Int64}}}}, EntryGameProblem{Array{Float64,2}}},
    ::Array{Float64,1})
Closest candidates are:
  logdensity_and_gradient(::EntryGameProblem, ::Any)
I presume I'm not supposed to pass in P, but then what should I be passing in?
And to answer your last question, yes, the ultimate intent is to solve DSGE models. Currently, I'm testing on non-DSGE models just to make sure I have the syntax correct before I start writing code to use HMC with a generic DSGE model.
Can you please include a complete, self-contained MWE I can try to run? It would make it much easier to help you.
Are you now using 1.1, or preferably 1.2? (There is no reason not to upgrade in most cases, but 1.1 should be fine.)
Apologies! Here's an MWE. I am using Julia 1.1.
using SpecialFunctions
using ForwardDiff, Distributions, Optim
using TransformVariables, LogDensityProblems, DynamicHMC, MCMCDiagnostics,
      Parameters, Statistics, Random
using LinearAlgebra # needed for dot() in the manual gradient below
# Set up likelihood and gradient functions
function quasi_likelihood(θ)
ϕ = [.12; .07; .47]
ψ = 5000
β₁ = θ[1]
β₂ = θ[2]
γ₁ = θ[3]
γ₂ = θ[4]
normaldist = Normal(0,1)
m1 = cdf(normaldist, β₁)
m2 = cdf(normaldist, β₂)
d1 = cdf(normaldist, β₁ - γ₁)
d2 = cdf(normaldist, β₂ - γ₂)
@inline function G(α)
return [d1*d2; (1 - m1)*(1 - m2); m1*(1 - d2) - α]
end
objfun(α) = abs(sum((ϕ - G(α)).^2))
α_bar = (m1-d1) * (m2-d2)
if α_bar < 0
return -Inf
end
res = optimize(objfun, 0.0, α_bar) # univariate bounded minimization
return -0.5 * ψ * minimum(res)
end
function gradient_entry_game(para)
function n_cdf(x)
return 0.5 * erfc(-x/sqrt(2))
end
function n_pdf(x)
return 1/sqrt(2*π)*exp(-0.5* x^2)
end
beta1, beta2, gamma1, gamma2 = para # order must match quasi_likelihood: (β₁, β₂, γ₁, γ₂)
ϕ = [.12; .07; .47]
ψ = 5000
m1 = n_cdf(beta1)
m2 = n_cdf(beta2)
d1 = n_cdf(beta1-gamma1)
d2 = n_cdf(beta2-gamma2)
function G(alpha)
gvec = [d1*d2, (1-m1)*(1-m2), m1*(1-d2) - alpha]
return gvec
end
objfun(α) = sum((ϕ - G(α)).^2)
α_bar = (m1-d1) * (m2-d2)
if α_bar < 0
return -1e14 * ones(4) # large finite penalty gradient when α_bar < 0
end
if ϕ[3] - m1*(1-d2) > 0
alpha_star = 0.0
else
alpha_star = min(m1*(1 - d2) - ϕ[3], α_bar)
end
dG_dbeta1 = [d2*n_pdf(beta1-gamma1), -(1-m2)*n_pdf(beta1), (1-d2)*n_pdf(beta1)]
dG_dgamma1 = [-d2*n_pdf(beta1-gamma1), 0, 0]
dG_dbeta2 = [d1*n_pdf(beta2-gamma2), -(1-m1)*n_pdf(beta2), -m1*n_pdf(beta2-gamma2)]
dG_dgamma2 = [-d1*n_pdf(beta2-gamma2), 0, m1*n_pdf(beta2-gamma2)]
Gbar = ϕ - G(alpha_star)
grad = ψ*[dot(Gbar, dG_dbeta1),
dot(Gbar, dG_dgamma1),
dot(Gbar, dG_dbeta2),
dot(Gbar, dG_dgamma2)]
return grad
end
# Set up log density problem
struct EntryGameProblem{TX<:AbstractMatrix}
X::TX
end
function (entry_game_problem::EntryGameProblem)(θ)
β₁ = θ[1]
β₂ = θ[2]
γ₁ = θ[3]
γ₂ = θ[4]
return quasi_likelihood(θ) + logpdf(Uniform(0., 2.), γ₁) +
    logpdf(Uniform(0., 2.), γ₂) +
    logpdf(Uniform(-2., 4.), β₁) +
    logpdf(Uniform(-2., 4.), β₂)
end
problem_transformation(p::EntryGameProblem) =
as((β₁ = as(Real,-2,4), β₂ = as(Real,-2,4),
γ₁ = as(Real,0,2), γ₂ = as(Real,0,2)))
function LogDensityProblems.capabilities(::Type{<:EntryGameProblem})
LogDensityProblems.LogDensityOrder{1}()
end
LogDensityProblems.dimension(::EntryGameProblem) = 4
function LogDensityProblems.logdensity_and_gradient(problem::EntryGameProblem, x)
logdens = quasi_likelihood(x)
grad = gradient_entry_game(x)
logdens, grad
end
# Instantiate problem and run HMC
p = EntryGameProblem(zeros(2,2))
params = (1., 1., 1., 1.)
p((β₁ = params[1], β₂ = params[2],
γ₁ = params[3], γ₂ = params[4]))
t = problem_transformation(p)
P = TransformedLogDensity(t, p)
# ∇P = ADgradient(:ForwardDiff, P)
results = mcmc_with_warmup(Random.GLOBAL_RNG, P, 1)
Thanks, now I see what the problem is.
What you are trying to do does not work, because TransformedLogDensity creates an ℝⁿ → ℝ mapping from a callable, which you then differentiate via automatic differentiation (the line you commented out).
You want to do this in the opposite order, i.e. supply the gradient yourself and then transform. That is a valid use case (I sometimes need it too), but at the moment I have not yet implemented it in TransformVariables.jl.
What you can do at the moment is write a function that takes a flat vector, extracts the components, transforms them manually, and returns the log density and the gradient. This should be relatively easy in your case, since you only transform to intervals (conceptually the same derivatives).
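Something along these lines should work (an untested sketch; TransformedEntryGame and interval_transform are just illustrative names, reusing the intervals from your problem_transformation):
using LogDensityProblems
# Scaled/shifted logistic mapping ℝ → (a, b). Returns the transformed value,
# the log-Jacobian term, dy/dx, and d(log-Jacobian)/dx for the chain rule.
function interval_transform(x, a, b)
    σ = 1 / (1 + exp(-x))
    y = a + (b - a) * σ
    logjac = log(b - a) + log(σ) + log(1 - σ)
    return y, logjac, (b - a) * σ * (1 - σ), 1 - 2σ
end
struct TransformedEntryGame{P}
    problem::P
end
LogDensityProblems.dimension(::TransformedEntryGame) = 4
function LogDensityProblems.capabilities(::Type{<:TransformedEntryGame})
    LogDensityProblems.LogDensityOrder{1}()
end
function LogDensityProblems.logdensity_and_gradient(t::TransformedEntryGame, x)
    bounds = ((-2.0, 4.0), (-2.0, 4.0), (0.0, 2.0), (0.0, 2.0)) # β₁, β₂, γ₁, γ₂
    θ = similar(x); dydx = similar(x); dlogjac = similar(x)
    logjac = 0.0
    for i in 1:4
        θ[i], lj, dydx[i], dlogjac[i] = interval_transform(x[i], bounds[i]...)
        logjac += lj
    end
    ℓ, ∇ℓ = LogDensityProblems.logdensity_and_gradient(t.problem, θ)
    # Chain rule: d/dx [ℓ(y(x)) + logjac(x)] = ∇ℓ(y)⋅dy/dx + d(logjac)/dx.
    return ℓ + logjac, ∇ℓ .* dydx .+ dlogjac
end
You should then be able to pass TransformedEntryGame(p) directly to mcmc_with_warmup in place of P.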
In the meantime, I will look into implementing this properly in TransformVariables.jl.
Thanks! This makes sense to me, and I will try to implement your suggestions in the interim.
Closing this in favor of the issue above.
Hi Tamas,
I've been trying to use DynamicHMC with a manually computed gradient, and I'm running into the following issue. The most recent tagged release of DynamicHMC requires LogDensityProblems v0.8.3, but the API for manually computed gradients appears to exist only in v0.9.1. This in turn runs into the issue that ValueGradient appears to be missing from v0.9.1, while DynamicHMC uses ValueGradient in its current implementation.
While I have only taken a brief look at the current source code for DynamicHMC, I wasn't able to immediately determine whether there is a way to use it with manually computed gradients. Does this functionality just not exist yet?
FYI, I'm using a manually computed gradient because automatic differentiation won't work for the likelihood functions I'm testing: one requires running a minimization routine, and another applies a generalized Schur decomposition.
Best,
William