gaurav-arya / StochasticAD.jl

Research package for automatic differentiation of programs containing discrete randomness.
MIT License
199 stars 16 forks source link

Differentiating a continuous time Markov chain with respect to the soujourn times #110

Open AndresCenteno opened 9 months ago

AndresCenteno commented 9 months ago

First issue I've done in my whole life Edit: first line of code was not visible because I don't knon Markdown

This code simulates a continuous time Markov chain (CTMC) given some transition probabilities $P$ and a vector of rates of jumps $\theta_i$ with $i=1,\dots,n$, $n$ being the number of nodes or states in the CTMC, meaning that if you are at state $i$ the time it takes to jump out of that state follows an $\text{Exp}(\thetai)$. As you can see the code starts at state $i=2$ and simulates the MC up to time $T=1.5$. The expected value of the vector $\pi$ is $p{ij}(T)$ the probability of being at state $j$ at time $T$ having started at $i$. I want to differentiate that w.r.t. the vector of rates $\theta$, I just don't have a clue on how to write this in the form of the paper

n = 7; P = rand(n,n); P[1:n+1:end] .= 0;
D = zeros(n,n); D[1:n+1:end] .= 1 ./sum(P,dims=2); P = cumsum(D*P,dims=2); # matrix of cumulative probabilities
π = zeros(n); # final distribution

θ = rand(n) # vector of rate of jumps
i = 2; # init state
T = 1.5; # final time
t = -log(1-rand())/θ[i] # first jump

while t < T
    i = argmax(rand().<=P[i,:]) # pick new state
    t += -log(1-rand())/θ[i] # generate random time
end
# the result is i
π[i] += 1 # I want to differentiate this w.r.t θ hihi, so it is an nxn matrix
AndresCenteno commented 9 months ago

Hello this is my attempt to do it, kind of long and doesn't work, it is a MWE as minimal as I could


using LinearAlgebra, StochasticAD, Distributions

# RATE_VEC: diagonal of intensity matrix, parametrize the soujourn times of each node in the Continuous time MC
# Q: intensity matrix
# T: time of simulation forward
# END_VEC: each node has a value, when simulation arrives at time T at random node i, we are to see the value END_VEC[i], we are interested in the expected value of that random variable
# INIT_STATE: X_0 = init_state

# this is just the deterministic counterpart with finite differences
function DET_exp_diffing(RATE_VEC,Q,T,END_VEC,INIT_STATE)
    # Q is normalized matrix so that diagonal is -ones(n,1)
    N = size(Q,1)
    FD_DERIVATIVE = zeros(N)
    SOL = (exp(diagm(RATE_VEC)*Q*T)*END_VEC)[INIT_STATE]
    for i=1:N
        PERT_RATE_VEC = copy(RATE_VEC)
        PERT_RATE_VEC[i] += sqrt(eps(Float64)) # that means perturbed rate vector
        PERT_SOL = (exp(diagm(PERT_RATE_VEC)*Q*T)*END_VEC)[INIT_STATE]
        FD_DERIVATIVE[i] = (PERT_SOL - SOL)/sqrt(eps(Float64))
    end
    return FD_DERIVATIVE
end

# this is monte-carlo of paths, creates OUTER_SIMS random paths in space
function MC_exp_diffing_OUTER(RATE_VEC,P,T,END_VEC,INIT_STATE,OUTER_SIMS,INNER_SIMS)
    return mean(
        [MC_exp_diffing_INNER(RATE_VEC,P,T,END_VEC,INIT_STATE,INNER_SIMS)
        for i=1:OUTER_SIMS
        ])
end

# then for each path we generate the soujourn times INNER_SIMS times
function MC_exp_diffing_INNER(RATE_VEC,P,T,END_VEC,INIT_STATE,INNER_SIMS)
    # P es la matriz de transicion dada por Q
    # primero tienes que crear una STATE_CHAIN
    NUM_STATES = 30
    STATE_CHAIN = create_STATE_CHAIN(P,INIT_STATE,NUM_STATES)
    MC_DERIVATIVE = mean(
        [derivative_estimate(RATE_VEC->rand_walk(STATE_CHAIN,RATE_VEC,END_VEC,T), RATE_VEC)
        for i=1:INNER_SIMS
        ])
    return MC_DERIVATIVE
end

function create_STATE_CHAIN(P,INIT_STATE,NUM_STATES)
    STATE_CHAIN = zeros(Int64,NUM_STATES)
    STATE_CHAIN[1] = INIT_STATE
    for STATE=2:NUM_STATES
        STATE_CHAIN[STATE] = argmax(rand().<=P[STATE_CHAIN[STATE-1],:])
    end
    return STATE_CHAIN
end

function rand_walk(STATE_CHAIN,RATE_VEC,END_VEC,T)
    p = 1-exp(-T*RATE_VEC[STATE_CHAIN[1]]) # probability of jump
    B = rand(Bernoulli(p))+1 # 1 is no jump, 2 is jump
    # rng for exponential conditioned to jump before T
    JUMP = -log(1-rand()*(1-exp(-RATE_VEC[STATE_CHAIN[1]]*T)))/RATE_VEC[STATE_CHAIN[1]]
    # following vector should be fully computed if there has been a jump
    aux_bool = copy(B.value) # does not support this, need to somehow copy
    if aux_bool == 2
        aux = [END_VEC[STATE_CHAIN[1]]; rand_walk(STATE_CHAIN[2:end],RATE_VEC,END_VEC,T-JUMP)]
    else
        aux = [END_VEC[STATE_CHAIN[1]]; false]
    end
    return aux[B]
end

function create_random(n)
    Q = rand(n,n)
    θ = rand(n)
    Q[1:n+1:end] .= 0
    P = copy(Q)
    for i=1:n
        Q[i,:] /= sum(Q[i,:])
        Q[i,i] = -sum(Q[i,:])
        P[i,:] = cumsum(P[i,:]./sum(P[i,:]))
    end
    return P, Q, θ
end

n = 11
P, Q, RATE_VEC = create_random(n)
END_VEC = randn(n)

DET_DER = DET_exp_diffing(RATE_VEC,Q,1,END_VEC,2)
MC_DER = MC_exp_diffing_OUTER(RATE_VEC,P,1,END_VEC,2,1000000,100)

@show DET_DER
@show MC_DER
@show maximum(abs.(DET_DER-MC_DER))
gaurav-arya commented 8 months ago

Hey! First, would you mind reporting the standard error of your Monte Carlo measurements along with the means? (This should equal to the standard deviation of the samples divided by the square-root of the number of samples, which looks to be INNER_SIMS * OUTER_SIMS in your code.) This can help us figure out if the issue is due to high variance or bias in the estimate.

gaurav-arya commented 8 months ago
function rand_walk(STATE_CHAIN,RATE_VEC,END_VEC,T)
    p = 1-exp(-T*RATE_VEC[STATE_CHAIN[1]]) # probability of jump
    B = rand(Bernoulli(p))+1 # 1 is no jump, 2 is jump
    # rng for exponential conditioned to jump before T
    JUMP = -log(1-rand()*(1-exp(-RATE_VEC[STATE_CHAIN[1]]*T)))/RATE_VEC[STATE_CHAIN[1]]
    # following vector should be fully computed if there has been a jump
    aux_bool = copy(B.value) # does not support this, need to somehow copy
    if aux_bool == 2
        aux = [END_VEC[STATE_CHAIN[1]]; rand_walk(STATE_CHAIN[2:end],RATE_VEC,END_VEC,T-JUMP)]
    else
        aux = [END_VEC[STATE_CHAIN[1]]; false]
    end
    return aux[B]
end

Directly accessing the value property of the stochastic triple would indeed introduce bias. Would you mind providing the most natural to write version of this function without worrying about being StochasticAD compatible, so I can understand what you are trying to do here? I can then help you write it in a StochasticAD compatible way.

AndresCenteno commented 7 months ago

This is a MWE of what I want to do, without trying to write it StochasticAD compatible, in this random seed you can see that the estimator is likely unbiased, sorry for answering so late, I was trying to do SPA and Malliavin weights unsuccesfully (ㆆ _ ㆆ), if you were to solve this my mental health would go up 9000 points probably EDIT: I actually did it the next day with Malliavin weights just needed to wake up, still would be interesting to know how to do it with SPA, I guess Automatic Differentiation should be kind of similar

using Random, LinearAlgebra, Distributions
Random.seed!(2024)
# number of states
n = 10
# infinitesimal generator
Q = rand(n,n); Q[1:n+1:end] .= 0; for i=1:n; Q[i,:] ./= sum(Q[i,:]); Q; end; Q[1:n+1:end] .= -1
# embedded Markov chain
P = copy(Q); P[1:n+1:end] .= 0; P = cumsum(P,dims=2)
# rewards
D = -rand(n)
# rates
θ = rand(n)
# final reward
u0 = randn(n)
# initial state
init_state = rand(1:n)
# time
T = 0.15

# deterministic solution
solution = (exp(diagm(θ)*(diagm(D)+Q)*T)*u0)[init_state] # 1.0582327608691373

# stochastic solution
nsims = Int(1e8)
samples = zeros(nsims)

for sim=1:nsims
    i = copy(init_state); t = 0 # reset simulation
    τ = rand(Exponential(1/θ[i])); t += τ
    η = exp(θ[i]*D[i]*τ)
    while t < T
        i = argmax(P[i,:].>rand())
        τ = rand(Exponential(1/θ[i])); t += τ
        η *= exp(θ[i]*D[i]*τ)
    end
    η /= exp(θ[i]*D[i]*(t-T)) # subtract reward that was cutted short by end of simulation
    samples[sim] = η*u0[i]
end

sample_mean = mean(samples) # 1.0582814906302531
sample_sem = std(samples)/sqrt(nsims) # 4.6125383852028576e-5
println(abs(sample_mean - solution) < 1.96*sample_sem ? "we are so back" : "I should quit the phd") # we are so back in Random.seed!(2024)

# I would like to compute this, of course use other scheme for more precision haha
dsoldθ = zeros(n)
for i=1:n
    θpert = copy(θ); θpert[i] += sqrt(eps(Float64))
    solpert = (exp(diagm(θpert)*(diagm(D)+Q)*T)*u0)[init_state]
    dsoldθ[i] = (solpert-solution)/sqrt(eps(Float64))
end
gaurav-arya commented 5 months ago

The reason StochasticAD fails is because the t contains a dual component which is dropped in the t < T comparison. (See the fifth bullet point of https://gaurav-arya.github.io/StochasticAD.jl/stable/limitations.html)

One simple way of working around this is to supply a rate bound for the exponential distribution that is independent of the current state i, and then possibly "do nothing" to account for this rate being larger than the true rate. I've implemented the solution for you in the below gist!

https://gist.github.com/gaurav-arya/e98dde330be1902c3cbb4bbecd8b16b5