prmiles / pymcmcstat

Python implementation of MATLAB toolbox "mcmcstat"
https://github.com/prmiles/pymcmcstat/wiki
MIT License
70 stars 10 forks source link

setting burnintime results in error when run with dram method #95

Open mikepab opened 3 years ago

mikepab commented 3 years ago

Describe the bug Providing a burnintime parameter in mcstat.simulation_options.define_simulation_options() results in an error when also using the DRAM method. Specifically this occurs because DelayedRejection.py attempts to index RDR, which is left as None during the burn-in period.

To Reproduce I can reproduce this by adding a burnintime argument to one of the examples

import numpy as np
import scipy.optimize
from pymcmcstat.MCMC import MCMC
import matplotlib.pyplot as plt
import pymcmcstat
print(pymcmcstat.__version__)

# Initialize MCMC object
mcstat = MCMC()
# Next, create a data structure for the observations and control
# variables. Typically one could make a structure |data| that
# contains fields |xdata| and |ydata|.
ndp = 7
x = np.array([28, 55, 83, 110, 138, 225, 375])  # (mg / L COD)
x = x.reshape(ndp, 1) # enforce column vector
y = np.array([0.053, 0.060, 0.112, 0.105, 0.099, 0.122, 0.125]) # (1 / h)
y = y.reshape(ndp, 1) # enforce column vector
# data structure
mcstat.data.add_data_set(x,y)

def modelfun(x, theta):
    return theta[0]*x/(theta[1] + x)

def ssfun(theta,data):
    res = data.ydata[0] - modelfun(data.xdata[0], theta)
    return (res ** 2).sum(axis=0)

def residuals(p, x, y):
    return y - modelfun(x, p)

theta0, ssmin = scipy.optimize.leastsq(
    residuals,
    x0=[0.15, 100],
    args=(mcstat.data.xdata[0].reshape(ndp,),
          mcstat.data.ydata[0].reshape(ndp,)))
n = mcstat.data.n[0] # number of data points in model
p = len(theta0); # number of model parameters (dof)
ssmin = ssfun(theta0, mcstat.data) # calculate the sum-of-squares error
mse = ssmin/(n-p) # estimate for the error variance

J = np.array([[x/(theta0[1]+x)], [-theta0[0]*x/(theta0[1]+x)**2]])
J = J.transpose()
J = J.reshape(n,p)
tcov = np.linalg.inv(np.dot(J.transpose(), J))*mse
print('tcov = {}'.format(tcov))

mcstat.parameters.add_model_parameter(
    name='$\mu_{max}$',
    theta0=theta0[0],
    minimum=0)
mcstat.parameters.add_model_parameter(
    name='$K_x$',
    theta0=theta0[1],
    minimum=0)

mcstat.simulation_options.define_simulation_options(
    nsimu=5.0e3,
    updatesigma=1,
    qcov=tcov,
    burnintime=1e4)
mcstat.model_settings.define_model_settings(
    sos_function=ssfun,
    sigma2=0.01**2)

mcstat.run_simulation()

Expected behavior In the MATLAB implementation of mcmcstat, it appears that if no RDR is set (as in the burnintime), the DR strategy is used instead. I'd expect either the same approach to be used in pymcmcstat (adding a RDR is None check followed by the scaling approach pasted below) or for an error to be raised informing the user that the burnintime parameter cannot be set with DRAM.

Relevant code snippet (lines 357-374 from mcmcrun.m in mcmcstat.

if dodram
  RDR = getpar(options,'RDR',[]); % RDR qiven in ooptions
  if ~isempty(RDR)
    for i=1:Ntry
      invR{i} = RDR{i}\eye(npar);
    end
    R = RDR{1};
  else
    % DR strategy: just scale R's down by DR_scale
    RDR{1} = R;
    invR{1} = R\eye(npar);
    for i=2:Ntry
      RDR{i}  = RDR{i-1}./DR_scale(min(i-1,length(DR_scale)));
      invR{i} = RDR{i}\eye(npar);
    end
  end
  iacce = zeros(1,Ntry);
end