kingaa / pomp

R package for statistical inference using partially observed Markov processes
https://kingaa.github.io/pomp
GNU General Public License v3.0
110 stars 26 forks source link

How to simulate the time-varying parameter model by pomp package? #164

Closed Curley-l closed 2 years ago

Curley-l commented 2 years ago

I refer to the pdf of "INTRODUCTION TO POMP BY EXAMPLE" to simulate a time-varying parameter model with two explanatory variables, but the "simulate“ function doesn't work.

The state equation is:
alpha_t = alpha_t-1 + sigma_1, sigma_1~N(0,1)
beta1_t = beta1_t-1 + sigma_2, sigma_2~N(0,1)
beta2_t = beta2_t-1 + sigma_3, sigma_3~N(0,1)

The measurement equation is: y = alpha_t +x1_tbeta1_t + x2_t beta2_t + tau_t, tau_t ~N(0,1)

a) simulate the explanatory variables x1 and x2 and I(1).

n <- 2
T <- 100 
alpha <- matrix(0,T,1)
beta <- matrix(0,T,n)
x <- matrix(0,T,n)
x[1,] <- c(0.2, 0.3) # The initial values of the X0

b) generate x: xt ~ AR(1)

nn=1
for(nn in 1:n){
  for(t in 2:T){
    x[t,nn] <- x[t-1,nn]*x[1,nn] + sqrt(x[1,nn])*rnorm(1, mean = 0, sd = 1)
  }
}
x <- cbind(rep(1, T), x)

c) generate beta:the information content of the indicators for yt goes down over time give normalized weights use the rprocess to simulate the state equation:

tvp.rprocess <- function(a,b,c, times, params, delta.t, ...) {
  coeff0 <- rbind(a,b,c)
  nco <- nrow(coeff0)
  nsims <- ncol(coeff0)
  ntimes <- length(times)
  G <- array(diag(nco), dim = c(nco, nco, nsims)) 
  H <- array(diag(nco), dim = c(nco, nco, nsims))
  # sigma[1, 2, ] <- 0
  coeff <- array(0, dim = c(nco, nsims, ntimes))
  rownames(coeff) <- rownames(coeff0)
  coeff[, , 1] <- coeff0
  for (k in 1:nsims) {
    for (j in 2:ntimes) {
      coeff[, k, j] <- G[, , k] %*% coeff[, k, j - delta.t] + H[, , k] %*% rnorm(3, 0, params["sigma"])
    }
  }
  coeff
}

tvp.dprocess <- function(coeff, times, params, log = FALSE, ...) {
  nsims <- ncol(coeff)
  ntimes <- length(times)
  G <- array(diag(nco), dim = c(nco, nco, nsims))
  H <- array(diag(nco), dim = c(nco, nco, nsims))
  d <- array(0, dim = c(nsims, ntimes - 1)) # nsims*(ntimes-1)
  for (k in 1:nsims) {
    for (j in 2:ntimes) {
      z <- forwardsolve(H[, , k], coeff[, k, j] - G[, , k] %*% coeff[, k, j - 1])
      log=0
      if (log) {
        d[k, j - 1] <- sum(dnorm(z, mean = 0, sd = 1, log = TRUE))
      } else { 
        d[k, j - 1] <- exp(sum(dnorm(z, mean = 0, sd = 1, log = TRUE))) 
      }                         
    }
  }
  d
}

f.rmeasure <- function(x, coeff, times, params,...) {
  nco <- dim(coeff)[1]
  nsims <- dim(coeff)[2]
  ntimes <- dim(coeff)[3]
  xx <- array(t(x), dim = c(nco, nsims, ntimes))
  y <- array(0, dim = c(1, 1, ntimes))
  x_new <- array(0, dim = c(1, 1, ntimes))
  rownames(y) <- c("y"); rownames(x_new) <- c("x")
  for (k in 1:nsims) {
    for (j in 1:ntimes) {
      x_new[, k, j] <- xx[, k, j] %*% coeff[ , , 1]
      y[, k, j] <- x_new[, k, j] + rnorm(1, mean = 0, sd = params["tau"])
    }
  }
  x_new
  y
}

f.dmeasure <- function(y, x_new, coeff, times, params, log = FALSE, ...) {
  d <- dnorm(x = y["y", ,], mean = x_new["x", , ], sd = params["tau"], log = TRUE)
  log=0
  if (log) {
    sum(d, na.rm = T)
  } else {
    exp(sum(d, na.rm = T))
  }
}

pomp.f.init <- function (t0, params, ...) {
  setNames(params["a_0"], "a")
  setNames(params["b_0"], "b")
  setNames(params["c_0"], "c")
}

pomp.f <- pomp(data = data.frame(time = 1:100, Y = NA), times = "time", t0 = 0,
               rprocess = discrete_time(step.fun = tvp.rprocess, delta.t = 1),
               rmeasure = f.rmeasure,
               dmeasure = f.dmeasure,
               rinit = pomp.f.init,
               paramnames = c("sigma", "tau"),
               params = c(sigma = 1, tau = 1, a_0 = 0.01, b_0 = 0.01, c_0 = 0.01))

but the results show that Error: in 'simulate': argument "params" is missing, with no default" I find it's because the difference version of pomp package.

But according to the newest version of pomp package, how can I set the TVP model? it is to use the ”covar“ to represent the explanatory variables "x"?

kingaa commented 2 years ago

See FAQ 1.4.

The interwebs are full of old and outdated information. A more reliable source for information regarding pomp is the pomp website. In particular, the documentation page.

I am not sure what you mean by "the TVP model", but one can certainly incorporate time-varying covariates into a model using the covar argument. This is documented here and here. An example is given here.

kingaa commented 2 years ago

@justly129: Might you be looking for something as simple as the following?

library(pomp)

simulate(
  t0=0, times=1:100,
  rprocess=discrete_time(
    Csnippet("
     alpha = rnorm(alpha,1);
     beta1 = rnorm(beta1,1);
     beta2 = rnorm(beta2,1);"),
    delta.t=1
  ),
  rinit=Csnippet("alpha = beta1 = beta2 = 0;"),
  rmeasure=Csnippet("
     y = rnorm(alpha+x1*beta1+x2*beta2,1);"
  ),
  statenames=c("alpha","beta1","beta2"),
  obsnames="y",
  covar=covariate_table(
    t=seq(0,100),
    x1=0.1*t+rnorm(n=101),
    x2=-0.1*t+rnorm(n=101),
    times="t"
  )
) -> tvp

tvp |> plot()

library(tidyverse)

tvp |>
  as.data.frame() |>
  gather(var,val,-time) |>
  ggplot(aes(x=time,y=val))+
  geom_line()+
  facet_grid(var~.,scales="free_y")+
  theme_bw()

I don't have any data or covariates, so I simply fabricated them for illustration purposes.

Curley-l commented 2 years ago

Hi there,

I am new to the pomp package and to C, potentially deadly combinations. And I would like to ask a question related to the covariables in the pomp.

The following is the model formula I would like to build through the pomp package: image

The code of the model is here:

library(pomp)
library(tidyverse)
library(dlm)
library(midasr)

nseed <- 20201027 ;
set.seed(nseed) ; 

## (1) data generate process
## a) set x matrix: x_t=c*x_t-1 + sqrt(c)*rnorm(1)
n <- 2
T <- 300
m <- 3
x <- matrix(NA, 3*T+9, n)
x[1,] <- rep(0.1,n)
co.x <- c(0.2, 0.3) 

nn=1
for(nn in 1:n){
  for(t in 2:nrow(x)){
    x[t,nn] <- co.x[nn]*x[t-1,nn] + sqrt(co.x[nn])*rnorm(1)
  }
}

## mixed frequency 
l <- 4
X <- matrix(NA, nrow(x)/m, n*l)
for (nn in 1:n){
  X[,(l*(nn-1)+1):(l*nn)] <- fmls(x[,nn], (l-1), m)
}

rebun <- nrow(X)-T-1
X <- as.matrix(X[-1:-rebun,])

## b) simulate the state value and meansurement value
tvp <- simulate(t0=0, times=1:T,
  params=c(sigma1=1, sigma2=1, sigma3=1, sigma4=1, tau=1, l=5),
  paramnames=c("sigma1","sigma2","sigma3","sigma4","tau","l"),
  rprocess=discrete_time(Csnippet("
    alpha = rnorm(alpha,sigma1);
    beta1 = rnorm(beta1,sigma2);
    theta1 = rnorm(theta1,sigma3);
    theta2 = rnorm(theta2,sigma4);"),delta.t=1),
  rinit=Csnippet("alpha = beta1 = 0.0001; theta1=1; theta2=-0.5;"),
  rmeasure=Csnippet("
    int lags = (int) l;
    double theta_1, theta_2, x_1[5];

    theta_1=theta1;
    theta_2=theta2;

    int i, j, k;

    // calculate the weight of the covariate x1 at lag 0: lags:
    double sum_weights=0;
    double weight[5];
    double weightall[5];

    for (i = 1; i < lags; i++)
      weight[i] = exp(theta_1 * pow(i,1) + theta_2 * pow(i,2));
    sum_weights += weight[i];

    // the sum of weights and weights*x1:
    double results=0;
    //doubel *x_1=&x1;
    for (j = 1; j < lags; j++)
      weightall[j] = weight[j]/sum_weights; 
    results += weightall[j] * x_1[j];

    // the measurement equation:
    y = alpha + beta1 * results;"
  ),
  statenames=c("alpha","beta1","theta1", "theta2"),
  obsnames="y",
  covar=covariate_table(t=seq(0,T), x1 = as.matrix(X[,1:l]), times="t")
)

But I found that I failed to successfully input the value of x1 contained in the “covar” into the Csnippet code in “rmeasure", so the result of the simulation is NAN. So I would like to ask how to introduce x1 contained in "covar" into "measure" and multiply it with weightall to get the result?

Looking forward to your reply! Ying

Curley-l commented 2 years ago

Refer to the question from https://github.com/kingaa/pomp/issues/102 I get the simulated data from the following code, by adding the " #define covar(K) (covars[covindex[K]])" to introduce the covariable into ”rmeasure“ Csnippet. I am not sure if this is correct~~


tvp <- simulate(
  t0=0, times=1:T,
  params=c(sigma1=1, sigma2=1, sigma3=1, sigma4=1, tau=1, l=4),
  paramnames=c("sigma1","sigma2","sigma3","sigma4","tau","l"),
  rprocess=discrete_time(Csnippet("
    alpha = rnorm(alpha,sigma1);
    beta1 = rnorm(beta1,sigma2);
    theta1 = rnorm(theta1,sigma3);
    theta2 = rnorm(theta2,sigma4);"),delta.t=1),
  rinit=Csnippet("alpha = beta1 = 0.0001; theta1=1; theta2=-0.5;"),
  rmeasure=Csnippet("
    int lags = (int) l + 1;
    double theta_1, theta_2;//, x_1[5];
    #define covar(K) (__covars[__covindex[K]])
    //double x_1=covar;

    theta_1=theta1;
    theta_2=theta2;
    //lags=l;

    int i, j, k;

    // calculate the weight of the covariate x1 at lag 0: lags:
    double sum_weights=0;
    double weight[5];
    double weightall[5];

    for (i = 1; i < lags; i++)
        weight[i] = exp(theta_1 * pow(i,1) + theta_2 * pow(i,2));
    sum_weights += weight[i];

    // the sum of weights and weights*x1:
    double results=0;
    for (j = 1; j < lags; j++)
      weightall[j] = weight[j]/sum_weights; 
    results+=weightall[j] * covar(j);

    // the measurement equation:
    y = alpha + beta1 * results;"
  ),
  statenames=c("alpha","beta1","theta1", "theta2"),
  obsnames="y",
  covar=covariate_table(t=seq(0,T), x1 = as.matrix(X[,1:4]), times="t")
)

But I still have a question that, the problem in this link (https://github.com/kingaa/pomp/issues/102) has K variables in the "cover" function. I would like to ask if I also want to introduce K covariables, and each covariable contains four elements, how to introduce them in Csnippet?
Means suppose I have K variables, and each variable is a 1*4 dimensional vector.

kingaa commented 2 years ago

I'm afraid I don't quite understand your question. In the formulae you posted above, I see an x_t, but I do not see x1. What is x1?

When you use covar = covariate_table(...), the covariates defined there must be named and will be available, by name, in all the C snippets (except those for the rprior, dprior, and partrans basic components). Best practice is to specify covariate_table using a data frame. In the codes above, when you give

covariate_table(t=seq(0,T), x1 = as.matrix(X[,1:4]), times="t")

The names of the resulting covariates will be taken from the column-names of X.

kingaa commented 2 years ago

Also, I note that your code might not be doing what you think it is: consider the lines:

    for (i = 1; i < lags; i++)
        weight[i] = exp(theta_1 * pow(i,1) + theta_2 * pow(i,2));
    sum_weights += weight[i];

and

    for (j = 1; j < lags; j++)
      weightall[j] = weight[j]/sum_weights; 
    results+=weightall[j] * covar(j);

Are you missing brackets ({}) in your for-loops?

Curley-l commented 2 years ago

Thank you for your prompt reply! I use x1 in the code to represent xt in the formula.

covariate_table(t=seq(0,T), x1 = as.matrix(X[,1:4]), times="t")

The names of the resulting covariates will be taken from the column-names of X."

This means that if I name the covariate x1, I don’t need to define the covar when I write code using Csnippet in rmeasure? Can it be quoted directly?

kingaa commented 2 years ago

But it looks like x_t is a vector, correct?

kingaa commented 2 years ago

The code may run, but does it run correctly? In the third line of

    for (i = 1; i < lags; i++)
      weight[i] = exp(theta_1 * pow(i,1) + theta_2 * pow(i,2));
    sum_weights += weight[i];

i is always equal to lags. In C, whitespace is not syntactic.

Curley-l commented 2 years ago

Yep, the x_t is a vector. For example, x_t is a 1*4-dimensional vector, I need to first calculate the weight of each element in x_t through the following equation: image

And then multiply the corresponding weight by x_t to get a 1*1 dimensional result. Finally, at every moment t, a result corresponds to a beta.

So I write the code as follows:

  rmeasure=Csnippet("
    int lags = (int) l + 1;
    double theta_1, theta_2;//, x_1[5];
    #define covar(K) (__covars[__covindex[K]])
    //double x_1=covar;

    theta_1=theta1;
    theta_2=theta2;
    //lags=l;

    int i, j, k;

    // calculate the weight of the covariate x1 at lag 0: lags:
    double sum_weights=0;
    double weight[5];
    double weightall[5];

    for (i = 1; i < lags; i++)
        weight[i] = exp(theta_1 * pow(i,1) + theta_2 * pow(i,2));
    sum_weights += weight[i];

    // the sum of weights and weights*x1:
    double results=0;
    for (j = 1; j < lags; j++)
      weightall[j] = weight[j]/sum_weights; 
    results+=weightall[j] * covar(j);

    // the measurement equation:
    y = alpha + beta1 * results;"
  ),
  statenames=c("alpha","beta1","theta1", "theta2"),
  obsnames="y",
  covar=covariate_table(t=seq(0,T), x1 = as.matrix(X[,1:4]), times="t"))

But I am not sure whether I have correctly introduced x1 in covar into rmeasure.

kingaa commented 2 years ago

Yep, the x_t is a vector. For example, x_t is a 1*4-dimensional vector, I need to first calculate the weight of each element in x_t through the following equation: image

And then multiply the corresponding weight by x_t to get a 1*1 dimensional result. Finally, at every moment t, a result corresponds to a beta.

We can make this easy, or we can make it general. Since you have a 4-vector to multiply, I suggest just writing out the sum, which will have four terms. In each term, you can refer to the component of your x-vector by name.

However, I am still confused. The thing you call a "weight for each element in x_t" is strange. In particular, there is no dependence on j in the summand. My advice to you is to get the mathematics correct first, and translate it into pomp afterward.

Curley-l commented 2 years ago

Actually the j in the code is equal to i. I just use a different notation.

And the theta1 and theta2 in the follow formula are subject to a random process, so I need to write it into the rmeasure function. I don't know wether I understand correctly.

image

kingaa commented 2 years ago

I see. If your intention is to communicate, then it seems you would want to use a notation that others can understand. Not only does the ratio in Eq. 1 not make any sense, it is not at all clear that x is a vector. Again, I suggest you get the mathematics right before you go any farther.

Curley-l commented 2 years ago

I see. If your intention is to communicate, then it seems you would want to use a notation that others can understand. Not only does the ratio in Eq. 1 not make any sense, it is not at all clear that x is a vector. Again, I suggest you get the mathematics right before you go any farther.

Thank you for your time and reply. I will get the mathematics right and then translate it into pomp correctly.