ConnorDonegan / Stan-IAR

Spatial intrinsic autoregressive models in Stan
Other
7 stars 1 forks source link

use function to compute per-component ICARs #1

Closed mitzimorris closed 3 years ago

mitzimorris commented 3 years ago

I recently coded up a version of this model which is very similar to this one - submitting here for your consideration. putting the ICAR normal into a function makes the model easier to read. I'm not sure if these two models encode component membership in exactly the same way, but the computation is the same.

functions {
  /**
   * Return the log probability density of the specified vector of
   * coefficients under the ICAR model with unit variance, where
   * adjacency is determined by the adjacency array and the spatial
   * structure is a disconnected graph which has at least one
   * connected component.  The spatial structure is described by
   * an array of component sizes and a corresponding 2-D array
   * where each row contains the indices of the nodes in that
   * component.  The adjacency array contains two parallel arrays
   * of adjacent element indexes (i.e. edges in the component graph).
   *
   * For example, a series of four coefficients phi[1:4] for a
   * disconnected graph containing 1 singleton would have
   * adjacency array {{1, 2}, {2, 3}}, signaling that coefficient 1
   * is adjacent to coefficient 2, and 2 is adjacent to 3,
   * component size array {3, 1}, and (zero-padded) component members
   * array of arrays { { 1, 2, 3, 0}, {4, 0, 0, 0} }.
   *
   * Each connected component has a soft sum-to-zero constraint.
   * Singleton components don't contribute to the ICAR model.
   *
   * @param phi vector of varying effects
   * @param adjacency parallel arrays of indexes of adjacent elements of phi
   * @param comp_size array of sizes of components in spatial structure graph
   * @param comp_members array of arrays of per-components coefficients.
   *
   * @return ICAR log probability density
   *
   * @reject if the the adjacency matrix does not have two rows
   * @reject if size mismatch between comp_size and comp_members
   * @reject if size mismatch between phi and dimension 2 of comp_members
   */
  real standard_icar_disconnected_lpdf(vector phi,
                       int[ , ] adjacency,
                       int[ ] comp_size,
                       int[ , ] comp_members) {
    if (size(adjacency) != 2)
      reject("require 2 rows for adjacency array;",
             " found rows = ", size(adjacency));
    if (size(comp_size) != size(comp_members))
      reject("require ", size(comp_size), " rows for members matrix;",
             " found rows = ", size(comp_members));
    if (size(phi) != dims(comp_members)[2])
      reject("require ", size(phi), " columns for members matrix;",
             " found columns = ", dims(comp_members)[2]);

    real total = 0;
    for (n in 1:size(comp_size)) {
      if (comp_size[n] > 1)
    total += -0.5 * dot_self(phi[adjacency[1, comp_members[n, 1:comp_size[n]]]] -
                 phi[adjacency[2, comp_members[n, 1:comp_size[n]]]])
      + normal_lpdf(sum(phi[comp_members[n, 1:comp_size[n]]]) | 0, 0.001 * comp_size[n]);
    }
    return total;
  }
}
data {
  // spatial structure
  int<lower = 0> I;  // number of nodes
  int<lower = 0> J;  // number of edges
  int<lower = 1, upper = I> edges[2, J];  // node[1, j] adjacent to node[2, j]

  int<lower=0, upper=I> K;  // number of components in spatial graph
  int<lower=0, upper=K> K_size[K];   // component sizes
  int<lower=0, upper=I> K_members[K, I];  // rows contain per-component areal region indices

  vector<lower=0>[K] tau_sp; // per-component scaling factor, 0 for singletons
}
parameters {
  // spatial effects
  real<lower = 0> sigma_sp;  // scale of spatial effects
  real<lower = 0, upper = 1> rho_sp;  // proportion of spatial effect that's spatially smoothed
  vector[I] theta_sp;  // standardized heterogeneous spatial effects
  vector[I] phi_sp;  // standardized spatially smoothed spatial effects

}
transformed parameters {
  vector[I] gamma;
  // each component has its own spatial smoothing
  for (k in 1:K) {
    if (K_size[k] == 1) {
      gamma[K_members[k,1]] = theta_sp[K_members[k,1]];
    } else {
      gamma[K_members[k, 1:K_size[k]]] = 
        (sqrt(1 - rho_sp) * theta_sp[K_members[k, 1:K_size[k]]]
         + (sqrt(rho_sp) * sqrt(1 / tau_sp[k])
        * phi_sp[K_members[k, 1:K_size[k]]]) * sigma_sp);
    }
  }
}
model {
  // spatial hyperpriors and priors
  sigma_sp ~ normal(0, 1);
  rho_sp ~ beta(0.5, 0.5);
  theta_sp ~ normal(0, 1);
  phi_sp ~ standard_icar_disconnected(edges, K_size, K_members);
}
mitzimorris commented 3 years ago

although the model block is now easy to read, the transformed parameters block is not - it's an exercise in understanding how Stan's multi-index experssions work.

ConnorDonegan commented 3 years ago

I like this a lot

ConnorDonegan commented 3 years ago

the transformed parameter block can be a made into a function as well:

 /**
   * Return the convolved local-global partial-pooling term for
   * the BYM2 model. This typically will be entered into the transformed parameters 
   * block, or else the model block.
   *
   * Provide the global term (vector theta) and the spatial term (vector phi) plus index information,
   *  and return their convolution. In the model block, assign std_normal prior to theta and 
   *  icar_normal prior to phi.
   *
   * @param k number of connected graph components
   * @param k_size component sizes
   * @param k_members rows contain per-component areal region indices, zero-padded 
   * @param rho proportion of variance apportioned to the spatial component
   * @param scale scale of the convolved term
   * @param scale_factor scale factor to put the ICAR prior for the spatial
   *        component on unit scale
   * @param theta global partial-pooling term (non-spatial)
   * @param phi local partial-pooling term (spatial)
   * 
   * @return The BYM2 local-global partial-pooling term
   * 
   */
  vector convolve_bym2(
                                   int k,
                                   int n,
                                   int k_size,
                                   int k_members[,],
                                   real rho,
                                   real scale,
                                   vector scale_factor,
                                   vector theta,
                                   vector phi) {
  vector[n] gamma;
  for (k in 1:K) {
    if (k_size[k] == 1) {
      gamma[k_members[k,1]] = theta[K_members[k,1]];
    } else {
      gamma[K_members[k, 1:K_size[k]]] = 
        (sqrt(1 - rho) * theta[K_members[k, 1:K_size[k]]]
         + (sqrt(rho) * sqrt(1 / tau[k])
        * phi[K_members[k, 1:K_size[k]]]) * sigma);
    }
  }
  return gamma
}

and put into a file icar_functions.stan together with standard_icar_disconnected_lpdf.

Then the model can look really clean like this:

functions {
  # include icar_functions.stan
}
data {
  // spatial structure
  int<lower = 0> I;  // number of nodes
  int<lower = 0> J;  // number of edges
  int<lower = 1, upper = I> edges[2, J];  // node[1, j] adjacent to node[2, j]

  int<lower=0, upper=I> K;  // number of components in spatial graph
  int<lower=0, upper=K> K_size[K];   // component sizes
  int<lower=0, upper=I> K_members[K, I];  // rows contain per-component areal region indices

  vector<lower=0>[K] tau_sp; // per-component scaling factor, 0 for singletons
}
parameters {
  // spatial effects
  real<lower = 0> sigma_sp;  // scale of spatial effects
  real<lower = 0, upper = 1> rho_sp;  // proportion of spatial effect that's spatially smoothed
  vector[I] theta_sp;  // standardized heterogeneous spatial effects
  vector[I] phi_sp;  // standardized spatially smoothed spatial effects

}
transformed parameters {
  vector[I] gamma;
  gamma = convolve_bym2(k, I, k_size, k_members, rho_sp, sigma_sp, tau_sp, theta_sp, phi_sp);
}
model {
  // spatial hyperpriors and priors
  sigma_sp ~ normal(0, 1);
  rho_sp ~ beta(0.5, 0.5);
  theta_sp ~ normal(0, 1);
  phi_sp ~ standard_icar_disconnected(edges, K_size, K_members);
}

I haven't tried running this yet, but this an idea at least.

So if I understand this all correctly, the observations don't need to be shuffled at all before sending the data to Stan, and this will work with any relevant graph structure (i.e., it can be fully connected or have multiple parts), right?

Thanks for sharing! I'm going to experiment with it a little and adjust the R functions to match.

mitzimorris commented 3 years ago

correct, don't need to shuffle observations! also, I have some useful R functions for working with spatial data - very proud of this one - assuming folks want to turn disconnected maps into fully connected ones:

library(spdep)

# connect_regions
# Given an nb object and the names of two areal regions, update the nb
# object so that the two regions are connected.

# The nb object is a list of n integer vectors.  It also has attribute
# region.id which is a character vector with n unique values (like the
# row.names of a data.frame object); n is the number of spatial entities.
# Component i of this list contains the integer identifiers its neighbours
# as a sorted vector with no duplication and values in 1:n;  if i has no
# neighbours, the component is a vector of length 1 with value 0L.
# see:  https://cran.r-project.org/web/packages/spdep/vignettes/nb_igraph.html
#
# param nb: nb object over areal regions
# param name1:  name of region 1.
# param name2:  name of region 1.
# returns: updated nb object
connect_regions <- function(nb, name1, name2) {
    if (name1 == name2) {
        cat("Cannot connect region to itself: ", name1)
        return(nb)
    }
    id1 <- which(attr(nb, "region.id") == name1)
    if (length(id1) == 0) {
        cat("Unknown region: ", name1)
        return(nb)
    }
    id2 <- which(attr(nb, "region.id") == name2)
    if (length(id2) == 0) {
        print("Unknown region: ", name2)
        return(nb);
    }
    if (nb[[id1]][1] == 0)  # singleton
        nb[[id1]] <- c(as.integer(id2))
    else
        nb[[id1]] <- unique(sort(c(nb[[id1]], as.integer(id2))))

    if (nb[[id2]][1] == 0)  # singleton
        nb[[id2]] <- c(as.integer(id1))
    else
        nb[[id2]] <- unique(sort(c(nb[[id2]], as.integer(id1))))
    nb
}

scaling factors require R-INLA package, not available from CRAN -

library(Matrix)
library(INLA);
library(spdep)
library(igraph)

# compute geometric mean of a vector
geometric_mean <- function(x) exp(mean(log(x))) 

# compute scaling factor for adjacency matrix
# accounts for differences in spatial connectivity
scaling_factor <- function(adj_matrix) {
    N = dim(adj_matrix)[1]

    # Create ICAR precision matrix  (diag - adjacency): this is singular
    # function Diagonal creates a square matrix with given diagonal
    Q =  Diagonal(N, rowSums(adj_matrix)) - adj_matrix

    # Add a small jitter to the diagonal for numerical stability (optional but recommended)
    Q_pert = Q + Diagonal(N) * max(diag(Q)) * sqrt(.Machine$double.eps)

    # Function inla.qinv provides efficient way to calculate the elements of the
    # the inverse corresponding to the non-zero elements of Q
    Q_inv = inla.qinv(Q_pert, constr=list(A = matrix(1,1,N),e=0))

    # Compute the geometric mean of the variances, which are on the diagonal of Q.inv
    scaling_factor <- geometric_mean(Matrix::diag(Q_inv)) 
    return(scaling_factor) 
}
ConnorDonegan commented 3 years ago

awesome, I did the same thing for myself a couple weeks back but it was to adjust the connectivity matrix directly rather than nb object. I'll put these R functions into a folder here and re-write the introduction for the README.

The most important part I've stolen from your code for the model (the first post above) is how to calculate the convolution term without adding in the terms that I wanted to zero out. I'd like to assign them really tight priors anyways to zero them out, in case someone wants to extract phi_sp for analysis or to map alongside the theta vector.

mitzimorris commented 3 years ago

many thanks! I like your example alot.

ConnorDonegan commented 3 years ago

I put up a couple examples and put the R code into a single file. This took way longer than expected, so I haven't been able to work with your approach to the indexing problem but I hope to try that out soon. As noted in the README I might just update everything later to make use of that method cause it requires fewer data inputs. Thanks again for sharing! Come back any time