paul-buerkner / brms

brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
https://paul-buerkner.github.io/brms/
GNU General Public License v2.0

Spatially varying coefficients #737

Open JeremyGelb opened 5 years ago

JeremyGelb commented 5 years ago

Dear Paul, brms offers many methods for dealing with spatially autocorrelated data. A new path to investigate could be spatial variation in the relationships between dependent variables and predictors. An easy starting point would probably be to allow coefficients to vary spatially, i.e. to follow a spatially correlated distribution. An implementation in Stan is proposed here (bottom of the page). Because it can be seen as a smoother, the syntax in brms could be:

Y ~ pred1 + sv(pred2, w=W)

where pred2 has a spatially varying coefficient and W is a neighborhood matrix.

A comparison can be made with classical random-effects models that include a random intercept and random slopes. In that view, the classical CAR model is a model with a spatially varying intercept, which could be complemented by spatially varying coefficients. These coefficients would follow a multivariate normal distribution centered on 0. That way, it would be possible to separate the global mean effect of a predictor (fixed effect) from its spatial variation (random effect). In that case, another syntax based on the random-effect specification in brms could be used. A link might be made with issue #708, which proposes including autocorrelation structures in the formula.

y ~ pred1 + pred2 + car(~1 | OID, w=W) # spatially varying intercept only
y ~ pred1 + pred2 + car(~1 + pred2 | OID, w=W) # spatially varying intercept and slope
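
Written out, the random-slope variant might look as follows. This is only a sketch: the notation, the choice of a proper CAR prior, and the independence of the two spatial effects are assumptions, not something taken from the linked sources. Areas are indexed by j, observation i lies in area j[i], and D is the diagonal matrix of row sums of W:

```latex
\begin{aligned}
y_i &= \beta_0 + \beta_1\,\mathrm{pred1}_i
      + \bigl(\beta_2 + b_{j[i]}\bigr)\,\mathrm{pred2}_i
      + a_{j[i]} + \varepsilon_i,
      \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \\
a &\sim \mathcal{N}\!\bigl(0,\ \tau_a^{-1}\,(D - \alpha_a W)^{-1}\bigr)
      \quad \text{(spatially varying intercept)}, \\
b &\sim \mathcal{N}\!\bigl(0,\ \tau_b^{-1}\,(D - \alpha_b W)^{-1}\bigr)
      \quad \text{(spatially varying slope)}.
\end{aligned}
```

Here \beta_2 is the global (fixed) effect of pred2 and b_j is its zero-centered spatial deviation in area j, which is the separation described above.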

I found this paper about spatially varying coefficients interesting. I hope this feature sounds interesting! Thank you again for the amazing work you do with brms!

paul-buerkner commented 5 years ago

This is an interesting proposal but I am a bit puzzled about why they are calling it a CAR model. At least, when I am following the link to the Stan implementation, it seems very much like an SAR model and not like a CAR model but I might be mistaken.

The paper does in fact seem to talk about some CAR-like models, but it doesn't seem to discuss the math behind the model in detail (or I didn't read closely enough, which is very well possible). So before I consider anything for implementation, I need to understand exactly what the model looks like, and this is not yet clear to me.

@mitzimorris do you have any experience or thoughts on these kinds of spatially varying coefficients models?

mitzimorris commented 5 years ago

I think Paul is correct. The SAR/CAR distinctions are in Cressie 1993; I will take a look at it and summarize.

JeremyGelb commented 5 years ago

Well, sorry for the confusion between SAR and CAR. It is true that the Stan implementation I linked is based on the SAR model, which comes from the econometric literature. I have the feeling that in Bayesian modeling, spatial structures are traditionally added in a hierarchical way (as in the BYM model) and seen as spatially structured random effects. I find this approach more intuitive and more interpretable.

Two good readings: "Quantifying geographic variations in associations between alcohol distribution and violence: a comparison of geographically weighted regression and spatially varying coefficient models" (section 4.2) and "Space varying coefficient models for small area data" (section 3). Again, thank you for your consideration!

JeremyGelb commented 5 years ago

A small addition: I just found that spaMM (an R package) implements "autocorrelated random-coefficient" models with a syntax close in spirit to lmer (and thus brms). It is a frequentist approach, but the documentation is good.

JeremyGelb commented 4 years ago

Hello!

I don't know whether this topic is dead or not, but I would like to provide two supplementary resources:

mitzimorris commented 4 years ago

hi Jeremy,

this is extremely relevant! struggling to help epidemiologists who need spatio-temporal analysis of covid prevalence. will look at both papers right now - many thanks!

jsocolar commented 1 year ago

Just want to note that spatially varying coefficient models are already possible in brms via nonlinear formulae. Here's an example using a Gaussian Process (because that's more intuitive for me to simulate from), but I'm pretty sure we can implement autoregressive coefficient priors in the same way.

library(brms)
library(dplyr)

n <- 100 # sample size
lscale <- 0.5 # length-scale of the Gaussian (squared exponential) kernel
sigma_gp <- 1 # marginal sd of the Gaussian kernel
sigma_resid <- .5 # residual sd
intercept <- 1 # model intercept

# covariate data for the model
gp_data <- data.frame(
  x = rnorm(n), 
  y = rnorm(n),
  covariate = rnorm(n)
  )

# get distance matrix
dist.mat <- stats::dist(gp_data[,c("x", "y")]) |> as.matrix()

# get covariance matrix
cov.mat <- sigma_gp^2 * exp(- (dist.mat^2)/(2*lscale^2))
cov.mat[1:5, 1:5]

# simulate response data
gp_data <- gp_data |>
  mutate(
    coef = mgcv::rmvn(1, rep(0, n), cov.mat),
    lp = intercept + coef * covariate,
    resp = rnorm(n, lp, sigma_resid)
  )
head(gp_data)

# fit model
svc_mod <- brm(
  bf(
    resp ~ int + g * covariate,
    int ~ 1,
    g ~ 0 + gp(x, y, scale = FALSE),
    nl = TRUE
  ),
  data = gp_data,
  cores = 4,
  backend = "cmdstanr"
)

summary(svc_mod)
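
To check how well the fitted model recovers the simulated coefficients, something like the following sketch could be used. `svc_recovery` is a hypothetical helper, not part of brms. It assumes a summary matrix with columns `Estimate`, `Q2.5` and `Q97.5`, which is what `fitted()` on a brmsfit object returns by default. The usage part only runs if `svc_mod` and `gp_data` from the example above exist.

```r
# Hypothetical helper (not part of brms): compares posterior summaries of a
# spatially varying coefficient against the known simulated values.
# `est` is a matrix with columns "Estimate", "Q2.5" and "Q97.5",
# one row per observation; `truth` holds the true coefficients.
svc_recovery <- function(est, truth) {
  stopifnot(nrow(est) == length(truth))
  # TRUE where the 95% interval contains the true coefficient
  covered <- truth >= est[, "Q2.5"] & truth <= est[, "Q97.5"]
  list(
    correlation = cor(est[, "Estimate"], truth),
    coverage = mean(covered),
    rmse = sqrt(mean((est[, "Estimate"] - truth)^2))
  )
}

# Usage with the objects from the example above (skipped if they are absent)
if (exists("svc_mod") && exists("gp_data")) {
  # per-observation posterior summary of the non-linear parameter g
  g_hat <- fitted(svc_mod, nlpar = "g")
  print(svc_recovery(g_hat, gp_data$coef))
}
```

A high correlation and coverage near the nominal 95% would suggest that the GP term on `g` is picking up the simulated spatial variation.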

jsocolar commented 1 year ago

Sorry, I might have spoken too soon about the extensibility to autoregressive structures. For me, the following

# make a symmetric adjacency matrix
M <- matrix(data = sample(c(0,1), n^2, replace = TRUE), nrow = n)
M <- M * t(M)
diag(M) <- 0

svc_mod_car <- brm(
  bf(
    resp ~ int + g * covariate,
    int ~ 1,
    g ~ 0 + car(M = M),
    nl = TRUE
  ),
  data = gp_data,
  data2 = list(M = M),
  cores = 4,
  backend = "cmdstanr"
)

yields the somewhat inscrutable error

Error in data_ac(x, data, data2 = data2, basis = basis$ac) : 
  no slot of name "i" for this object of class "dtrMatrix"
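
Independently of the fitting problem, simulating coefficients from an autoregressive prior is possible in base R. The sketch below assumes a proper CAR prior with precision `tau * (D - alpha * M)`. The values of `alpha` and `tau` are arbitrary illustrative choices and are not taken from this thread.

```r
set.seed(1)
n <- 100

# symmetric 0/1 adjacency matrix, built as in the example above
M <- matrix(sample(c(0, 1), n^2, replace = TRUE), nrow = n)
M <- M * t(M)
diag(M) <- 0

# Every unit needs at least one neighbour for D - alpha * M to be
# strictly diagonally dominant (and hence positive definite) when 0 < alpha < 1.
stopifnot(isSymmetric(M), all(rowSums(M) > 0))

alpha <- 0.9 # assumed strength of spatial dependence
tau <- 1     # assumed precision scale

# proper CAR precision matrix, with D = diag(number of neighbours)
Q <- tau * (diag(rowSums(M)) - alpha * M)

# Draw car_coef ~ N(0, Q^{-1}): with Q = R'R (upper-triangular Cholesky factor),
# solving R x = z for z ~ N(0, I) gives Cov(x) = Q^{-1}.
R <- chol(Q)
car_coef <- backsolve(R, rnorm(n))

head(car_coef)
```

The resulting `car_coef` could replace `coef` in the simulation of `resp` above, giving data with a known CAR-distributed coefficient to test against.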

paul-buerkner commented 1 year ago

Currently, brms is a bit inflexible when it comes to adding autoregressive terms in non-linear models. I aim to change this (where possible) in brms 3.0 though.