paul-buerkner / brms

brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
https://paul-buerkner.github.io/brms/
GNU General Public License v2.0
1.25k stars 177 forks source link

Spatial correlation structures #6

Closed paul-buerkner closed 6 years ago

paul-buerkner commented 8 years ago

We should think about implementing spatial correlations similar to the correlation structures that can be modeled with nlme.

paul-buerkner commented 7 years ago

One of the most relevant spatial correlation structures seems to be the conditional autoregressive (CAR) structure mentioned in #128.

ssridhara commented 7 years ago

Hi Paul,

I'm trying to implement a glmm that also models spatial auto correlation. Based on the messages above, do I take it to mean that I cannot model spatial correlation yet? Is the autocor argument not meant for spatial structuring?

My specific interest is to model count data as a function of several environmental variables using the negative binomial family. Because counts were sampled in spatially contiguous units, residuals are spatially correlated when using a glmm. But no package seems to currently allow incorporating spatial correlation structure while simultaneously using negative binomial family and estimating random effects.

Your feedback on this would be very helpful. Sachin

Great work on the package BTW! Very useful to setup bayesian models without having to learn the various minor details of STAN.

paul-buerkner commented 7 years ago

Currently, the autocor argument only supports (unidimensional) autocorrelation structures of time. I am afraid that spatial autocorrelation structures are not yet implemented, but this will definitely come in the future. I am not sure, however, what kind of spatial autocorrelation can be implemented for negative binomial models. Do you have a reference in mind?

ssridhara commented 7 years ago

I have not come across any implementation of spatial autocorrelation when using negative binomial (nb) distributions. I'm not aware of the statistical technicalities, but am I right in thinking there is a conceptual hurdle to having such as model? Is that why no package allows spatial nb models? After several hours of searching articles that may have modeled SAR using NB models, I could not find a single one....

When I noticed autocor in brms, I thought may be this is a way out! The option of using poisson distribution does not apply to my data set because it is heavily overdispersed..

Thanks for quick response.

paul-buerkner commented 7 years ago

Hmm, there is definitely a computational hurdle since a spatially correlated negative binomial model will have a complex parameter structure, which is likely problematic for maximum-likelhood based approaches. I am not really an expert in spatial correlation structures so I am not sure which of these structures can be reasonably generalized from the canonical normal distribution to count distributions. The same goes for correlation structures of time.

As soon as I have time, I will dig deeper into these models (I am sure someone has posted a question on the Stan users list about this before) and find out what's the best way of implementing it.

jgabry commented 7 years ago

Negative binomial models can be nasty computationally, even in a more standard framework like glm.nb and glmer.nb. Stan can handle those models more easily but you still have to keep an eye out for numerical issues. In theory you can certainly add spatial correlations for parameters by transforming so that you can use a multivariate Gaussian prior with a precision/covariance matrix with the appropriate spatial structure. But that will certainly add to the computational challenge, especially with a negative binomial data model. I've had success with this sort of thing with beta-binomial data models, but I haven't personally tried the negative binomial.

On Saturday, October 8, 2016, Paul-Christian Bürkner < notifications@github.com> wrote:

Hmm, there is definitely a computational hurdle since a spatially correlated negative binomial model will have a complex parameter structure, which is likely problematic for maximum-likelhood based approaches. I am not really an expert in spatial correlation structures so I am not sure which of these structures can be reasonably generalized from the canonical normal distribution to count distributions. The same goes for correlation structures of time.

As soon as I have time, I will dig deeper into these models (I am sure someone has posted a question on the Stan users list about this before) and find out what's the best way of implementing it.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/paul-buerkner/brms/issues/6#issuecomment-252450215, or mute the thread https://github.com/notifications/unsubscribe-auth/AHb4QwxnQpEFnnAkdl0bVaGRU5Yd7chPks5qyA8TgaJpZM4GLrEi .

ssridhara commented 7 years ago

Thanks for the clarification. Any references to the beta-binomial models you talk about? I'm not really sure whether this is extensible to negative binomial distributions or even entirely relevant to SAR, but it demonstrates the use of K-Function to identify factors influencing spatial patterns in a mixed-effects framework. The articles examines patterns generated by a Poisson, Thomas or Strauss process, but not negative binomial.

jgabry commented 7 years ago

@ssridhara sorry for the slow response. I'm not aware of any good references on the beta-binomial in this context, although that doesn't mean there aren't any. I'm really not sure. A while back I was just playing around with it on my own in Stan, but it's been a while. There are definitely a few examples of CAR-style Stan models, including a somewhat recent post on Gelman's blog I believe. That might be a place to start.

On Saturday, October 8, 2016, ssridhara notifications@github.com wrote:

Thanks for the clarification. Any references to the beta-binomial models you talk about? I'm not really sure whether this http://dx.doi.org/10.1111/2041-210X.12335 is extensible to negative binomial distributions or even entirely relevant to SAR, but it demonstrates the use of K-Function to identify factors influencing spatial patterns in a mixed-effects framework. The articles examines patterns generated by a Poisson, Thomas or Strauss process, but not negative binomial.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/paul-buerkner/brms/issues/6#issuecomment-252454944, or mute the thread https://github.com/notifications/unsubscribe-auth/AHb4Qx2YUpslVXoRXS9_wyICpGHaE2Bxks5qyCzVgaJpZM4GLrEi .

mikejacktzen commented 7 years ago

I've been watching all things spatial in various packages, and have some thoughts (from my own experiences applying these kinds of models in STAN)

Whether 'lattice' (car, icar, mrf, etc) or 'geo-referenced' (gaussian processes', etc)

I think having the brms() as the user facing API to generate the back-end matrices to be used in STAN is indeed helpful. sidenote: The R-INLA guys have a pretty consistent way to do it.

Looking at the trial-error syntax in https://github.com/paul-buerkner/brms/issues/128#issuecomment-250912729

I think one improvement would ask the user to be more explicit in the arg options of s(): s(foo_x,type='car',foo_adjacency) s(foo_x,type='gp')

A key distinction between the two is:

  1. the lattice "sparse" type models basically depend crucially on some adjacency matrix (or sparse list). On the STAN side, it will need to declare the adjacency matrix in the data block. I've made some helper R functions myself to generate these adjacency related matrices that could be contributed here.

  2. for the gaussian process "dense" type of methods, you don't need an adjacency matrix, but you implicitly must compute some kind of many-many matrix of pairwise distances

At the current time, I think it's worth focusing the discussion on the api user facing design. I say this because, both 'sparse' lattice or dense 'gp' type models that use these spatial-relational matrices are intensive on the matrix algebra side.

A key feature missing from these omnibus stat modeling languages is integration with sparse matrix methods.

I know it's on the STAN team's radar, but it's still off in the horizon. I think there is an industry shortage on programmers that can implement sparse linear algebra (making use of C++ / boost / eigen)

paul-buerkner commented 7 years ago

Thanks for the suggestions!

I have no control over the argument options of s since it is a function of the mgcv package. That is, for spatial structures not implemented via s we would need a new function + related syntax.

Can you tell me, which of your options are not availabla via the current implementation of s? I assume all "lattice" options but I might be mistaken.

mikejacktzen commented 7 years ago

So after some more thought,

i think there's really 3 categories of how to allow spatial relations in a brm() formula call.

1) mgcv::s(bs='mrf',xt=foo_neighb_list) see https://stat.ethz.ch/R-manual/R-devel/library/mgcv/html/smooth.construct.mrf.smooth.spec.html

2) new_res(type='lattice',xt=foo_neighb_list) 3) new_res(type='gp')

'new_res' standing for new formula function for 'random effect spatial'

i think having a separate in formula special term other than mgcv::s() is probably the right way to go, conceptually.

My understanding of mgcv::s() is that by design, it is for smoothing terms, eg penalizing second derivatives (this is option 1).

Options 2 and 3, use spatial distributions (sparse lattice or dense matrix) for random/latent effects. see http://mc-stan.org/documentation/case-studies/mbjoseph-CARStan.html

BTW, looks like the 'sparse' implementation in the carstan case study uses a nice math trick stan-side to maybe avoid that intensive matrix algebra bottleneck i alluded to earlier

So for options 1 and 2, although a 'lattice' structure is used when talking about mgcv::s() and these car random effect examples in stan, they're doing different things (the former penalizing smoothness, and the latter modeling the variance components)

The way i see it, you can potentially have

  1. brm( ~ mgcv::s(bs='mrf',xt=foo_neighb_list))
  2. brm( ~ (1|new_res(type='lattice',xt=foo_neighb_list))
  3. brm( ~ (1|new_res(type='gp'))

Conceptually, i don't know what it would mean to follow up on option 1 brm( ~ mgcv::s(bs='mrf',xt=foo_neighb_list)) ?Further putting prior distributions on smooth-penalized spatial terms?

Options 2 and 3 (the random effect approaches) are pretty standard now

So to answer your question

Can you tell me, which of your options are not availabla via the current implementation of s? I assume all "lattice" options but I might be mistaken.

It would be all of 2 and 3 cannot be handled by 1) mgcv::s() since 2 and 3 are your 'traditional' random effects just with a spatial specific covariance while 1 is penalizing smoothness

paul-buerkner commented 7 years ago

After further discussion with some users via email, I think that some of the models implemented in the spdep package (e.g. lagsarlm or errorsarlm) should be reasonbly straight forward to implement at least when assuming normally distributed responses.

@mikejacktzen As of version 1.7.0, latent gaussian processes can be fitted in brms via function gp so this adds one of the discussed possibilities of controlling for spatial dependencies.

paul-buerkner commented 6 years ago

Just a quick note that spatial simultaneous autoregressive lagsarlm and errorsarlm models are implemented in the dev version via correlation structure cor_sar.

So I guess the only important spatial correlation structure still missing is CAR (conditional autoregressive structure).

mikejacktzen commented 6 years ago

This is fantastic @paul-buerkner i'm going to take these new features for a spin

paul-buerkner commented 6 years ago

Conditional autoregessive (CAR) structures are now implemented as well via function cor_car. With both SAR and CAR implemented (as well as having Gaussian processes and splines), is there anything else to consider with respect to spatial autocorrelation?

paul-buerkner commented 6 years ago

I am closing this issue now. New feature requests regarding spatial models should go into new issues so that it is easier to keep track of them.

waynefoltaERI commented 6 years ago

By the way, I mostly recreated the spatial example from mgcv's gam. I've slightly updated it to include the spline fit and to handle the polygon better.

library (sf)
library(mgcv)

data(columb)
data(columb.polys)

cp <- columb.polys
cp[[2]] <- cp[[2]][1:47,]   # Get rid of hole in #2, which sf doesn't like

xt <- list(polys=columb.polys) ## neighbourhood structure info for MRF

p <- lapply (cp, function (x) st_polygon (list (x)))

pp <- st_sfc (p)

plot (pp)

stt <- st_touches (pp)

w <- matrix (0, nrow=nrow (columb), ncol=nrow (columb))

for (i in 1:nrow (w)) {
   for (j in stt[[i]]) {
      w[i, j] <- 1
      w[j, i] <- 1
   }
}

rownames (w) <- names (cp)

library (brms)

# This doesn't converge, but the results are OK. Note that the `gam` has to take
# the district down to k=20, and that's because there's not enough data otherwise.

b1 <- brm (crime ~ s(income), data=columb, family=gaussian (), iter=6000, warmup=4000,
          autocor=cor_car (w, ~ 1 | district), control=list (adapt_delta=0.99))

b1b <- brm (crime ~ 1, data=colu, family=gaussian (), iter=6000, warmup=4000,
           autocor=cor_car (w, ~ 1 | district), control=list (adapt_delta=0.975))

b1c <- brm (crime ~ 1 + s(district, bs="mrf", xt=xt, k=20), data=columb, family=gaussian (), iter=6000, warmup=4000,
            control=list (adapt_delta=0.975))

b1d <- brm (crime ~ s(income) + s(district, bs="mrf", xt=xt, k=20), data=columb, family=gaussian (), iter=6000, warmup=4000,
            control=list (adapt_delta=0.985))

b2 <- gam (crime ~ s(income) + s(district, bs="mrf", xt=xt, k=20), data=columb, method="REML")

f1 <- fitted (b1)[,1]
names (f1) <- 0:(length (f1) - 1)

f2 <- fitted (b2)

polys.plot (cp, f1)
polys.plot (columb.polys, f2)

plot (f1, columb$crime) ; abline (0, 1)
plot (f2, columb$crime)     ; abline (0, 1)
paul-buerkner commented 6 years ago

Thanks wayne! Did you try out using an mrf spline in brms as well?

Am 06.08.2017 03:51 schrieb "Wayne Folta" notifications@github.com:

By the way, I mostly recreated the spatial example from mgcv's gam. I had to leave out one polygon, which has a hole, so it's not neat. For the record:

library (sf) library(mgcv)

data(columb) data(columb.polys)

columb.polys[2] has a hole and sf doesn't like the way it's done,

so I remove it

cp <- columb.polys[-2]

p <- lapply (cp, function (x) st_polygon (list (x)))

pp <- st_sfc (p)

plot (pp)

stt <- st_touches (pp)

colu <- columb[-2,] colu$district <- factor (1:nrow (colu))

w <- matrix (0, nrow=nrow (colu), ncol=nrow (colu))

for (i in 1:nrow (w)) { for (j in stt[[i]]) { w[i, j] <- 1 w[j, i] <- 1 } }

rownames (w) <- as.character (1:nrow (colu))

library (brms)

This doesn't converge, but the results are reasonable. Note that the

gam has to take

the district down to k=20, and that's because there's not enough

data otherwise.

No such choice with brms, so...

b1 <- brm (crime ~ s(income), data=colu, family=gaussian (), iter=6000, warmup=4000, autocor=cor_car (w, ~ 1 | district), control=list (adapt_delta=0.975))

b2 <- gam (crime ~ s(income) + s(district, bs="mrf", xt=xt, k=20), data=columb, method="REML")

f1 <- fitted (b1)[,1]

f2 <- fitted (b2)

names (f1)[1] <- "0"

polys.plot (cp, f1)

polys.plot (columb.polys, f2)

b1b <- brm (crime ~ 1, data=colu, family=gaussian (), iter=6000, warmup=4000, autocor=cor_car (w, ~ 1 | district), control=list (adapt_delta=0.975))

— You are receiving this because you modified the open/close state.

Reply to this email directly, view it on GitHub https://github.com/paul-buerkner/brms/issues/6#issuecomment-320480403, or mute the thread https://github.com/notifications/unsubscribe-auth/AMVtADj2wLT2fgifhSAN3Tr4pKqVi4dOks5sVRw1gaJpZM4GLrEi .

waynefoltaERI commented 6 years ago

I updated my example to include the mgcv mrf approach. Also, cleaned up the polygon to make things more comparable. When I plot your predicted versus actual and gam's, yours is better.

paul-buerkner commented 6 years ago

That is good to hear!

2017-08-06 21:20 GMT+02:00 Wayne Folta notifications@github.com:

I updated my example to include the mgcv mrf approach. Also, cleaned up the polygon to make things more comparable. When I plot your predicted versus actual and gam's, yours is better.

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/paul-buerkner/brms/issues/6#issuecomment-320526695, or mute the thread https://github.com/notifications/unsubscribe-auth/AMVtAJw0YwVQx66Mo8nm2Qo7BbkX8NSFks5sVhIPgaJpZM4GLrEi .

Rolpio commented 6 years ago

Hi guys I am really striving with the creation of a adjency matrix file from a polygon shapefile. Can somebody guide me into creating this?

paul-buerkner commented 6 years ago

Not sure if this is the right place to ask, because brms just takes the matrix no matter where it came from. Maybe look in packages, which deal primarily with spatial correlation structures?

2017-10-26 18:13 GMT+02:00 Rolpio notifications@github.com:

Hi guys I am really striving with the creation of a adjency matrix file from a polygon shapefile. Can somebody guide me into creating this?

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/paul-buerkner/brms/issues/6#issuecomment-339718456, or mute the thread https://github.com/notifications/unsubscribe-auth/AMVtAMHKzpK0y1VTfDetMC-7dUgPF0wLks5swK--gaJpZM4GLrEi .

Rolpio commented 6 years ago

IHi Paul, first of all let me congratulate you with the big job done with brms. Actually I am implementing a multilevel model using brms, and I am completely new in brms. I would like to account for spatial autocorrelation in my model, That is why I need to create the "w" object.

Rolpio commented 6 years ago

Another question is about integrating negative binomial modeling into multilevel model with a Random effect. I have heard that if there is a random effect in a multilevel model no need of negative binomial the Poisson would be enough even in case of over dispersion. Is that true?

paul-buerkner commented 6 years ago

Thanks, but still I am not sure if I can help you since brms offers no functionality to get this matrix. However, as I said, there are quite a few R packages available dealing primarily with special data, which will likely help you getting the adjacency matrix.

2017-10-26 18:38 GMT+02:00 Rolpio notifications@github.com:

IHi Paul, first of all let me congratulate you with the big job done with brms. Actually I am implementing a multilevel model using brms, and I am completely new in brms. I would like to account for spatial autocorrelation in my model, That is why I need to create the "w" object.

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/paul-buerkner/brms/issues/6#issuecomment-339725461, or mute the thread https://github.com/notifications/unsubscribe-auth/AMVtAKqVvSbss8l-dwwIS9ZfHSVAfhrnks5swLV3gaJpZM4GLrEi .

paul-buerkner commented 6 years ago

This is a bit off-topic in this closed issue. Please ask the question on brms users: https://groups.google.com/forum/#!forum/brms-users

2017-10-26 18:42 GMT+02:00 Rolpio notifications@github.com:

Another question is about integrating negative binomial modeling into multilevel model with a Random effect. I have heard that if there is a random effect in a multilevel model no need of negative binomial the Poisson would be enough even in case of over dispersion. Is that true?

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/paul-buerkner/brms/issues/6#issuecomment-339726654, or mute the thread https://github.com/notifications/unsubscribe-auth/AMVtAEUJg-hI2CXx1GNs8vGvyoJpDcjpks5swLZqgaJpZM4GLrEi .

Rolpio commented 6 years ago

Ok thanks