rbchan / unmarked

R package for hierarchical models in ecological research
https://rbchan.github.io/unmarked/
37 stars 25 forks source link

Spatial autocorrelation #22

Open rbchan opened 13 years ago

rbchan commented 3 years ago

Should be possible to add Chandler-Royle (2013, AOAS) model. Marginal likelihood could be computed (and optimized) by integrating over Poisson prior on N. For each plausible value of N, must integrate over all possible configurations of the activity centers (ie, over the realized point pattern). This could be done using Monte Carlo integration, which will be slow, but can be run in parallel.

# Conditional (on y, s, and N) likelihood
cL(n; y,s,N,lam0,sigma,mu) = p(n|y)p(y|s,lam0,sigma,N)p(s,N|mu)

where n is nTraps x nOccasions matrix of counts (or binary), y is latent N x nTraps x nOccasions array of encounter histories; lam0 and sigma are encounter rate parameters; and p(s,N|mu) is the Poisson point process. The encounter rate could be lam(i,j,)=lam0(j,k)*exp(-dist(s,x)^2/(2sigma^2)) where x is nTraps by 2 matrix of trap coordinates. If y(i,j,k)~Pois(lam(i,j,k)), then we can simplify p(n|y) from n(j,k) = \sum_i y(i,j,k) to n(j,k)~Pois(\sum_i lam(i,j,k)):

# Conditional (on s and N) likelihood
cL(n;s,N,lam0,sigma,mu) = p(n|s,lam0,sigma,N)p(s,N|mu)

Since the point pattern -- p(s,N) -- is latent, we have to integrate over it

# Marginal likelihood
L(n; lam0,sigma,mu) = \sum_{N=0}^M \int_S p(n|s,lam0,sigma,N)p(s,N|mu) ds

where M should be infinity but has to be specified by the user to be an integer >>N (as in N-mixture models). The integral is over all possible point patterns. It is not an integral over each point independently. Because it's an N-dimensional integral, Monte Carlo integration is likely the best bet. It will simplify things to factor p(s,N|mu) to p(s|N,mu)p(N|mu):

# Marginal likelihood
L(n; lam0,sigma,mu) = \sum_{N=0}^M \int_S p(n|s,lam0,sigma,N)p(s|N,mu) ds p(N|mu)

Note that S is not the spatial region where the activity centers occur -- it is the realized point pattern: S={s_i,...,s_N}, with $$S \in Q and Q \subset R^2$$. Here, Q is the spatial region (observation window) within which S occurs.

Note also that data augmentation isn't necessary.

My guess is that if M<1000 and n has fewer than 1000 elements, then the marginal likelihood could be computed in less than a minute, with help from C++ code and OpenMP. This would mean that it could take an hour or more for optimization.