ianjonsen / aniMotum

fit latent variable movement models to animal tracking data
https://ianjonsen.github.io/aniMotum/
Other
38 stars 13 forks source link

Bathymetry constraint function #7

Closed camrinbraun closed 1 year ago

camrinbraun commented 5 years ago

Problem Many coastal species have erroneous positions or, more frequently, predicted locations from the model fit are on land.

Solution example.zip Is there a way to implement a constraint function that prevents land from even being an option? I've done this in adehabitatLT when generating simulated tracks like:

# consfun.r and internal data .rda object containing bdf attached as example.zip
# note bdf is passed to consfun as par argument and is a SpatialPixelsDataFrame based on bathymetry, where 1 is possible (water) and NA is not (land)

source('consfun.r')
load('sysdata.rda')
library(adehabitatLT); library(foieGras)

# filter, fit and predict for ellie
data(ellie)
bathy <- raster::raster(bdf) # make bathy raster
ellie$bathy <- raster::extract(bathy, cbind(ellie$lon, ellie$lat))
ellie <- filter(ellie, bathy == 1) # get rid of points where ellie is on land
fit <- fit_ssm(ellie[,1:8], model = "rw", time.step = 24)
fls <- foieGras::grab(fit, 'pred', as_sf=F) # need regular time res for ltraj

# build ltraj object
tr1 <- adehabitatLT::as.ltraj(cbind(fls$lon, fls$lat), date=fls$date, id=as.factor(fls$id),
                              proj4string = sp::CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"))

crw <- adehabitatLT::NMs.randomCRW(na.omit(tr1), rangles=TRUE, rdist=TRUE,
                                   constraint.func = consfun,
                                   constraint.par = bdf, nrep = 10)

# simulate
tmp <- adehabitatLT::testNM(crw)

# add unique ids for each sim track for plotting
sim_out <- tmp[[1]] %>% do.call(rbind, .)
sim_out$id <- NA
idx <- c(1, which(diff(sim_out$date) < 1) + 1)
for (i in 1:(length(idx) - 1)) sim_out$id[c(idx[i]:(idx[i + 1] - 1))] <- i

# plot
data(countriesHigh, package = "rworldxtra", envir = environment())
wm <- suppressMessages(fortify(countriesHigh))
rm(countriesHigh)

xl <- extendrange(fls$lon, f = 0.1)
yl <- extendrange(sim_out$y, f = 0.1)

p1 <- ggplot() + geom_polygon(data = wm, aes_string(x = "long", y = "lat", group = "group"), fill = grey(0.3)) +
  coord_cartesian(xlim = xl, ylim = yl) + xlab("Longitude") + ylab("Latitude")
p1 <- p1 + geom_path(data = sim_out, aes_string(x = "x", y = "y", group = "id")) 
p1 <- p1 + geom_point(data = fls, aes_string(x = 'lon', y = 'lat', group = NULL))
p1

ellie_sims

ianjonsen commented 5 years ago

Something like this is in the works but so far it's proving to be rather computationally demanding. I'll have a look through your code to see if it spawns some new ideas. Or, if you want to take a stab you're welcome to contribute!

mdsumner commented 5 years ago

We do this pretty heavily in tripEstimation/SGAT, perhaps we can help speed it up?

Even a large grid like Etopo2 doesn't seem to be too slow, is this fast enough?

  ## generate rando coordinates upfront
ra <- lapply(1:1000, function(i) geosphere::randomCoordinates(2500))

library(raster)
#> Loading required package: sp

## make a function to encapsulate the grid
## so we can pass this into model funs
mkMask <- function(x) {
  ## use brick() for annoying technical reasons
  grid <- raster::brick(x < 0)
  function(pts) {
    v <- raster::extract(grid, pts)
    ## any NA values are outside grid range
    v[is.na(v)] <- FALSE
    v
  }
}

## a very low res etopo of only southern
## (we can get a better mask)
maskfun <- mkMask(quadmesh::etopo)

plot(ra[[300]], col = maskfun(ra[[300]]) + 1, ylim = c(-90, 90))

system.time({
for (i in seq_along(ra)) {
  tst <- maskfun(ra[[i]])
}
})
#>    user  system elapsed 
#>   0.887   0.020   0.907

## what if we use a much higher-res grid
maskfun2 <- mkMask(raadtools::readtopo("etopo2"))
plot(ra[[300]], col = maskfun2(ra[[300]]) + 1, ylim = c(-90, 90))


system.time({
  for (i in seq_along(ra)) {
    tst <- maskfun2(ra[[i]])
  }
})
#>    user  system elapsed 
#>   0.976   0.000   0.976

Created on 2019-04-23 by the reprex package (v0.2.1)

I reckon that hard part is actually getting a data set of reasonable resolution for the given model, but we could encode a masked Etopo2 or Etopo1 into this package, it's tiny as a compressed mask (much bigger in memory of course):

topo <- raadtools::readtopo("etopo2") <= 0

## attached here as a .zip, about 0.5 Mb
writeRaster(topo, "etopo2-oceanmask.tif", datatype = "INT1U", options = "COMPRESS=DEFLATE")

(Apologies for use of raadtools, which is not available without preparation, but I'm happy to provide data in any way). etopo2-oceanmask.zip

ianjonsen commented 5 years ago

Yeah, that's not really the hard part - though your raster wrangling will be much faster than mine. The hard bit is appropriately penalizing the joint log-likelihood, without adding too much extra computation time, so the optimal solution doesn't include locations on land. I know you and Spoon wrote your own MCMC in tripEstimation, is SGAT similar or does it use an ML approach?

mdsumner commented 5 years ago

Oh I see, sorry - it's similar in SGAT. It was easier to ensure that candidate locations passed the mask upfront which was always a hassle. I tried a likelihood-mask on distance to coast so that points would be nudged towards the ocean but it never worked well. I'll have a closer look

ianjonsen commented 5 years ago

"...but it never worked well." sounds sooo familiar in this case!

ianjonsen commented 2 years ago

@camrinbraun the preferred option as of v 1.0-5 is to use route_path() after fitting an SSM to track data. This leverages @jmlondon's pathroutr pkg https://github.com/jmlondon/pathroutr to reroute locations that are erroneously on land. Currently, route_path() uses the rnaturalearth medium or high-res polygon land data but for a future update I hope to explore additional options such as bathymetry strata.

Here's an example:

Screen Shot 2022-05-02 at 12 40 32 PM
ianjonsen commented 1 year ago

Closing this as route_path() and potential function options in sim_fit() are intended to provide a land constraint, which could be adapted by users to impose an arbitrary bathymetric constraint - though this might require users to engage more directly with the pathroutr package (route_path() is just a wrapper function for pathroutr).