danlwarren / ENMTools

ENMTools R Package

Add cox process SDM using r-inla #86

Open rdinnager opened 6 years ago

rdinnager commented 6 years ago

I discovered a cool new SDM method: a Cox process model fit with the r-inla package (https://www.math.ntnu.no/inla/r-inla.org/tutorials/spde/spde-tutorial.pdf; see Chapter 4). r-inla uses integrated nested Laplace approximation (INLA) to fit a Bayesian model (the approximation avoids MCMC, so it is fast!). The cool thing about these models is that it is relatively straightforward to include spatial covariance in the model, which few SDM methods do. It is also easy to include arbitrary covariance structures, meaning this could be the base for another simple phylogenetic SDM method (e.g. include a phylogenetic covariance matrix based on Brownian motion as a 'latent' effect on the environmental coefficients). I want to implement an ENMTools wrapper for this, if that is cool with you? It might be a little more involved than some of the others, because the spatial part requires the user to construct a spatial 'mesh', which is not easy to do automatically. I was thinking of including a parameter where the user passes in the mesh (the r-inla package itself has a 'mesh-builder' helper app built in Shiny, which I can point ENMTools users to for help with making the mesh). r-inla is my new obsession at the moment. I've been building all sorts of cool models with it, and it is majorly powerful. Would highly recommend.
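To make the "phylogenetic covariance matrix based on Brownian motion" idea concrete, here is a minimal sketch in plain Python (not r-inla code, and not an ENMTools API). Under Brownian motion the covariance between two tips is proportional to the branch length they share between the root and their most recent common ancestor. The function name `bm_covariance`, the `parent` / `branch_length` dictionaries, and the toy tree are all hypothetical, chosen only for illustration.

```python
def bm_covariance(parent, branch_length, tips, sigma2=1.0):
    """Brownian-motion covariance among tips of a rooted tree.

    parent:        dict mapping each non-root node to its parent (hypothetical format)
    branch_length: dict mapping each non-root node to the length of the
                   branch connecting it to its parent
    tips:          list of tip names, giving the row/column order
    sigma2:        Brownian-motion rate (assumed 1.0 by default)
    """
    def branches_to_root(node):
        # Collect every node whose subtending branch lies on the path to the root.
        path = set()
        while node in parent:  # the root has no entry in `parent`
            path.add(node)
            node = parent[node]
        return path

    paths = {tip: branches_to_root(tip) for tip in tips}
    # Shared evolutionary history = total length of the branches on both paths.
    return [
        [sigma2 * sum(branch_length[n] for n in paths[a] & paths[b]) for b in tips]
        for a in tips
    ]


# Toy tree ((A:1,B:1)AB:2,C:3)root, assumed purely for illustration.
parent = {"A": "AB", "B": "AB", "AB": "root", "C": "root"}
branch_length = {"A": 1.0, "B": 1.0, "AB": 2.0, "C": 3.0}
cov = bm_covariance(parent, branch_length, ["A", "B", "C"])
for row in cov:
    print(row)
```

A matrix like this is the kind of structure that could, in principle, be supplied as the covariance of a latent effect; how it would actually be passed to r-inla is not shown here.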

danlwarren commented 6 years ago

Awesome! I'd read about r-inla but not in depth, and I've never played with it.

Is the spatial mesh dependent on the occurrence data? If so it could be challenging to implement in a way that would be usable within the hypothesis tests; it would require rebuilding the mesh for each permutation, which would obviously not be something you'd want to do by hand.

danlwarren commented 6 years ago

Similarly I've been playing with MCMCglmm lately and have been thinking about implementing that for ENMs in ENMTools as well.

rdinnager commented 6 years ago

Oh cool, yes, an MCMCglmm SDM would be cool.

As for the spatial mesh, it is not dependent on the occurrence points. It is kind of cool actually. It is similar to a set of background points, except each point is a vertex of a mesh of triangles. The model can then estimate the value at any point (for example, the occurrence points) by triangular interpolation from the surrounding vertices. So the model actually estimates the density 'field' at the mesh points, but the occurrence points can be anywhere within the mesh, not necessarily on the vertices.
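The interpolation step can be sketched as follows. This is a generic illustration of linear (barycentric) interpolation inside a single triangle in plain Python, not what r-inla does internally; the function names and the example triangle and field values are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Weights of point p with respect to the triangle with vertices a, b, c."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle (vertices are collinear)")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate_in_triangle(p, vertices, values):
    """Linearly interpolate vertex values to a point p inside the triangle."""
    weights = barycentric_weights(p, *vertices)
    if min(weights) < -1e-12:
        raise ValueError("point lies outside the triangle")
    # The estimate is a weighted average of the three vertex values.
    return sum(w * v for w, v in zip(weights, values))


# One mesh triangle with a field value at each vertex (made-up numbers).
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
field_at_vertices = [1.0, 2.0, 3.0]

# An "occurrence point" that is not on any vertex.
estimate = interpolate_in_triangle((0.25, 0.25), triangle, field_at_vertices)
print(estimate)
```

The weights sum to one, so a point sitting exactly on a vertex simply takes that vertex's value, and points in between get a blend of the three.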

By the way, I also want to implement some more machine learning methods, because it is looking like they are the best for the phylogenetic eigenvector approach (they more easily ignore irrelevant variables, of which there are many in the eigenvectors). Right now I am just implementing it with random forest, but it would be good to get some artificial neural network / deep learning methods into ENMTools. I am thinking we could use the keras or h2o packages. I'll have the phylogenetic random forest SDM method ready for testing within the next few weeks.

danlwarren commented 6 years ago

Awesome!

I was thinking about keras as well the other day; it seems potentially really interesting. I do have general standing concerns about machine learning methods making overly complex models fit to nuisance biogeographic processes, but presumably there's some way to tune complexity.