dipetkov / eems

Estimating Effective Migration Surfaces
GNU General Public License v2.0
102 stars 28 forks source link

Combining results from different runs #4

Closed jessicarick closed 8 years ago

jessicarick commented 8 years ago

I have a question about combining the results from different runs. The manual and the manuscript both recommend running the program with different numbers of demes, and then averaging the results from those different runs. However, I can't seem to figure out how to average different runs. How would I go about doing this?

Thanks!

dipetkov commented 8 years ago

How to combine multiple EEMS runs (for the same African elephant dataset)

library(dplyr,warn.conflicts=FALSE) ## Useful package for manipulating data frames
library(rEEMSplots) ## Defines eems.plots, which visualizes EEMS results

There are results from running EEMS once with four different densities: nDemes = 360, 540, 900, 1200.

eems.output = list.dirs("./elephants",recursive = FALSE) ## Show thesubdirectories in ./elephants/
eems.output
## [1] "./elephants/V3b-nohyb-nIndiv1124-nDemes1200-chain1"
## [2] "./elephants/V3b-nohyb-nIndiv1124-nDemes360-chain1" 
## [3] "./elephants/V3b-nohyb-nIndiv1124-nDemes540-chain1" 
## [4] "./elephants/V3b-nohyb-nIndiv1124-nDemes900-chain1"

Let's look at the results separately for each of the four runs. First generate the figures by calling eems.plots.

for (run in seq(4)) {
  eems.plots(mcmcpath = eems.output[run],
             plotpath = eems.output[run],
             longlat = TRUE,
             add.grid = TRUE,
             add.demes = TRUE)
}

Here nDemes = 360: combine runs-unnamed-chunk-5-1

Here nDemes = 540: combine runs-unnamed-chunk-6-1

Here nDemes = 900: combine runs-unnamed-chunk-7-1

Here nDemes = 1200: combine runs-unnamed-chunk-8-1

In each plot, I have added the population grid to illustrate that the density is different but the surfaces are qualitatively similar. Still, with higher density, the effective barrier to migration -- the area in dark brown that curves through the habitat -- gets to be estimated more precisely. (By this I mean that it is wider in some places and thinner in others.)

Next, let's combine combine the four results. Remember that eems.output is a list of four EEMS output directories.

## mcmcpath can be (ideally, will be) a list of multiple EEMS runs to average
combined.mcmcpath = eems.output
## plotpath should be one string (not a list of strings) for the averaged results
combined.plotpath = "./elephants/V3b-nohyb-nIndiv1124-combined-results"

eems.plots(mcmcpath = combined.mcmcpath,
           plotpath = combined.plotpath,
           longlat = TRUE)

combine runs-unnamed-chunk-9-1

That is, eems.plots works with one or mulitple EEMS output directories. The estimated migration rates (and diversity rates -- not shown above) are averaged to produce a surface/contour plot that often smoother and, more importantly, averages the effects of different population grids. [In practice, I run the program several times for a few different grids.]

While it is straightforward to combine the migration and diversity contour plots for multiple grids/runs, the diagnostic scatter plots are not so easy to combine. Recall that there is a diagnostic scatter plot that shows the observed and fitted dissimilarities between demes.

Between-demes dissimilarities for nDemes = 360: combine runs-unnamed-chunk-10-1

There is also a diagnost scatter plot that shows the observed and fitted dissimilarities within demes. Within-demes dissimilarities for nDemes = 360 combine runs-unnamed-chunk-11-1

However, as the density varies and the population grid changes, the demes change as well -- most obviously, the number of demes but also their location. This means that the points in the between-demes scatter plot for nDemes = 360 do not correspond to the points in the between-demes scatter plot for nDemes = 540.

So if mcmcpath contains results for different population grids, eems.plots does not generate diagnostic plots by default. Instead it plots two different population grids. These contain no information but "explain" why there are no scatter plots.

Here nDemes = 1200: combine runs-unnamed-chunk-12-1

Here nDemes = 360: combine runs-unnamed-chunk-13-1