dipetkov / eems

Estimating Effective Migration Surfaces
GNU General Public License v2.0

Error starting EEMS #26

Open evasylvester opened 6 years ago

evasylvester commented 6 years ago

Hi Dr. Petkova,

I'm trying to run eems_sats. I've got it working with one set of data, so it seems to be running fine, but not with the other set. All my input files and input parameters read in properly but then I see:

Initial log prior: 97.3
Initial log llike: inf

[RunEEMS] Error starting EEMS.

I realize this might be difficult to disentangle without my files, but I've compared the inputs with those that have worked and do not see any differences. The genotype files look very similar, the size of the .sites matrix and .coord file matches the parameters, and the .outer is a simple rectangle. Is there perhaps something with the data itself that is causing this problem? Is 'initial log llike: inf' a problem? In runs that have worked for me this number has been a large negative number.

Any help would be greatly appreciated.

Many Thanks, Emma

dipetkov commented 6 years ago

This is a problem in the sense that both the prior and the likelihood should be finite numbers.

eems generates some output at the start, mostly related to the grid and how the samples are assigned to demes in the grid. Looking at a plot of just the grid before any parameters are estimated might help. For example, it would be a problem if in the .coord file locations are specified as (latitude, longitude) and in the .outer file locations are specified as (longitude, latitude).

## Load the plotting package that provides eems.population.grid().
library(rEEMSplots)

## Use the provided example or supply the path to your own EEMS run.
extdata_path <- system.file("extdata", package = "rEEMSplots")
eems_results <- file.path(extdata_path, "EEMS-example")
name_figures <- file.path(path.expand("~"), "EEMS-grid_connected")

## Plot only the habitat outline, the grid, and the demes (no estimated surfaces).
eems.population.grid(eems_results,
                     name_figures,
                     longlat = TRUE,
                     add.outline = TRUE, col.outline = "purple",
                     add.grid = TRUE, col.grid = "green",
                     add.demes = TRUE, col.demes = "blue")

I would probably start here.
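A quick numeric check can complement the grid plot. The sketch below is only an illustration, not part of EEMS or rEEMSplots: `samples_outside_outer` is a hypothetical helper, and the toy matrices stand in for what `read.table()` would return for the .coord and .outer files. It flags samples that fall outside the bounding box of the outline, which is what tends to happen when one file uses (latitude, longitude) and the other (longitude, latitude).

```r
# Hypothetical helper (not an EEMS function): return the indices of samples
# that lie outside the bounding box of the habitat outline.
samples_outside_outer <- function(coord, outer) {
  x_range <- range(outer[, 1])
  y_range <- range(outer[, 2])
  outside <- coord[, 1] < x_range[1] | coord[, 1] > x_range[2] |
             coord[, 2] < y_range[1] | coord[, 2] > y_range[2]
  which(outside)
}

# Toy data, both in (longitude, latitude); with real data these would come
# from read.table() on the .outer and .coord files.
outer <- matrix(c(-60, 45,  -50, 45,  -50, 50,  -60, 50,  -60, 45),
                ncol = 2, byrow = TRUE)
coord <- matrix(c(-55, 47,  -52, 49,  -58, 46), ncol = 2, byrow = TRUE)

samples_outside_outer(coord, outer)         # no samples flagged
samples_outside_outer(coord[, 2:1], outer)  # swapped columns: all three flagged
```

A bounding-box test is deliberately crude; it will not catch samples that sit inside the box but outside an irregular outline, so the grid plot remains the more informative check.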

dipetkov commented 6 years ago

The next step might be to generate a .diffs matrix and use that together with the real coordinates, so that you can check whether the problem might be in the dissimilarity matrix. (I have an R script to generate a dissimilarity matrix.)
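For readers who want to see what such a matrix looks like, here is a minimal sketch. It is not the R script mentioned above; `pairwise_diffs` is a hypothetical function that assumes an individuals-by-sites numeric genotype matrix with `NA` for missing values, and it averages squared differences over the sites observed in both individuals. The checks at the end (finite, symmetric, zero diagonal) are basic sanity checks on the result, not the full set of conditions EEMS may impose.

```r
# Hypothetical sketch (not the author's script): average squared genetic
# difference between every pair of individuals, skipping missing sites.
pairwise_diffs <- function(genotypes) {
  n <- nrow(genotypes)
  diffs <- matrix(0, n, n)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      d2 <- (genotypes[i, ] - genotypes[j, ])^2
      # NaN here means the pair shares no observed sites
      diffs[i, j] <- diffs[j, i] <- mean(d2, na.rm = TRUE)
    }
  }
  diffs
}

# Toy genotypes: 3 individuals x 3 sites, NA = missing
genotypes <- matrix(c(0, 1, 2,
                      1, 1, NA,
                      2, 1, 0), nrow = 3, byrow = TRUE)
diffs <- pairwise_diffs(genotypes)

# Basic sanity checks before handing the matrix to EEMS
stopifnot(all(is.finite(diffs)), isSymmetric(diffs), all(diag(diffs) == 0))

# Write a plain numeric matrix without row or column names
diffs_file <- file.path(tempdir(), "example.diffs")
write.table(diffs, diffs_file, row.names = FALSE, col.names = FALSE)
```

If the `stopifnot()` line fails on real data, a non-finite entry usually points to a pair of individuals with no jointly observed sites.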

evasylvester commented 6 years ago

Thanks for such a quick reply!

Both the .outer and .coord file are in 'long, lat' format, and all samples are assigned to demes according to the initial output from the program. To clarify your next suggestion, should I create the .diffs matrix and run eems_snps, or will eems_sats accept a .diffs file instead of a .sites file as well?

Thanks again, Emma

On Tue, May 1, 2018 at 11:34 AM Desislava Petkova notifications@github.com wrote:

The next step might be to generate a .diffs matrix and use that together with the real coordinates, so that you can check whether the problem might be in the dissimilarity matrix. (I have an R script to generate a dissimilarity matrix.)


dipetkov commented 6 years ago

Hello Emma,

Sorry for not getting back to you sooner. If you still have this issue and want to look into it, my suggestion is to run EEMS twice: once with a simulated diffs matrix and the real coordinates, and once with simulated coordinates and the real matrix. Hopefully, this will help pin down where the problem with the likelihood is. It is easy to simulate coordinates: they don't need to be actual latitudes/longitudes to run eems and make the plots.
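One way to set up the simulated inputs is sketched below. Everything in it is an assumption made for illustration: the number of individuals and sites, the unit-square habitat, the random biallelic genotypes, and the file names. The simulated matrix is the mean squared genotype difference between individuals, computed with base R's `dist()`.

```r
set.seed(1)
n_indiv <- 20    # assumption: set to the number of individuals in the real data
n_sites <- 200   # assumption: arbitrary number of simulated sites

# Simulated coordinates: uniform points in a unit square. They only need to be
# consistent with the outline, not real longitudes/latitudes.
sim_coord <- cbind(runif(n_indiv), runif(n_indiv))
# Closed outline listed counterclockwise (first vertex repeated at the end)
sim_outer <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))

# Simulated diffs: mean squared difference between random biallelic genotypes
sim_geno  <- matrix(rbinom(n_indiv * n_sites, size = 2, prob = 0.3),
                    nrow = n_indiv)
sim_diffs <- unname(as.matrix(dist(sim_geno))^2 / n_sites)

# Write plain numeric files; the file names here are placeholders
out_dir <- tempdir()
write.table(sim_coord, file.path(out_dir, "sim.coord"),
            row.names = FALSE, col.names = FALSE)
write.table(sim_outer, file.path(out_dir, "sim.outer"),
            row.names = FALSE, col.names = FALSE)
write.table(sim_diffs, file.path(out_dir, "sim.diffs"),
            row.names = FALSE, col.names = FALSE)
```

Pairing `sim.diffs` with the real .coord and .outer files, and then the real matrix with `sim.coord` and `sim.outer`, should indicate which input the infinite likelihood follows.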

The runeems_sats version won't work with a .diffs matrix but you can use the runeems_snps version with that.

I suspect that computing the likelihood ends up in an Inf, though I don't know why. How many individuals and how many microsatellites do you have? It might also be interesting to use subset(s) of your actual data to check if a particular subset of individuals and/or microsatellites causes the problem.
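Before rerunning on subsets, it may help to screen the data for obvious candidates. The sketch below assumes the .sites matrix has individuals in rows and alleles in columns, with negative values marking missing alleles; `missing_summary` is a hypothetical helper, and whether any of these patterns actually produces the infinite likelihood is untested here.

```r
# Hypothetical helper (not part of EEMS): find individuals and allele columns
# that carry no usable information. Assumes negative values mean "missing".
missing_summary <- function(sites) {
  is_missing <- sites < 0
  list(
    indiv_all_missing = which(rowSums(is_missing) == ncol(sites)),
    cols_all_missing  = which(colSums(is_missing) == nrow(sites)),
    cols_no_variation = which(apply(sites, 2, function(x) {
      observed <- x[x >= 0]
      length(unique(observed)) <= 1
    }))
  )
}

# Toy data: 3 individuals x 4 allele columns; with real data this would come
# from read.table() on the .sites file.
sites <- matrix(c(120, 124, -9, 200,
                  122, 128, -9, 200,
                   -9,  -9, -9,  -9), nrow = 3, byrow = TRUE)
flags <- missing_summary(sites)
flags

# Drop the flagged individuals to build a subset for a test run
keep_indiv   <- setdiff(seq_len(nrow(sites)), flags$indiv_all_missing)
sites_subset <- sites[keep_indiv, , drop = FALSE]
```

Any subset written back out needs matching rows removed from the .coord file and the `nIndiv` (and, if loci are dropped, `nSites`) values updated in the params file.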

LZarri commented 7 months ago

I'm going to add to this here, as I'm having the same issue. My population .outer file is connected and going counterclockwise. I am looking at a relatively small area (4 populations across a few kilometers of river) and the populations are fairly differentiated, particularly in the upper parts of the watershed (to the East). Here is a Jost D pairwise differentiation plot, where 1 and 4 are downstream, 2 is upstream and above a dam, and 3 is farther upstream above another dam:

[image: Jost D pairwise differentiation plot]

I've attached my directory of input files, as well as the output files generated, with the screenshot of the error below. I'm using a standard run with the starting code /programs/eems/runeems_sats/src/runeems_sats --params eems_master.ini.

[image: screenshot of the error]

Let me know if there is anything else I can provide, and thank you!

eems_output.zip eems_input.zip