tskit-dev / msprime

Simulate genealogical trees and genomic sequence data using population genetic models
GNU General Public License v3.0
172 stars 84 forks source link

Implement 2D stepping stone model #2100

Open connor-french opened 2 years ago

connor-french commented 2 years ago

Hello all,

I am interested in simulating large "landscapes" of connected demes parameterized by species distribution models for neutral demographic inference in non-model species. A tough line I'm trying to balance in this type of analysis is realism (e.g. forward-time simulations in SLiM or SPLATCHE) vs speed. I'm simulating long timescales (~20,000 generations) with large landscapes (~5x10^4 to 5x10^5 demes) and many individuals (10s of millions if taking a forward-time perspective), which makes forward simulations intractable when the final goal is to use simulation output for inference, where 10's of thousands of simulations are generally necessary. This is a problem faced by quite a few of my colleagues, who straddle the line between phylogeography and landscape genetics.

I believe applying a 2D stepping stone model would facilitate this type of analysis, especially when combined with msprime's flexibility and speed. I'm aware of a recent step towards this through the gridCoal software, but it limits its output to coalescent times which, although useful, constrains its flexibility. I'm not suggesting something with the specific application domain of gridCoal, which contains functionality for pre-processing data for the type of data analysis I outlined above, but I think a Demography helper to setup 2D stepping stone models of arbitrary size would be a great addition to the "spatial genetics" toolkit.

I appreciate the fantastic work you've done on this software!

Best, Connor

jeromekelleher commented 2 years ago

Hi @connor-french ! :wave:

I think it should be fairly straightforward to implement in the stepping_stone_model

Here's a first pass:

import msprime
import numpy as np

def stepping_stone2d(initial_size, rate):
    assert len(initial_size.shape) == 2
    n, m = initial_size.shape

    N = n * m
    model = msprime.Demography.isolated_model(initial_size.reshape(N))
    M = model.migration_matrix
    for j in range(n):
        for k in range(m):
            index = j * m + k
            model.populations[index].name = f"pop_{j}_{k}"
            M[index, index - 1] = rate
            M[index, (index + 1) % N] = rate
            M[index, index - m] = rate
            M[index, (index + m) % N] = rate

            M[index - 1, index] = rate
            M[(index + 1) % N, index] = rate
            M[index - m, index] = rate
            M[(index + m) % N, index] = rate
    return model

model = stepping_stone2d(np.zeros((3, 3)) + 5, 1)

print(model)

This gives us:

Demography
╟  Populations
║  ┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
║  │ id │name     │description  │initial_size  │ growth_rate │  default_sampling_time│extra_metadata  │
║  ├──────────────────────────────────────────────────────────────────────────────────────────────────┤
║  │ 0  │pop_0_0  │             │5.0           │      0      │                      0│{}              │
║  │ 1  │pop_0_1  │             │5.0           │      0      │                      0│{}              │
║  │ 2  │pop_0_2  │             │5.0           │      0      │                      0│{}              │
║  │ 3  │pop_1_0  │             │5.0           │      0      │                      0│{}              │
║  │ 4  │pop_1_1  │             │5.0           │      0      │                      0│{}              │
║  │ 5  │pop_1_2  │             │5.0           │      0      │                      0│{}              │
║  │ 6  │pop_2_0  │             │5.0           │      0      │                      0│{}              │
║  │ 7  │pop_2_1  │             │5.0           │      0      │                      0│{}              │
║  │ 8  │pop_2_2  │             │5.0           │      0      │                      0│{}              │
║  └──────────────────────────────────────────────────────────────────────────────────────────────────┘
╟  Migration Matrix
║  ┌───────────────────────────────────────────────────────────────────────────────────────────────────┐
║  │         │ pop_0_0 │ pop_0_1 │ pop_0_2 │ pop_1_0 │ pop_1_1 │ pop_1_2 │ pop_2_0 │ pop_2_1 │ pop_2_2 │
║  ├───────────────────────────────────────────────────────────────────────────────────────────────────┤
║  │  pop_0_0│    0    │    1    │    0    │    1    │    0    │    0    │    1    │    0    │    1    │
║  │  pop_0_1│    1    │    0    │    1    │    0    │    1    │    0    │    0    │    1    │    0    │
║  │  pop_0_2│    0    │    1    │    0    │    1    │    0    │    1    │    0    │    0    │    1    │
║  │  pop_1_0│    1    │    0    │    1    │    0    │    1    │    0    │    1    │    0    │    0    │
║  │  pop_1_1│    0    │    1    │    0    │    1    │    0    │    1    │    0    │    1    │    0    │
║  │  pop_1_2│    0    │    0    │    1    │    0    │    1    │    0    │    1    │    0    │    1    │
║  │  pop_2_0│    1    │    0    │    0    │    1    │    0    │    1    │    0    │    1    │    0    │
║  │  pop_2_1│    0    │    1    │    0    │    0    │    1    │    0    │    1    │    0    │    1    │
║  │  pop_2_2│    1    │    0    │    1    │    0    │    0    │    1    │    0    │    1    │    0    │
║  └───────────────────────────────────────────────────────────────────────────────────────────────────┘
╟  Events
║  ┌───────────────────────────────────┐
║  │  time│type  │parameters  │effect  │
║  ├───────────────────────────────────┤
║  └───────────────────────────────────┘

Does that look roughly right to you?

There would be some tedious details around detail with the boundaries or not, but hey.

connor-french commented 2 years ago

@jeromekelleher Wow, this is great! It definitely gets the ball rolling, thank you. I didn't think it could be this straightforward. I agree, the fun part now is dealing with the edges. I appreciate the help!

-Connor

jeromekelleher commented 2 years ago

Great! If you get it working, then it would be great to add to the API. PR welcome!