BoPeng / simuPOP

A general-purpose forward-time population genetics simulation environment.
http://bopeng.github.io/simuPOP/
GNU General Public License v2.0
29 stars 12 forks source link

Subpopulation statistics #109

Closed cruzan-lab closed 10 months ago

cruzan-lab commented 11 months ago

Hi Bo, I've tried several different methods, but have not been able to extract individual subpopulation allele frequencies. For example, the command: allelefreq0 = [simu.dvars(x).alleleFreq[0][0] for x in range(Nreps)] gives the average allele frequency across all subpopulations.

I'm trying to set up a simple simulation so students can see the interplay between gene flow and selection in a continent-island model (two subpops; allele frequency is constant in one, gene flow is only from the continent to the island). To do this, they need to track allele frequency in one subpopulation over generations and then also at the end of the simulation. I have not been able to find any example in the documentation that extracts individual subpopulation statistics.

If you could please provide a simple example, that would be very helpful.

Mitch

BoPeng commented 11 months ago
import simuPOP as sim

pop = sim.Population([1000]*2, loci=[1]*10)
pop.evolve(
    # use prop instead of freq to make alleleFreq == 0.5, just for demonstration
    initOps=[sim.InitSex(), sim.InitGenotype(loci=sim.ALL_AVAIL, prop=[0.5, 0.5])],
    # these happen before mating at each generation, but you can do this
    # after mating using option postOps, then finalOps is no longer needed.
    preOps=[
        sim.Stat(alleleFreq=0, subPops=[0, 1], vars=['alleleFreq', 'alleleFreq_sp']),
        sim.PyEval(r'''f"{gen}: {subPop[0]['alleleFreq'][0][0]}\n"''')
    ],
    matingScheme=sim.RandomMating(),
    # these happen after simulation
    finalOps=[
        sim.Stat(alleleFreq=0, subPops=[0, 1], vars=['alleleFreq', 'alleleFreq_sp']),
        sim.PyEval(r'''f"{gen}: {subPop[0]['alleleFreq'][0][0]}\n"''')
    ],
    gen=10,
)
BoPeng commented 11 months ago

Using stat instead of Stat

import simuPOP as sim

pop = sim.Population([1000]*2, loci=[1]*10)
pop.evolve(
    # use prop instead of freq to make alleleFreq == 0.5, just for demonstration
    initOps=[sim.InitSex(), sim.InitGenotype(loci=sim.ALL_AVAIL, prop=[0.5, 0.5])],
    # these happen before mating at each generation, but you can do this
    # after mating using option postOps, then finalOps is no longer needed.
    preOps=[
        sim.Stat(alleleFreq=0, subPops=[0, 1], vars=['alleleFreq', 'alleleFreq_sp']),
        sim.PyEval(r'''f"{gen}: {subPop[0]['alleleFreq'][0][0]}\n"''')
    ],
    matingScheme=sim.RandomMating(),
    gen=10,
)

sim.stat(pop, alleleFreq=0, subPops=[0, 1], vars=['alleleFreq', 'alleleFreq_sp'])
print(pop.dvars().subPop[0])
BoPeng commented 11 months ago

Using a simulator

import simuPOP as sim

pop = sim.Population([1000]*2, loci=[1]*10)
simu = sim.Simulator(pop, rep=5)
simu.evolve(
    # use prop instead of freq to make alleleFreq == 0.5, just for demonstration
    initOps=[sim.InitSex(), sim.InitGenotype(loci=sim.ALL_AVAIL, prop=[0.5, 0.5])],
    # these happen before mating at each generation, but you can do this
    # after mating using option postOps, then finalOps is no longer needed.
    preOps=[
        sim.Stat(alleleFreq=0, subPops=[0, 1], vars=['alleleFreq', 'alleleFreq_sp']),
        sim.PyEval(r'''f"{rep}/{gen}: {subPop[0]['alleleFreq'][0][0]}\n"''')
    ],
    matingScheme=sim.RandomMating(),
    finalOps=[
        # calculate but not output,
        sim.Stat(alleleFreq=0, subPops=[0, 1], vars=['alleleFreq', 'alleleFreq_sp']),
    ],
    gen=10,
)

# this is the result of the last `sim.Stat` call, namely th eone from finalOps
print(simu.dvars(0).subPop[0])
BoPeng commented 11 months ago

Please check http://bopeng.github.io/simuPOP/userGuide_ch5_sec11.html#allele-count-and-frequency for more information.

BoPeng commented 11 months ago

Better output for multiple replicates

import simuPOP as sim

pop = sim.Population([1000]*2, loci=[1]*10)
simu = sim.Simulator(pop, rep=5)
simu.evolve(
    # use prop instead of freq to make alleleFreq == 0.5, just for demonstration
    initOps=[sim.InitSex(), sim.InitGenotype(loci=sim.ALL_AVAIL, prop=[0.5, 0.5])],
    # these happen before mating at each generation, but you can do this
    # after mating using option postOps, then finalOps is no longer needed.
    matingScheme=sim.RandomMating(),
    postOps=[
        sim.Stat(alleleFreq=0, subPops=[0, 1], vars=['alleleFreq', 'alleleFreq_sp']),
        sim.PyEval('gen',reps=0),
        sim.PyEval(r'''f"\t{subPop[0]['alleleFreq'][0][0]}"'''),
        sim.PyOutput('\n',reps=-1)
    ],
    gen=10,
)
cruzan-lab commented 11 months ago

Thanks for these examples - very helpful. I actually had the correct code but must have had a syntax error.

A followup question: I'm trying to use the migrIslandRates in the Demography module and it insists that this this function is nor recognized. Do I need to import this function before using it?

BoPeng commented 11 months ago

Did you have the following in your code?

from simuPOP.demography import migrIslandRates
cruzan-lab commented 11 months ago

OK. I do now (it was not clear from the documentation) I have migrIslandRates(mrate, n) in the preOps But now I have a new error message that is hard to decipher: TypeError in method "simulator_evolve" ...

cruzan-lab commented 11 months ago

In preOps= it's unclear how to invoke the function. The documentation says it can be plugged into the sim.Migrator operator, but I get an error message saying the number of from subpopulations should match the number of rows. I could not find an example of how to use a function like this.

BoPeng commented 11 months ago

https://bopeng.github.io/simuPOP/userGuide_ch5_sec4.html#migration-by-probability

This function just returns a matrix that can be used in a Migrator. If you do not specify parameter from (source subpopulation), it needs to have the same number of rows as the number of subpopulations.

cruzan-lab commented 11 months ago

It says: migrIslandRates(r, n) returns a n×nmigration matrix

I'm using: Npops = nn size=[popsize]Npops in sim.Population sim.Migrator(migrIslandRates(mrate, n)) in preOps

Still getting the from subpopulations should match the number of rows

BoPeng commented 11 months ago

What is n? Why n*n?

BoPeng commented 11 months ago

If you have a grid of nxn subpopulations, the dimension for the matrix should be n*n

cruzan-lab commented 11 months ago

Yes. The documentation says n is the number of rows in the matrix. Npops is the number of subpopulations. It's calculated before evolve.

BoPeng commented 11 months ago

https://github.com/BoPeng/simuPOP/blob/50f4776530cd2904196e56927000357da9a564b4/src/utils.py#L114-L134

You are right, n should be the number of rows, but this function is not used for a grid. It is used for n subpopulations and you have equal probability to migrate to any other subpopulation.

If you are simulating a grid, you will have to consider neighbors and boundary conditions and the migration matrix will be different. Maybe you are looking for a steping stone model?

https://github.com/BoPeng/simuPOP/blob/50f4776530cd2904196e56927000357da9a564b4/src/utils.py#L194

cruzan-lab commented 11 months ago

No, I am not simulating a grid, just the standard island model (sorry for the confusion).

The instructions are confusing because it says: "migrIslandRates(r, n) returns a n×n migration matrix"

Now I uderstand that n is the total number of populations.

Thank you for your help.

Mitch

BoPeng commented 11 months ago

$m_ij$ in the nxn matrix is basically the migration rate from subpop $i$ to $j$.

cruzan-lab commented 10 months ago

You can close this issue.