BoPeng / simuPOP

A general-purpose forward-time population genetics simulation environment.
http://bopeng.github.io/simuPOP/
GNU General Public License v2.0
31 stars 12 forks source link

complex demography simulation #8

Closed fmarande closed 7 years ago

fmarande commented 8 years ago

Dear Mr Peng,

I'am currently trying to simulate a complex demography population with specific age-survival, sexual maturity at age 5 and a maximum age at 15 years. My code is largely inspired from Tiago Antao book - Bioinformatics with Python Cookbook and I've to say I'm quite new with python and simuPOP. Final aim is to simulate a large number of individuals (>1 000 000) and a large number of SNPs (>2 000) for assessing the power of effective population size estimators for large natural populations.

My code works relatively well : sex ratio is around 1:1, age structure is stable after 10 generations and the export of SNPs data set is working but my code sometimes doesn't work (the frequence of the error increases with the population size).

Mating scheme 0 in subpopulation 0 failed to produce 407 offspring. Traceback (most recent call last):

File "", line 155, in gen=num_gens)

RuntimeError: utility.cpp: 1101 Iterator returns NULL item.

As this error occurs sometimes, I've the feeling (maybe wrong) that it's due to the initialisation process. Maybe the age structure initialisation which is currently randomly assigned :

init_ops['Age'] = sp.InitInfo(lambda: random.randint(0, len(survival_male) - 1), infoFields='age')

Is there a way to assign age structure with frequency ? (If it's in the simuPOP help, I'm sorry I didn't find it) to avoid an assignment of only immature individuals.

I'm not sure it is the issue if you have any other idea just tell me. I'm starting to be desperate about finding what's wrong....

Thank you very much ! Best Regards,

Florianne

PS : the complet code is above

# -*- coding: utf-8 -*-
"""
Created on Wed Oct 19 18:25:50 2016

@author: fmarande
"""

###########################################
#Modules Import
###########################################

from __future__ import division
from collections import OrderedDict
import random

import simuPOP as sp
from simuPOP.utils import export ## FUNCTION TO EXPORT DATA 

###########################################
#Usefull functions definitions
###########################################

#Remove individuals according to their survival-at-age
# adapted from Tiago Antao book                
def kill(pop):
    kills = []
    for i in pop.individuals():
        if i.sex() == 1:
            #remove males according to their survival curve
            cut = pop.dvars().survival_male[int(i.age)]
        else:
            #remove females according to their survival curve        
            cut = pop.dvars().survival_female[int(i.age)]
        if random.random() > cut:
            kills.append(i.ind_id)
    #remove selected individuals
    pop.removeIndividuals(IDs=kills)
    return True

#Choice of the reproductive individuals
# adapted from Tiago Antao book                
def choose_parents(pop):
    #name convention required
    fathers = []
    mothers = []

    for ind in pop.individuals():
        if ind.sex() == 1:
            #choice of the male according to their fecundity
            fathers.extend([ind] * pop.dvars().male_age_fecundity[int(ind.age)])
        else:
            #choice of the female according to their fecundity
            mothers.extend([ind] * pop.dvars().female_age_fecundity[int(ind.age)])
    while True:
        #random choice of parents in the parents pool
        father = random.choice(fathers)
        mother = random.choice(mothers)
        yield father, mother

#function for having a constant population size 
# adapted from Tiago Antao book                
def calc_demo(gen, pop):
        add_females = len([ind for ind in pop.individuals([0, 2]) if ind.sex() == 2])
        return pop_size + add_females

#Mating scheme function for keeping already here individuals and adding newborns
# adapted from Tiago Antao book                
mating_scheme = sp.HeteroMating([
    sp.HomoMating(
        sp.PyParentsChooser(choose_parents),
        sp.OffspringGenerator(numOffspring=140, 
                              ops=[sp.MendelianGenoTransmitter(),
                                   sp.IdTagger()],
                              sexMode=(sp.PROB_OF_MALES, 0.5)),
        weight=1),
    sp.CloneMating(weight=-1)],
    subPopSize=calc_demo)

###########################################
#General settlement
###########################################

num_rep=1 #Number of replicas
pop_size = 1000
num_loci = 200
num_alleles = 2 #SNPs

#Simulations lengths
num_gens = 20 #10 years enough for reach an equilibrium
#Fecundity
#maturity at age 6
male_age_fecundity = [0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1]
female_age_fecundity = [0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1]

#Survival
#senescence from age 12
survival_male = [0.2, 0.4, 0.55, 0.67,0.73,0.77,0.80,0.82,0.83,0.85,0.86,0.87,0.8,0.65,0.5,0.33,0.17,0.01]
survival_female = [0.2, 0.4, 0.55, 0.67,0.73,0.77,0.80,0.82,0.83,0.85,0.86,0.87,0.8,0.65,0.5,0.33,0.17,0.01]

#Population set-up
pops = sp.Population(pop_size, loci=[1] * num_loci, 
                     infoFields=['age', 'ind_id'],
                     alleleNames=['1','2'])
pops.setVirtualSplitter(sp.InfoSplitter(field='age', 
                                        cutoff=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]))

###########################################
#Operations settlement
###########################################

init_ops = OrderedDict()
pre_ops = OrderedDict()
post_ops = OrderedDict()

#Age varying demographic variables
def init_age(pop):
    pop.dvars().male_age_fecundity = male_age_fecundity
    pop.dvars().female_age_fecundity = female_age_fecundity
    pop.dvars().survival_male = survival_male
    pop.dvars().survival_female = survival_female
    return True

init_ops['Sex'] = sp.InitSex(maleFreq=0.5)
init_ops['ID'] = sp.IdTagger()
init_ops['Freq'] = sp.InitGenotype(freq=[1 / num_alleles] * num_alleles)
init_ops['Age-prepare'] = sp.PyOperator(init_age)
init_ops['Age'] = sp.InitInfo(lambda: random.randint(0, len(survival_male) - 1), infoFields='age')
pre_ops['Kill'] = sp.PyOperator(kill)
pre_ops['Age'] = sp.InfoExec('age += 1')
post_ops['savepop'] = sp.SavePopulation('!"./savepop%d%d.pop"%(rep,gen)')

###########################################
#Simulations
###########################################

sim = sp.Simulator(pops, rep=1)
sim.evolve(initOps=init_ops.values(), 
           preOps=pre_ops.values(),
           postOps=post_ops.values(),
           matingScheme=mating_scheme, 
           gen=num_gens)

###########################################
#Exportation csv et FSTAT
###########################################

for x in range (num_rep):
    for g in range(num_gens):
        pop = sp.loadPopulation('./savepop%d%d.pop'%(x,g))
        export(pop,infoFields=['age', 'ind_id'],
            format='csv',output='./myPOP_'+str(x)+'_'+str(g)+'.csv', gui=False)
        export(pop,infoFields=['age', 'ind_id'],
            format='FSTAT',output='./myPOP_'+str(x)+'_'+str(g)+'.dat', gui=False)

Reference : Tiago Antao - 2015 - Bioinformatics with Python Cookbook - Packt Publishing Limited - pages 143-149

BoPeng commented 8 years ago

Hi, Florianne,

I can reproduce your problem with python 3 (with some slight change of code style). After I wrap the parent-determination code with

    try:
        for ind in pop.individuals():
            if ind.sex() == 1:
                #choice of the male according to their fecundity
                fathers.extend([ind] * pop.dvars().male_age_fecundity[int(ind.age)])
            else:
                #choice of the female according to their fecundity
                mothers.extend([ind] * pop.dvars().female_age_fecundity[int(ind.age)])
    except:
        print(pop.dvars().female_age_fecundity)
        print(pop.dvars().male_age_fecundity)
        print(set(int(x.age) for x in pop.individuals()))
        sys.exit()

I found that the error happens with

[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18}

I can see that you are accessing the 19th element of a list of size 18, which caused an index error and broke the parent selection process.

So it appears to me that you should change

                mothers.extend([ind] * pop.dvars().female_age_fecundity[int(ind.age)])

and the father part to

                mothers.extend([ind] * pop.dvars().female_age_fecundity[int(ind.age)-1])
fmarande commented 8 years ago

Hello Bo, thank you for this quick answer and for your help !!

Here's the new choose_parent function as you suggest :

def choose_parents(pop):
    #name convention required
    fathers = []
    mothers = []

    for ind in pop.individuals():
        if ind.sex() == 1:
            #choice of the male according to their fecundity
            fathers.extend([ind] * pop.dvars().male_age_fecundity[int(ind.age)-1])
        else:
            #choice of the female according to their fecundity
            mothers.extend([ind] * pop.dvars().female_age_fecundity[int(ind.age)-1])
    while True:
        #random choice of parents in the parents pool
        father = random.choice(fathers)
        mother = random.choice(mothers)
        yield father, mother

However I tried to run it and it works sometimes and sometimes a new error appears :

Traceback (most recent call last):

File "", line 155, in gen=num_gens)

ValueError: Function call failed

If you don't have this error, I suppose it can be due to my python settlement ?

Florianne

BoPeng commented 8 years ago

I got

Traceback (most recent call last):
  File "test.py", line 35, in kill
    cut = pop.dvars().survival_female[int(i.age)]
IndexError: list index out of range

so it looks like survival_female is also shorter than i.age? Please check carefully about such things. The try...except approach I used can be really helpful in such cases.

fmarande commented 8 years ago

Thanks, I will try to track all the dimension things. I don't find the same error as you but I will continue digging this way. Thanks you for answer !

But what I really don't understand is that the issue is related to "index out if range". It should never run and it should not run sometimes and sometimes not run ?

BoPeng commented 8 years ago

The whole evolutionary process is random. You have a small population size, so the probability that some parent having the extreme age (18) is chosen for reproduction is rather low. This is why the IndexError only happens from time to time. If you significantly increase the size of population, such error will almost surely occur each time.

fmarande commented 8 years ago

Ok, now I understand better what's the matter is. I will try to check all the index length. Thank you for all your help ! It was very usefull and I hope I will succeed to fix it !