popsim-consortium / stdpopsim

A library of standard population genetic models
GNU General Public License v3.0

add selfing rate #857

Open petrelharp opened 3 years ago

petrelharp commented 3 years ago

Many species are partial selfers; for instance C. elegans (see #834) has a selfing rate of about 99.9%. SLiM has a "selfing rate" option that we can use directly; for msprime we'd need to rescale things (see for instance Nordborg & Donnelly). So, maybe we need to

Note that it's a little tricky to "use this information" in an engine. For instance, a selfing rate s multiplies Ne by (2-s)/2, which suggests we'd want to simulate with msprime using Ne = population_size * (2 - selfing_rate) / 2. However, estimates of Ne are probably based on genetic diversity, and so probably already include this factor.
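
To make the rescaling concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical (they are not part of stdpopsim, msprime, or SLiM), and the relation F = s / (2 - s) is the standard equilibrium inbreeding coefficient under partial selfing, which is an assumption added here rather than something stated in this thread.

```python
def inbreeding_coefficient(selfing_rate: float) -> float:
    """Equilibrium inbreeding coefficient F = s / (2 - s) under partial selfing."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing_rate must be between 0 and 1")
    return selfing_rate / (2.0 - selfing_rate)


def rescaled_ne(population_size: float, selfing_rate: float) -> float:
    """Rescale a population size by (2 - s) / 2, which equals 1 / (1 + F)."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing_rate must be between 0 and 1")
    return population_size * (2.0 - selfing_rate) / 2.0


# Hypothetical numbers: 1000 individuals with a C. elegans-like selfing rate.
ne = rescaled_ne(1000, 0.999)
```

With s close to 1 the factor is close to 1/2, so whether the published sizes already include it matters by roughly a factor of two.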

grahamgower commented 3 years ago

FYI, demes includes a selfing_rate (and a cloning_rate).

grahamgower commented 3 years ago

This is a duplicate of #715. @petrelharp, I'm not sure which issue you'd prefer to keep open?

petrelharp commented 3 years ago

I'll close the other, but quote @pblischak from it here:

I think the main issue for a coalescent simulator and selfing is that it can't capture the increase in homozygosity within individuals if chromosomes, rather than individuals, are randomly sampled from the population. Both theta and rho can be scaled by 1+F like Ryan mentioned but the non-random sampling is what may cause the bigger issue. People have typically dealt with this by only sampling one allele per SNP (or some version of this), effectively "haploidizing" their data.
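
The "haploidizing" step mentioned in the quote can be sketched roughly as follows. This is only an illustration under assumed inputs: the `haploidize` function and the nested-list genotype layout are hypothetical, not an existing stdpopsim or msprime API.

```python
import random


def haploidize(genotypes, seed=None):
    """Keep one randomly chosen allele per SNP for each diploid individual.

    genotypes: list of individuals, each a list of (allele_a, allele_b)
    pairs, one pair per SNP. Returns a list of lists of single alleles.
    """
    rng = random.Random(seed)
    return [[rng.choice(pair) for pair in individual] for individual in genotypes]


# Hypothetical data: two highly homozygous individuals typed at three SNPs.
diploid = [
    [(0, 0), (1, 1), (0, 1)],
    [(1, 1), (0, 0), (0, 0)],
]
haploid = haploidize(diploid, seed=1)
```

At homozygous sites the choice makes no difference, so the random draw only affects the (rare, under heavy selfing) heterozygous sites.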

petrelharp commented 2 years ago

Discussed today: implementing selfing means that we also need to figure out what the demographic models inferred for selfing species actually mean. The current proposal is (very roughly): if we turn on selfing, should we also multiply all the population sizes by 2, for demographic inferences that don't explicitly include selfing?
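
As a rough check on the "multiply by 2" proposal, the sketch below inverts the (2-s)/2 factor discussed earlier in this thread. It assumes the inferred sizes are diversity-based Ne values from a model fitted without selfing; the function name and epoch sizes are hypothetical.

```python
def size_scaling_for_selfing(selfing_rate: float) -> float:
    """Factor 2 / (2 - s) that undoes the (2 - s) / 2 reduction in Ne."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing_rate must be between 0 and 1")
    return 2.0 / (2.0 - selfing_rate)


# Hypothetical epoch sizes from a model inferred without explicit selfing.
inferred_sizes = [10_000, 2_500, 40_000]

factor = size_scaling_for_selfing(0.999)
rescaled_sizes = [size * factor for size in inferred_sizes]
```

For nearly complete selfers the factor is just under 2, which matches the rough proposal; for intermediate selfing rates it would be smaller (4/3 at s = 0.5).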