tskit-dev / msprime

Simulate genealogical trees and genomic sequence data using population genetic models
GNU General Public License v3.0

Interaction between "population size" and "recombination rate" missing? #2025

Open molpopgen opened 2 years ago

molpopgen commented 2 years ago

(This is related to #2024)

The docs for recombination_rate under sim_ancestry state:

See the [Recombination](https://tskit.dev/msprime/docs/stable/ancestry.html#sec-ancestry-recombination) section for usage examples for this parameter and how it interacts with other parameters.

However, the Recombination section does not discuss how recombination_rate interacts with other parameters. In particular, it says nothing about population_size, which makes a big difference to run times (among other things) because it changes the scaled rates:

>>> import msprime
>>> for N in [1, 100, 1000]:
...     x = msprime.sim_ancestry(40, sequence_length=50000000, recombination_rate=1e-8, population_size=N)
...     print(x.num_trees)
... 
9
847
8782
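
A rough back-of-the-envelope sketch of why the tree counts above grow roughly linearly with N. It assumes a single constant-size diploid population under the standard coalescent, where the scaled recombination rate is rho = 4 * N * r * L and the expected number of recombination events in the history of n sample genomes is approximately rho * sum(1/i for i in 1..n-1). The function names are hypothetical helpers, not part of msprime, and this is only an approximation to what the simulator does.

```python
def scaled_recombination_rate(population_size, recombination_rate, sequence_length):
    """Scaled rate rho = 4 * N * r * L (assumes diploids, constant size)."""
    return 4 * population_size * recombination_rate * sequence_length


def expected_recombination_events(num_individuals, population_size,
                                  recombination_rate, sequence_length):
    """Approximate expected number of recombination events in the sample
    history: rho times the harmonic sum over n - 1 terms, where n is the
    number of sample genomes (two per diploid individual)."""
    n = 2 * num_individuals
    harmonic = sum(1.0 / i for i in range(1, n))
    rho = scaled_recombination_rate(
        population_size, recombination_rate, sequence_length)
    return rho * harmonic


# Same parameters as the sim_ancestry calls above.
for N in [1, 100, 1000]:
    events = expected_recombination_events(40, N, 1e-8, 50_000_000)
    print(N, round(events))
```

Under these assumptions the expected event count scales by the same factor as N, which matches the order of magnitude of the num_trees values above. The two numbers need not agree exactly, since not every recombination event has to produce a distinct tree.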

benjeffery commented 2 years ago

I thought this might have been lost in the big doc re-org that happened for 1.0, but I can't see such a mention in the old 0.7 docs either. Would you mind drafting something @molpopgen?

molpopgen commented 2 years ago

Sure. I'll try to get to it this week.

molpopgen commented 2 years ago

There are some unresolved comments over in #2026, so I'll be punting on this for a bit.