tskit-dev / msprime-1.0-paper

Publication describing msprime 1.0

Recomb rate for discoal plot #214

Closed jeromekelleher closed 2 years ago

jeromekelleher commented 2 years ago

Why is the recombination rate used here an order of magnitude lower than in the other examples? This choice seems arbitrary and makes it difficult to relate these benchmark results to any others in the paper. Are the samples diploid or haploid? I would suggest including the absolute Ne and s values in the legend, in addition to the scaled rate.

It's probably simplest to just rerun this with recombination rate = 10^-8 and a sequence length 10x shorter I think? @andrewkern?
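To illustrate the arithmetic behind this suggestion, here is a minimal sketch assuming the standard population-scaled recombination rate rho = 4 * Ne * r * L. The "current" values (r = 1e-9 over 1 Mb) are hypothetical placeholders inferred from "an order of magnitude lower", not values read from the script, and `scaled_recombination` is an illustrative helper rather than part of msprime or discoal.

```python
import math


def scaled_recombination(ne, r, seq_len):
    """Population-scaled recombination rate rho = 4 * Ne * r * L."""
    return 4 * ne * r * seq_len


NE = 10**4  # Ne quoted in this thread for the rest of the paper

# Hypothetical current setting (assumption): r ten times lower than 1e-8, 1 Mb sequence.
rho_current = scaled_recombination(NE, 1e-9, 1_000_000)

# Proposed setting: r = 1e-8 with a sequence length 10x shorter.
rho_proposed = scaled_recombination(NE, 1e-8, 100_000)

print(rho_current, rho_proposed)
print(math.isclose(rho_current, rho_proposed))
```

Under these assumptions the two settings give the same scaled rate, so the rerun would do a comparable amount of recombination work while using the per-base rate quoted elsewhere in the paper.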

jeromekelleher commented 2 years ago

Although I see refsize=1e6, which also makes things less comparable with the rest of the paper (Ne=10^4, r=1e-8). Is there any particular reason for these parameter values @andrewkern? What do you suggest doing here?

https://github.com/tskit-dev/msprime-1.0-paper/blob/b872809017835085c5643c912d760cecbf6d9a72/evaluation/generate_sweeps_perf_data.py#L155

jeromekelleher commented 2 years ago

Also, if we're redoing this figure could we do it in such a way that we report the time for one replicate rather than 100? See Reviewer 1:

"And based on the descriptions sometimes replicates are summed over (Figure 7) or averaged over (Figure 8?) to determine the values on the y-axis."

We can explain that it's tricky finding parameter combos where discoal will run in a reasonable amount of time, but it must be possible if we squeeze the sequence length space enough?
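A rough sketch of reporting per-replicate time rather than the total over all replicates, using only the standard library. The names `time_replicates` and `dummy_simulation` are hypothetical stand-ins; the real benchmark would call the simulator in place of the dummy workload.

```python
import time


def time_replicates(run_once, num_replicates):
    """Time each replicate separately and report total and mean per-replicate time."""
    durations = []
    for _ in range(num_replicates):
        start = time.perf_counter()
        run_once()
        durations.append(time.perf_counter() - start)
    total = sum(durations)
    return {
        "total": total,
        "per_replicate": total / num_replicates,
        "durations": durations,
    }


def dummy_simulation():
    # Placeholder workload (assumption); substitute the actual simulator call here.
    sum(i * i for i in range(10_000))


result = time_replicates(dummy_simulation, num_replicates=100)
print(f"total: {result['total']:.4f}s, per replicate: {result['per_replicate']:.6f}s")
```

Keeping the individual durations also makes it possible to state explicitly in the figure legend whether the y-axis shows a sum or a mean.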

andrewkern commented 2 years ago

sure, i can make these changes. no problem.

andrewkern commented 2 years ago

@jeromekelleher is there a more up to date Makefile that didn't get committed here?

jeromekelleher commented 2 years ago

The Makefile should be fully up to date, but I'm not sure if it contained all the linkage for making the data (I probably didn't bother with some of that and just ran the scripts).

andrewkern commented 2 years ago

👍