petrelharp / treestats_ms


plot diversity around a sweep #4

Closed petrelharp closed 5 years ago

petrelharp commented 5 years ago

To make the point that site stats just estimate branch stats, I'm thinking of doing something like this: simulate some sweeps, and plot both site and branch diversity along the genome. Here's an example: swept 44 1e-09 diversity

Maybe... it has too many sweeps? With fewer we could also put the selection coefficients next to the vertical dotted blue lines. Also, I'm thinking of adding at least one more "site" line with a lower mutation rate, to show that it's noisier?
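To make the "lower mutation rate is noisier" idea concrete, here is a minimal stdlib-only sketch. It is not the code behind the figure. The four-sample tree, the span, the mutation rates, and every function name are hypothetical, chosen only for illustration. Branch diversity is the mean branch length separating a pair of samples. Site diversity comes from Poisson-distributed mutations on each branch, and dividing it by the mutation rate puts it on the same scale as the branch value.

```python
import random
from statistics import mean, pvariance

# Hypothetical 4-sample tree: (branch length in generations, samples below).
BRANCHES = [(100, 1), (100, 1), (300, 1), (300, 1), (900, 2), (700, 2)]
N_SAMPLES = 4
N_PAIRS = N_SAMPLES * (N_SAMPLES - 1) // 2


def pair_weight(k):
    """Fraction of sample pairs separated by a branch with k samples below."""
    return k * (N_SAMPLES - k) / N_PAIRS


def branch_diversity():
    """Mean branch length separating a pair of samples."""
    return sum(length * pair_weight(k) for length, k in BRANCHES)


def poisson(rng, lam):
    """Draw a Poisson(lam) count by counting exponential arrivals in [0, 1)."""
    count, t = 0, rng.expovariate(lam)
    while t < 1.0:
        count += 1
        t += rng.expovariate(lam)
    return count


def scaled_site_diversity(rng, mu, span):
    """Site diversity per base from one mutation replicate, divided by mu."""
    diffs = sum(
        poisson(rng, mu * span * length) * pair_weight(k)
        for length, k in BRANCHES
    )
    return diffs / span / mu


def replicate(mu, span=1e5, reps=2000, seed=1):
    """Scaled site diversity for `reps` independent mutation replicates."""
    rng = random.Random(seed)
    return [scaled_site_diversity(rng, mu, span) for _ in range(reps)]


if __name__ == "__main__":
    print("branch:", branch_diversity())
    for mu in (1e-8, 1e-9):
        est = replicate(mu)
        print(mu, "mean:", mean(est), "variance:", pvariance(est))
```

Both mutation rates should give estimates centred on the branch value, with the lower rate spread more widely around it. With a real tree sequence, tskit's `TreeSequence.diversity` takes `mode="site"` or `mode="branch"` to compute the two versions directly.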

petrelharp commented 5 years ago

And, damn this was easy to do!

jeromekelleher commented 5 years ago

LGTM. One thing we might do to make it more concrete is to generate lots of different replicates of neutral mutations on top of the ts, and show the 95% CI (or something) as a grey band around the branch stats? So, giving the sense that we're getting the mean of a noisy distribution when we use branch stats.
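The grey-band suggestion could be sketched roughly as follows, again with the stdlib only and under loud assumptions. The tree, the per-window scale factors that mimic a dip in diversity near a sweep, the span, the mutation rate, and all function names are hypothetical. For each window it throws many replicates of neutral mutations onto the same tree and records the 2.5th and 97.5th percentiles of the scaled site diversity next to the branch value.

```python
import random

# Hypothetical 4-sample tree: (branch length in generations, samples below).
BRANCHES = [(100, 1), (100, 1), (300, 1), (300, 1), (900, 2), (700, 2)]
N_SAMPLES = 4
N_PAIRS = N_SAMPLES * (N_SAMPLES - 1) // 2
# Hypothetical per-window scaling of tree height; the dip mimics a sweep.
WINDOW_SCALES = [1.0, 0.6, 0.1, 0.5, 1.0]


def pair_weight(k):
    """Fraction of sample pairs separated by a branch with k samples below."""
    return k * (N_SAMPLES - k) / N_PAIRS


def poisson(rng, lam):
    """Draw a Poisson(lam) count by counting exponential arrivals in [0, 1)."""
    count, t = 0, rng.expovariate(lam)
    while t < 1.0:
        count += 1
        t += rng.expovariate(lam)
    return count


def branch_diversity(scale):
    """Mean branch length separating a pair of samples in a scaled tree."""
    return sum(scale * length * pair_weight(k) for length, k in BRANCHES)


def site_replicates(rng, scale, mu, span, reps):
    """Scaled site diversity (divided by mu) for `reps` mutation replicates."""
    out = []
    for _ in range(reps):
        diffs = sum(
            poisson(rng, mu * span * scale * length) * pair_weight(k)
            for length, k in BRANCHES
        )
        out.append(diffs / span / mu)
    return out


def bands(mu=1e-8, span=1e6, reps=1000, seed=1):
    """Return (branch value, 2.5th percentile, 97.5th percentile) per window."""
    rng = random.Random(seed)
    rows = []
    for scale in WINDOW_SCALES:
        est = sorted(site_replicates(rng, scale, mu, span, reps))
        lo, hi = est[int(0.025 * reps)], est[int(0.975 * reps)]
        rows.append((branch_diversity(scale), lo, hi))
    return rows


if __name__ == "__main__":
    for branch, lo, hi in bands():
        print(f"branch={branch:8.1f}  band=({lo:8.1f}, {hi:8.1f})")
```

Plotting `lo` and `hi` as a shaded region around the branch line would give the band described above. On a real tree sequence the replicates would come from re-running a mutation simulator with different seeds, not from this toy tree.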

jeromekelleher commented 5 years ago

I haven't seen the figure, but the code looks like it's doing something very nice. Merge away whenever you like.

petrelharp commented 5 years ago

Whoops; I accidentally posted this stuff to the wrong tab, over in https://github.com/tskit-dev/tskit/pull/248

With 1000 individuals: Screenshot from 2019-07-14 09-46-28

and with 10000: Screenshot from 2019-07-14 09-46-17

The 10,000 individual one was taking hours, but it turned out that was entirely due to this stuff. Gotta do that more efficiently.

petrelharp commented 5 years ago

Ok, running these things takes very little time now. Mostly putting this stuff in a PR because it is exciting! Will merge now.