Closed cistarsa closed 3 years ago
Wow, interesting. I haven't had a chance to explore the behavior of m generally (i.e., not just in regions with high support for a distorted HFS, very large T). I might guess that is it picking up something going on with the demographic history in pop 2. Does it differ substantially from pop 1 post-dvergence?
Indeed it does! These populations have pretty different Ne, heterozygosity, and diverged ~300 generations ago. Perhaps these are artifacts from a bottleneck or founder? Although it could be biologically consistent I guess I'm still surprised to see so so many more in pop 2. Maybe ill run saltilassi
Also wondering how it handles missing data?
At the moment, the software will drop any haplotype that has missing data for that window's HFS computation. With a large sample size, this shouldn't matter much, but if you have a smaller sample size with lots of missing data it may be worth filtering those loci (or imputing) until I come up with a better way to handle it.
One thing to keep in mind is that as the window size grows larger (when there is missing data present) the more likely you are to encounter at least one missing allele on each haplotype in that window, and then the computation will fail in that window for lack of data.
good to keep in mind, thank you for the quick and helpful feedback!
hello and thanks for running this nifty program! I'm trying to compare these results with RAiSD (hard selective sweep) on two divergent but related populations (n=25 & 28) but for one of these I'm getting a huge amount of "1" value haplotypes.