xiaoming-liu / stairway-plot-v2

The stairway plot is a method for inferring detailed population demographic history using the site frequency spectrum (SFS) from DNA sequence data.

Which values to use from realSFS folded spectrum? #18

Open mrescalona opened 3 months ago

mrescalona commented 3 months ago

I'm using angsd's realSFS -fold 1 to create a folded SFS from my VCF file with 25 diploid samples.

The command returns 51 bins covering allele counts 0-50; the values in bins 0-25 are all non-zero and the rest are zeros. In an unfolded spectrum, I'd remove the first and last bins (monomorphic) and be left with 2n-1 = 49 bins (assuming all 51 bins are non-zero).

For the folded SFS, I have n+1 = 26 informative bins (n = 25 diploid samples), ranging from 0 to 25. Which values should I keep, in this case?

Here is the folded SFS provided by angsd 22.779504 5737.983351 2271.759531 890.704561 602.396447 453.041958 378.123061 277.515939 310.205540 177.239056 232.210910 212.339813 174.853681 142.583014 126.940380 193.427710 103.595297 161.210691 113.174401 180.349763 112.341701 113.933613 110.531206 104.876343 124.583636 44.298895 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
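
To make the bin arithmetic concrete, here is a minimal Python sketch of one possible reading: drop bin 0 (monomorphic sites) and the empty upper half, keeping minor allele counts 1 to 25. This is an assumption about what a folded input should contain, not a confirmed answer from the stairway plot authors. The helper name `folded_polymorphic_bins` and its `n_diploid` parameter are hypothetical and are not part of angsd or stairway plot.

```python
def folded_polymorphic_bins(realsfs_line: str, n_diploid: int) -> list[float]:
    """Return bins 1..n_diploid (minor allele counts) of a folded realSFS line.

    Assumption: the line has 2*n_diploid + 1 whitespace-separated values and
    the bins above n_diploid are all zero, as in the output quoted above.
    """
    values = [float(tok) for tok in realsfs_line.split()]
    n_chrom = 2 * n_diploid
    if len(values) != n_chrom + 1:
        raise ValueError(f"expected {n_chrom + 1} bins, got {len(values)}")
    # A folded spectrum should carry nothing above n_chrom / 2.
    if any(v != 0.0 for v in values[n_diploid + 1:]):
        raise ValueError("upper bins are non-zero; spectrum does not look folded")
    # Drop bin 0 (monomorphic) and the empty upper half.
    return values[1:n_diploid + 1]


# The 26 non-zero values from the post, followed by the 25 zero bins.
nonzero = (
    "22.779504 5737.983351 2271.759531 890.704561 602.396447 453.041958 "
    "378.123061 277.515939 310.205540 177.239056 232.210910 212.339813 "
    "174.853681 142.583014 126.940380 193.427710 103.595297 161.210691 "
    "113.174401 180.349763 112.341701 113.933613 110.531206 104.876343 "
    "124.583636 44.298895"
)
line = nonzero + " " + " ".join(["0.000000"] * 25)

kept = folded_polymorphic_bins(line, n_diploid=25)
print(len(kept))           # 25 values: minor allele counts 1..25
print(kept[0], kept[-1])   # 5737.983351 44.298895
```

Under this reading the kept vector has n = 25 entries, one per minor allele count, and the leading 22.779504 (bin 0) is excluded along with the trailing zeros.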