szpiech / selscan

Haplotype based scans for selection
GNU General Public License v3.0

questions about normalized xpnsl output #58

Closed biozzq closed 3 years ago

biozzq commented 3 years ago

Dear @szpiech

Regarding the following two windows: the fractions of scores greater than and less than the threshold are both 0, yet both windows are reported in the top 5% for the ref population. In addition, the absolute values of the minimum scores in these two windows are both less than 2. I normalized all chromosomes jointly (norm --xpnsl --bp-win --winsize 20000 --files *.out), and I do not understand why these two windows are treated as top 5% results for the ref population. Thanks in advance for your help.

<# scores in win>

```
1 20001 171 0 0 100 5 1.57122 -0.640773
20001 40001 282 0 0 100 5 0.930531 -1.13342
```

The log output is also attached here:

```
Total loci: 60557643
num mean variance
60504962 -0.241009 0.0543763
112870 windows with nSNPs >= 10.
High Scores
nSNPs 1.0 5.0
301 0.995025 0.430535
358 0.83499 0.249717
408 0.686735 0.2064
457 0.587959 0.168955
511 0.55085 0.183081
571 0.520534 0.17089
637 0.401172 0.136431
714 0.351632 0.114526
813 0.322162 0.0932507
1707 0.253152 0.062424
Low Scores
nSNPs 1.0 5.0
301 0.0849607 0
358 0.124551 0.00306748
408 0.134479 0.00778817
457 0.172407 0.0168859
511 0.21275 0.031341
571 0.257647 0.053631
637 0.29457 0.0866893
714 0.329173 0.120557
813 0.413136 0.163197
1707 0.529344 0.247331
```

Sincerely,
Zheng zhuqing

szpiech commented 3 years ago

Hi there,

Based on the log output, it looks like over 95% of the windows with <= 301 scores contain no xpnsl scores below -2.

```
Low Scores
nSNPs 1.0 5.0
301 0.0849607 0
```

Unfortunately, when this is the case, at least 95% of the per-window fractions in the 301 bin are less than or equal to 0, so 0 gets chosen as the threshold for the top 5% of windows with high proportions of negative scores in that bin. Clearly these regions aren't particularly interesting, given that ALL windows in the 301 bin would be classified as being in the top 5%. I would probably increase your window size in this case, since on the flip side there seem to be quite a lot of windows in the 301 bin with lots of xpnsl scores > 2.

```
High Scores
nSNPs 1.0 5.0
301 0.995025 0.430535
```
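
To illustrate the threshold behaviour described above, here is a minimal Python sketch. It is not norm's actual code: the quantile rule, the function name `top_share_threshold`, and the simulated bin of windows are all assumptions made purely for illustration.

```python
def top_share_threshold(fracs, top):
    """Smallest value among the top `top` share of `fracs`.

    Assumed rule for illustration only; norm may compute its
    per-bin thresholds differently.
    """
    ordered = sorted(fracs)
    k = max(1, round(len(ordered) * top))  # number of windows in the top share
    return ordered[len(ordered) - k]


# Hypothetical bin of 1000 windows: 970 have no xpnsl score below -2,
# the remaining 30 have increasing fractions of such scores.
fracs = [0.0] * 970 + [0.01 * (i + 1) for i in range(30)]

thr5 = top_share_threshold(fracs, 0.05)
thr1 = top_share_threshold(fracs, 0.01)

# A window is flagged when its fraction reaches the threshold.
flagged5 = sum(f >= thr5 for f in fracs)
flagged1 = sum(f >= thr1 for f in fracs)

print("5% threshold:", thr5, "windows flagged:", flagged5)
print("1% threshold:", thr1, "windows flagged:", flagged1)
```

With these made-up numbers the 5% threshold comes out as 0, so every window in the bin reaches it, while the 1% threshold stays positive and remains selective. That is the same pattern as the `301 0.0849607 0` line in the log.
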
biozzq commented 3 years ago

Hi @szpiech

Thank you.

Best,