szpiech / selscan

Haplotype based scans for selection
GNU General Public License v3.0

allele frequency and ihs value #71

Closed sadiexiaoyu closed 2 years ago

sadiexiaoyu commented 2 years ago

Hi, I am using selscan to run iHS on genome data, and I am confused about the output. For example, the following output indicates that the locus 4:1803271 is under positive selection. Since the value is negative (-2.15776), I take it to mean that the derived allele is under positive selection. But why is the frequency of the derived allele only 0.122186? I would expect the allele under positive selection to be at high frequency in the test population, no? Could you give me any suggestions? Thanks!

4:1801542 1801542 0.350482 4684.5 18136 -0.587879 -2.61532 1
4:1801697 1801697 0.350482 4684.5 18136 -0.587879 -2.61532 1
4:1801866 1801866 0.0514469 33564.6 8902.33 0.576378 -1.43333 0
4:1801894 1801894 0.114148 17033.5 10264.3 0.219976 -1.89608 0
4:1802211 1802211 0.0514469 33564.6 8850.4 0.578918 -1.42421 0
4:1802278 1802278 0.414791 2937.56 21703.8 -0.868548 -3.09933 1
4:1802319 1802319 0.0514469 93138.2 9208.57 1.00494 0.104621 0
4:1803236 1803236 0.0836013 37958.5 9776.42 0.589129 -0.948235 0
4:1803251 1803251 0.360129 3590.78 18739.4 -0.717566 -2.96716 1
4:1803271 1803271 0.122186 13587.2 10457.2 0.113715 -2.15776 1
4:1803307 1803307 0.461415 2165.89 25023.2 -1.06271 -3.38237 1
4:1803704 1803704 0.400322 3292.93 22161.5 -0.828018 -2.99851 1
4:1803824 1803824 0.409968 3064.04 23174.6 -0.878717 -3.15625 1
4:1803926 1803926 0.0836013 38106.4 10743.9 0.549838 -1.08975 0
4:1803970 1803970 0.175241 7499.24 13212.8 -0.245978 -2.93009 1

szpiech commented 2 years ago

Hi there,

selscan's implementation of iHS is slightly different from Voight et al. (2006), in that selscan reports log(iHH1/iHH0) instead of log(iHH0/iHH1). The only practical difference is that the interpretation of the sign switches. When iHS > 2, this indicates long high-frequency haplotypes attached to the derived allele ('1'-allele) at the query location, and when iHS < -2, this indicates long high-frequency haplotypes attached to the ancestral allele ('0'-allele) at the query location. Clusters of extreme iHS scores suggest evidence of positive selection.
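The sign convention above can be checked against the rows posted earlier in this thread. The following is a minimal sketch, not selscan's own code: the function names are made up for illustration, the base-10 logarithm is an assumption inferred from the posted numbers (it reproduces the sixth column from the fourth and fifth), and the cutoff of 2 is the one mentioned above.

```python
import math


def unstandardized_ihs(ihh1: float, ihh0: float) -> float:
    """Log ratio in the orientation described above: iHH1 over iHH0.

    Base 10 is an assumption inferred from the posted output rows.
    """
    return math.log10(ihh1 / ihh0)


def long_haplotype_allele(standardized_ihs: float, threshold: float = 2.0) -> str:
    """Say which allele carries the unusually long haplotypes.

    Assumes '1' codes the derived allele and '0' the ancestral allele.
    """
    if standardized_ihs > threshold:
        return "derived (1)"
    if standardized_ihs < -threshold:
        return "ancestral (0)"
    return "not extreme"


# Row 4:1803271 from the posted output: iHH1 = 13587.2, iHH0 = 10457.2
print(unstandardized_ihs(13587.2, 10457.2))
print(long_haplotype_allele(-2.15776))
```

Under this reading, the standardized score of -2.15776 at 4:1803271 points to long haplotypes on the ancestral ('0') allele, not on the derived allele at frequency 0.122186.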

Hope this helps,

Zachary

sadiexiaoyu commented 2 years ago

Thank you for the reply! I would like to ask another question, which may not be specific to selscan, but it would be nice if you could give me any clue. I noticed that sometimes, when a locus is under positive selection (clusters of extreme iHS scores), the corresponding SNP frequency is not very high (e.g., around 0.5). How could this be explained? For example, by balancing selection?

Looking forward to your reply! Thanks in advance!

szpiech commented 2 years ago

Well, it is conceivable that there is a partial sweep at that location. Although iHS doesn't have great power to detect a sweep at 0.5 frequency, it isn't impossible. Another possibility is that this site is hitchhiking on a sweep happening at an adjacent location, with only some of the sweeping haplotypes carrying that variant.
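One rough way to see whether a site sits inside a cluster of extreme scores, rather than being an isolated hit, is to count the fraction of nearby SNPs with |iHS| >= 2. The sketch below uses the positions and standardized scores from the output posted earlier in this thread. The function name and the window sizes are arbitrary choices for illustration, not a selscan feature.

```python
def extreme_fraction(rows, center, half_width, threshold=2.0):
    """Fraction of SNPs within +/- half_width bp of center with |iHS| >= threshold.

    rows is a list of (position, standardized_iHS) pairs.
    Returns 0.0 if no SNP falls inside the window.
    """
    window = [score for pos, score in rows if abs(pos - center) <= half_width]
    if not window:
        return 0.0
    return sum(abs(score) >= threshold for score in window) / len(window)


# (position, standardized iHS) taken from the rows posted in this thread
rows = [
    (1801542, -2.61532), (1801697, -2.61532), (1801866, -1.43333),
    (1801894, -1.89608), (1802211, -1.42421), (1802278, -3.09933),
    (1802319, 0.104621), (1803236, -0.948235), (1803251, -2.96716),
    (1803271, -2.15776), (1803307, -3.38237), (1803704, -2.99851),
    (1803824, -3.15625), (1803926, -1.08975), (1803970, -2.93009),
]

# Illustrative window sizes around the query SNP at 4:1803271
print(extreme_fraction(rows, center=1803271, half_width=2000))
print(extreme_fraction(rows, center=1803271, half_width=100))
```

A high fraction over a window is only a heuristic hint of a cluster. It does not distinguish a partial sweep at the site itself from hitchhiking on a nearby sweep.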

sadiexiaoyu commented 2 years ago

Thank you for the suggestions!