szpiech / selscan

Haplotype based scans for selection
GNU General Public License v3.0

Should the non-polymorphic variants be removed? #59

Closed biozzq closed 3 years ago

biozzq commented 3 years ago

Hi @szpiech

I am wondering whether it is necessary to filter out non-polymorphic variants (e.g., sites where all individuals are 0|0 or 1|1) before running selscan, because I do not know how selscan counts SNPs. This comes up routinely in my work: I usually phase the genotypes of all my samples jointly, but I then want to identify selective sweeps within subsamples. Unless I filter each subsample by allele frequency, variants that are non-polymorphic within that subsample remain in the input VCF for selscan.

Best wishes, Zheng zhuqing

szpiech commented 3 years ago

Hi there,

If you are running iHS or nSL scans, selscan only computes the statistics at sites with MAF >= 0.05 (change the threshold with --maf), and by default sites with MAF < 0.05 are filtered out (modify with --keep-low-freq).
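To make the MAF criterion concrete, here is a minimal Python sketch of how a minor allele frequency can be computed from phased genotype strings and compared against the 0.05 default mentioned above. It is an illustration only, not selscan's own code; the function name `minor_allele_frequency` and the toy genotypes are hypothetical. A monomorphic site (all 0|0 or all 1|1) has MAF 0 and so falls below the threshold.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF of one biallelic site from phased genotypes like '0|1'.

    Hypothetical helper for illustration; this is not part of selscan.
    """
    alleles = [a for gt in genotypes for a in gt.split("|")]
    called = [a for a in alleles if a in ("0", "1")]  # ignore anything else
    if not called:
        return 0.0
    alt_freq = called.count("1") / len(called)
    return min(alt_freq, 1.0 - alt_freq)


MAF_CUTOFF = 0.05  # default threshold quoted in this thread

# Toy sites with four diploid individuals each (made-up data).
sites = {
    "all_ref": ["0|0", "0|0", "0|0", "0|0"],
    "all_alt": ["1|1", "1|1", "1|1", "1|1"],
    "polymorphic": ["0|1", "0|0", "1|1", "0|0"],
}

for name, gts in sites.items():
    maf = minor_allele_frequency(gts)
    status = "passes" if maf >= MAF_CUTOFF else "below cutoff"
    print(f"{name}: MAF={maf:.3f} ({status})")
```

Running this on a subsample's genotypes shows which jointly phased sites have become monomorphic within that subsample.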

If you are running XP-EHH or XP-nSL scans, no filtering is done and selscan will use all the sites provided. In this case, I think filtering out sites that are monomorphic in the combined data is a good idea (but keep sites that are polymorphic in one population and monomorphic in the other).
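The rule for the cross-population case can be sketched as a small Python predicate. This is only an illustration under stated assumptions: both populations are genotyped at the same biallelic sites, genotypes are phased strings such as `0|1` with no missing calls, and the names `allele_counts` and `keep_site` are hypothetical rather than anything selscan provides. A site is dropped only when the pooled sample carries a single allele.

```python
def allele_counts(genotypes):
    """Return (alt_allele_count, total_allele_count) for phased genotypes."""
    alleles = [a for gt in genotypes for a in gt.split("|")]
    return alleles.count("1"), len(alleles)


def keep_site(pop1_genotypes, pop2_genotypes):
    """Keep a site unless it is monomorphic in the two populations combined.

    Hypothetical helper for illustration; this is not part of selscan.
    """
    alt1, n1 = allele_counts(pop1_genotypes)
    alt2, n2 = allele_counts(pop2_genotypes)
    alt, total = alt1 + alt2, n1 + n2
    # Polymorphic in the pooled data means both alleles are present.
    return 0 < alt < total


# Monomorphic in both populations for the same allele: drop.
print(keep_site(["0|0", "0|0"], ["0|0", "0|0"]))
# Polymorphic in one population, monomorphic in the other: keep.
print(keep_site(["0|1", "0|0"], ["0|0", "0|0"]))
# Fixed for different alleles in each population: still polymorphic overall, keep.
print(keep_site(["0|0", "0|0"], ["1|1", "1|1"]))
```

Applying the decision per site to both input files together keeps the two site lists identical after filtering.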

Hope this helps!

biozzq commented 3 years ago

Dear @szpiech

Thank you.

Best wishes