szpiech / selscan

Haplotype based scans for selection
GNU General Public License v3.0

XP-EHH normalization and analysis of non-overlapping windows #40

Closed miletj closed 4 years ago

miletj commented 4 years ago

Hi,

I used the following command to normalize and analyze XP-EHH results in non-overlapping windows of fixed size (100 kb):

norm --xpehh --files ./Results/XP-EHH/BEN_${p}.chr*.xpehh.out --bp-win --min-snps 20 --winsize 100000 --crit-percent 0.01

I was wondering whether I am doing the right thing (I want to identify the 5% of windows with the highest percentage of extreme scores), and what this function really does. When I look at the final results, I see some windows with 0% extreme values that are nevertheless flagged as significant at 5%. I also see in the log file that the most extreme windows seem to be identified within bins defined by the number of SNPs per window, and that for the bins with a low number of SNPs (<40), the threshold value for a significant signal is 0. Is this normal?

Thanks, Jacqueline

norm_xpehh_win100kb.pdf

szpiech commented 4 years ago

Hi Jacqueline,

By using --crit-percent 0.01 it seems the cutoffs for an extreme score are being drawn at approximately +/-3. This, combined with a low number of SNPs per window, seems to be causing the 5% cutoff to come out as 0. I'd probably set --qbins 10, since there isn't a lot of point in binning e.g. 35 SNP windows differently from 30 SNP windows, and you might consider relaxing your --crit-percent threshold as well. Typically the default setting for --crit-percent works quite well.

Hope this helps!
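To make the effect described above concrete, here is a minimal Python sketch. It is not norm's actual code, only an illustration under stated assumptions: the window data are made up, the helper names (`fraction_extreme`, `top_cutoff`, `summarize`) are hypothetical, and flagging windows with "fraction >= cutoff" is an assumption about how ties at the cutoff are treated. It shows that when very few windows in a bin contain any extreme score, the top-5% cutoff on the fraction of extreme scores can fall to 0, so windows with 0% extreme scores end up flagged.

```python
def fraction_extreme(scores, crit):
    """Fraction of scores in one window whose absolute value exceeds crit."""
    return sum(abs(s) > crit for s in scores) / len(scores)


def top_cutoff(fractions, top=0.05):
    """Fraction held by the last window inside the top `top` share of a bin."""
    ranked = sorted(fractions, reverse=True)
    k = max(1, int(len(ranked) * top))
    return ranked[k - 1]


def make_windows():
    """Made-up bin: 100 windows of 25 SNPs each, almost all unremarkable."""
    windows = []
    for i in range(100):
        scores = [0.5] * 25
        if i < 3:
            scores[0] = 3.5   # 3 windows carry one very extreme score
        elif i < 20:
            scores[0] = 2.4   # 17 windows carry one moderately extreme score
        windows.append(scores)
    return windows


def summarize(windows, crit, top=0.05):
    """Return (cutoff, number of windows flagged), assuming a >= comparison."""
    fractions = [fraction_extreme(w, crit) for w in windows]
    cutoff = top_cutoff(fractions, top)
    flagged = sum(f >= cutoff for f in fractions)
    return cutoff, flagged


if __name__ == "__main__":
    windows = make_windows()
    for crit in (3.0, 2.0):  # strict vs. looser critical value (illustrative)
        cutoff, flagged = summarize(windows, crit)
        print(f"crit={crit}: cutoff={cutoff:.2f}, windows flagged={flagged}")
```

In this toy bin, a critical value of 3 leaves only 3 of 100 windows with any extreme score, so the fifth-ranked window has a fraction of 0 and every window meets the cutoff. With the looser value of 2, 20 windows have a nonzero fraction and the cutoff becomes 0.04.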

miletj commented 4 years ago

Thank you very much for the explanation. It helps. I'm going to revise my criteria.

Jacqueline

(In reply to Zachary A Szpiech's comment of 28/08/2019 at 18:10.)
