szpiech / selscan

Haplotype based scans for selection
GNU General Public License v3.0

Setting ehh-win and max-extend in XP-EHH calculation #91

Closed. AliceIob closed this issue 1 year ago.

AliceIob commented 1 year ago

Hi,

I am trying to calculate XP-EHH and, given my data, I expect EHH to decay very slowly with bp distance from the core SNP. For this reason, I want to constrain the calculation of the statistic as little as possible. I was looking at the manual, and I don't understand how I should combine these two options:

--ehh-win: to define, for a single EHH computation, the maximum extension in base pairs from the query locus. Default is 100,000 bp.

--max-extend: Use --max-extend to set an additional stopping condition for iHS and XP-EHH computations. If the EHH decay curve has extended MAX EXTEND bp away from the core without reaching the EHH decay cutoff, truncate the curve here and integrate. Default is 1,000,000 bp; set ≤ 0 for no restriction.

If I understand correctly, even if I set --max-extend to no restriction (≤ 0) for integrating EHH, the size of the genomic window in which EHH is computed will still depend on --ehh-win. Is this right? Is there a way to allow EHH calculation from the core locus without restriction (i.e., up to the bp where EHH reaches 0.05)? Or should I simply choose a very large number to get the same effect?

Thank you in advance

szpiech commented 1 year ago

Hello,

You will want to set --max-extend to 0, which sets no restriction. Keep in mind that this may also result in fewer sites with XP-EHH scores near chromosome ends and large gaps.
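To picture why lifting the limit can cost sites near chromosome ends, here is a toy sketch of the two stopping conditions quoted from the manual (the EHH decay cutoff and the maximum extension). It is a simplified model and not selscan's actual code: the function name `integrate_ehh_one_side`, the trapezoid rule, and the choice to return `None` when the data run out before either condition is met are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def integrate_ehh_one_side(
    distances_bp: Sequence[float],
    ehh: Sequence[float],
    cutoff: float = 0.05,
    max_extend: float = 1_000_000,
) -> Optional[float]:
    """Trapezoid-integrate a one-sided EHH curve (hypothetical helper).

    distances_bp[0] is the core (0 bp) and distances must increase.
    Returns None when the data end before a stopping condition is met,
    which stands in for "no score for this site" in this toy model.
    """
    area = 0.0
    for i in range(1, len(distances_bp)):
        # Extension limit: truncate the curve here and keep the area so far.
        # A value <= 0 disables this check, mirroring "no restriction".
        if max_extend > 0 and distances_bp[i] > max_extend:
            return area
        width = distances_bp[i] - distances_bp[i - 1]
        area += 0.5 * (ehh[i - 1] + ehh[i]) * width
        # Decay cutoff: EHH has dropped far enough, so stop integrating.
        if ehh[i] <= cutoff:
            return area
    # Reached the end of the data (e.g. a chromosome end) without stopping.
    return None


# A slowly decaying curve that never reaches the cutoff before the data end.
slow_dist = [0, 100, 200, 300]
slow_ehh = [1.0, 0.8, 0.6, 0.4]

truncated = integrate_ehh_one_side(slow_dist, slow_ehh, max_extend=150)
unrestricted = integrate_ehh_one_side(slow_dist, slow_ehh, max_extend=0)
print(truncated, unrestricted)
```

In this model the truncated call still yields an area, while the unrestricted call yields no value because the curve never decays to the cutoff before the data end. That is one plausible reading of the "fewer sites" remark, under the assumptions stated above.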

The --ehh-win option applies only when calculating with the --ehh flag. Sorry for the confusion!
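Putting the advice together, an XP-EHH run would pass --max-extend 0 and leave --ehh-win out entirely. The sketch below only assembles and prints such a command line. The file names are hypothetical placeholders, and the input flags (--xpehh, --vcf, --vcf-ref, --map, --out) are written as I recall them from the selscan manual, so check them against `selscan --help` for your version.

```python
import shlex
from typing import List


def build_xpehh_command(
    vcf: str,
    vcf_ref: str,
    genetic_map: str,
    out_prefix: str,
    max_extend: int = 0,
) -> List[str]:
    """Assemble a selscan XP-EHH argument list (hypothetical helper).

    --ehh-win is deliberately omitted: per the reply above, it only
    matters when running with the --ehh flag.
    """
    return [
        "selscan",
        "--xpehh",
        "--vcf", vcf,              # haplotypes of the population of interest
        "--vcf-ref", vcf_ref,      # haplotypes of the reference population
        "--map", genetic_map,      # genetic map for the same sites
        "--max-extend", str(max_extend),  # 0 means no extension limit
        "--out", out_prefix,
    ]


# Placeholder file names; replace them with your own per-chromosome inputs.
cmd = build_xpehh_command(
    "pop1.chr1.vcf", "pop2.chr1.vcf", "chr1.map", "pop1_vs_pop2.chr1"
)
print(shlex.join(cmd))
```

Building the argument list first makes it easy to review the exact command before handing it to a scheduler or to `subprocess.run`.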

-Zachary

On Wed, Feb 1, 2023 at 6:58 AM, AliceIob @.***> wrote:
