szpiech / selscan

Haplotype based scans for selection
GNU General Public License v3.0

ihh12 normalization #29

Open krostifangers opened 6 years ago

krostifangers commented 6 years ago

Hi, I tried to normalize ihh12 files, but the distribution I finally obtained was not zero-centered and did not look normal, so I do not know what is going on. Could you help me? Thank you in advance. All the best, Christophe

szpiech commented 6 years ago

Is it possible to give the command you used for norm and possibly even one of your output files for me to look at?

krostifangers commented 6 years ago

Hi, thank you for your answer. I used the following command line:

./norm --ihh12 --files ihh12Wild*.out --crit-percent 0.001

This kind of command worked for all the other statistics (XP-EHH, iHS, nSL). Please find an output file attached. Thank you in advance. All the best, Christophe

szpiech commented 6 years ago

Thanks. I'm not sure what happened to the file (I don't see it attached), but I will do some testing and look into this very soon.

krostifangers commented 6 years ago

I am re-sending the output file.

szpiech commented 6 years ago

Sorry, would you email it to me directly at zachary.szpiech@ucsf.edu? I'm either badly missing something or GitHub isn't letting it through.

CSGallagher commented 3 years ago

I'm curious whether this issue was ever resolved. I seem to be observing a slightly off-Gaussian distribution in the ihh12 statistic after performing normalization. Please see the image at the bottom and the code used to calculate and normalize ihh12. Thanks so much!

Code for selection scans:

selscan --ihh12 --pmap --threads ${sscnThreads} --vcf ${chrVcf} --map ${chrMap} --out ${chrIhh12}

Code for normalization:

norm --ihh12 --files ${dir}chr1${fl} ${dir}chr2${fl} ${dir}chr3${fl} ${dir}chr4${fl} ${dir}chr5${fl} ${dir}chr6${fl} ${dir}chr7${fl} ${dir}chr8${fl} ${dir}chr9${fl} ${dir}chr10${fl} ${dir}chr11${fl} ${dir}chr12${fl} ${dir}chr13${fl} ${dir}chr14${fl} ${dir}chr15${fl} ${dir}chr16${fl} ${dir}chr17${fl} ${dir}chr18${fl} ${dir}chr19${fl} ${dir}chr20${fl} ${dir}chr21${fl} ${dir}chr22${fl}

image

szpiech commented 3 years ago

Hi there, this may be due to the fact that raw ihh12 is theoretically bounded in [0,1] (unlike iHS et al., which are in (-\infty,\infty)), but I'm not too sure at the moment.
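The hypothesis above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the raw scores are drawn from a Beta(2, 8) distribution as a hypothetical stand-in for a bounded, right-skewed statistic (not the actual distribution of ihh12), and normalization is assumed to be a plain z-score (subtract the mean, divide by the standard deviation). The helper names `skewness` and `zscore` are made up for this example.

```python
import random
import statistics


def skewness(values):
    """Skewness: mean cubed deviation divided by the cubed standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


def zscore(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


rng = random.Random(0)
# Hypothetical stand-in for raw scores: bounded in [0, 1] and right-skewed.
raw = [rng.betavariate(2, 8) for _ in range(10_000)]
normed = zscore(raw)

# Mean and sd become 0 and 1, but the skewness is unchanged by the rescaling.
print(f"mean={statistics.fmean(normed):.3f} sd={statistics.pstdev(normed):.3f}")
print(f"skew raw={skewness(raw):.3f} skew normed={skewness(normed):.3f}")
```

A z-score is a linear rescaling, so it fixes the mean and variance but cannot change the shape: a skewed input stays skewed, and its peak (mode) need not sit at zero even though its mean does. If the real raw ihh12 scores are similarly bounded and skewed, a non-Gaussian normalized histogram would be expected rather than a bug.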

willright28 commented 2 years ago

Hi all,

I would like to know the header information of the ihh12 norm window output. It has 5 columns, for example:

1 50001 116 0.422414 100

I can tell the first three are window start, window end, and number of SNPs. Then the last two are <frac of |iHS| > threshold> , right? Based on the answer in #73.

So if I want to compare the ihh12 value between windows (or chromosomes), or display the ihh12 value across the genome, could I use the 4th column of the norm window output? Thanks!

szpiech commented 2 years ago

Hello,

So the columns are,

The final column is based on the fraction of extreme scores, so higher fractions will tend to have smaller percentiles. In reality the final column only really takes 4 values, 1, 5, 100, and -1, corresponding to top 1%, top 5%, top 100%, and no data. Your putative sweep regions are then identified by selecting the 1% or 1% and 5% indications.

There are a number of ways to compare between windows. Fraction of scores is perhaps not the best, because in order to compute the top X% of windows we first bin windows by num scores and then compute the percentile within each bin. You might take the max score or the mean score if you wish to compare among windows.

Hope this helps,
Zachary
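As a rough illustration of the advice above, the following sketch parses 5-column window lines and keeps the windows flagged as top 1% or top 5%. The field names (`start`, `end`, `n_scores`, `frac_extreme`, `percentile`) are assumptions inferred from this thread, not an official header, and all input lines other than the one quoted in the question are made up. `summarize_scores` is a hypothetical helper showing the max and mean per window, given the normalized scores that fall inside it.

```python
from typing import NamedTuple


class Window(NamedTuple):
    start: int           # assumed: window start position
    end: int             # assumed: window end position
    n_scores: int        # assumed: number of scores (SNPs) in the window
    frac_extreme: float  # assumed: fraction of scores above the threshold
    percentile: int      # 1, 5, 100, or -1 (no data), per the reply above


def parse_window_line(line: str) -> Window:
    """Split one whitespace-delimited window line into typed fields."""
    start, end, n_scores, frac, pct = line.split()
    return Window(int(start), int(end), int(n_scores), float(frac), int(float(pct)))


def candidate_windows(windows, levels=(1, 5)):
    """Keep windows whose final column is one of the requested levels."""
    return [w for w in windows if w.percentile in levels]


def summarize_scores(scores):
    """Return (max, mean) of the scores in one window, or None if empty."""
    if not scores:
        return None
    return max(scores), sum(scores) / len(scores)


lines = [
    "1 50001 116 0.422414 100",     # line quoted in the question
    "50001 100001 98 0.816327 1",   # hypothetical
    "100001 150001 120 0.7 5",      # hypothetical
]
windows = [parse_window_line(line) for line in lines]
for w in candidate_windows(windows):
    print(w.start, w.end, w.percentile)
```

Passing `levels=(1,)` restricts the selection to the top 1% windows only. For comparisons across windows or chromosomes, `summarize_scores` would be applied to the per-site normalized scores grouped by window, which have to be collected separately from the per-site output files.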