szpiech / selscan

Haplotype based scans for selection
GNU General Public License v3.0

--winSize #73

Closed Daluser1 closed 2 years ago

Daluser1 commented 2 years ago

Hi Dr. Szpiech,

I have normalized |iHS| values for each SNP using the norm script. However, I am not sure how to calculate them for non-overlapping sliding windows of 50 kb.

I sincerely appreciate your time and consideration.

szpiech commented 2 years ago

Hi,

Does the flag not work? Did you use the --bp-win flag too?

-Zachary

Daluser1 commented 2 years ago

Hi Dr. Szpiech,

I am not sure whether my command line is correct:

./norm --ihs --files "/selscan/bin/linux/iHS_results/iHs_chr_1.ihs.out" --winSize 250000

or

./norm --ihs --files "/selscan/bin/linux/iHS_results/iHs_chr_1.ihs.out" --winSize= 250000

szpiech commented 2 years ago

You will need to include the --bp-win flag in addition. Also, I believe the flags are case sensitive, so use --winsize, not --winSize. No equals sign is needed.

Zachary
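
For readers who want to see what a non-overlapping window summary amounts to, here is a minimal Python sketch. It is not norm's implementation: the function name `window_summary`, the 1-based window boundaries, and the choice of summary columns are assumptions made for illustration. It groups normalized scores into fixed-size windows and reports, per window, the number of scores, the fraction with |score| above a threshold, and the largest |score|.

```python
from collections import defaultdict

def window_summary(sites, win_size=50_000, threshold=2.0):
    """Summarise |score| in non-overlapping windows of win_size base pairs.

    sites: iterable of (position_bp, normalized_score), positions 1-based.
    Returns a list of (win_start, win_end, n_scores, frac_above, max_abs).
    """
    windows = defaultdict(list)
    for pos, score in sites:
        # Integer division puts every site in exactly one window.
        windows[(pos - 1) // win_size].append(abs(score))

    rows = []
    for idx in sorted(windows):
        scores = windows[idx]
        start = idx * win_size + 1
        end = start + win_size
        frac_above = sum(s > threshold for s in scores) / len(scores)
        rows.append((start, end, len(scores), frac_above, max(scores)))
    return rows

# Hypothetical (position, normalized iHS) pairs, for illustration only.
example = [(10_500, -2.4), (32_000, 0.3), (49_999, 1.1), (61_000, 2.7)]
for row in window_summary(example):
    print(row)
```

Windows that contain no sites are simply absent from the result, which may or may not match what norm reports.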

Daluser1 commented 2 years ago

Dear Prof. Szpiech,

Thank you very much for your reply. I really appreciate it!

However, I am having a bit of trouble understanding the results. The window output file has these columns:

1 100001 312 0.0320513 100 2
100001 200001 651 0.0184332 100 2
200001 300001 491 0.0162933 100 2
300001 400001 672 0 100 1
400001 500001 1184 0.00929054 100 2
500001 600001 973 0.0174717 100 2
600001 700001 839 0.0619785 100 3
700001 800001 1050 0.0609524 100 3
800001 900001 937 0.0320171 100 3
900001 1000001 866 0.0311778 100 3
1000001 1100001 1019 0.0176644 100 2

I know that the first two columns are the start and end positions of the window, but I cannot understand the rest of the columns in the output file. I would sincerely appreciate your advice on this.

szpiech commented 2 years ago

Hi,

The columns are

... threshold>

By default, the threshold referenced in column 4 is 2. There is currently a bug where the max score in the window (column 6) is reported as an integer (so the decimal part is truncated).

-Zachary

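
As a rough illustration of working with this six-column window file, the Python sketch below parses a few of the rows posted above. Only the meanings confirmed in this thread are used: columns 1 and 2 are the window start and end, column 4 is tied to the threshold (2 by default), and column 6 is the max score in the window. Columns 3 and 5 are kept under the placeholder names `col3` and `col5` because their meaning is not stated here; `WindowRow` and `parse_windows` are hypothetical names.

```python
from typing import NamedTuple

class WindowRow(NamedTuple):
    start: int        # column 1: window start (bp)
    end: int          # column 2: window end (bp)
    col3: int         # column 3: meaning not stated in this thread
    col4: float       # column 4: statistic tied to the threshold (default 2)
    col5: float       # column 5: meaning not stated in this thread
    max_score: float  # column 6: max score in window (currently truncated)

def parse_windows(text):
    """Parse whitespace-separated six-column window rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        rows.append(WindowRow(int(fields[0]), int(fields[1]), int(fields[2]),
                              float(fields[3]), float(fields[4]),
                              float(fields[5])))
    return rows

# Three rows copied from the output posted in this thread.
sample = """1 100001 312 0.0320513 100 2
100001 200001 651 0.0184332 100 2
600001 700001 839 0.0619785 100 3"""

rows = parse_windows(sample)
top = max(rows, key=lambda r: r.col4)  # window with the largest column-4 value
print(top.start, top.end, top.col4)
```
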
Daluser1 commented 2 years ago

Thank you so much!

Daluser1 commented 2 years ago

Dear Prof. Szpiech,

After normalizing XP-EHH and iHS with the norm script, should I use the normxpehh value or its absolute value for plotting and for the final report? Thanks in advance!

szpiech commented 2 years ago

Hi,

For iHS, generally absolute value should be used. For XP-EHH, the negative vs positive scores have an important meaning that should be preserved: negative scores indicate long high frequency haplotypes in the "reference" population (e.g. population given with --vcf-ref) relative to the other population, and positive scores indicate long high frequency haplotypes in the first population (e.g. given with --vcf) relative to the reference.

-Zachary
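
The sign convention described above can be sketched in Python as follows. The helper names `plot_value` and `xpehh_direction` are hypothetical and are not part of selscan or norm; the logic only restates the rule that |iHS| is used while the sign of XP-EHH is preserved.

```python
def plot_value(stat, score):
    """Return the value to plot for a normalized score."""
    if stat == "ihs":
        return abs(score)  # for iHS, the absolute value is generally used
    if stat == "xpehh":
        return score       # for XP-EHH, the sign carries meaning
    raise ValueError(f"unknown statistic: {stat}")

def xpehh_direction(score):
    """Say which population shows the long high-frequency haplotypes."""
    if score > 0:
        return "first population (--vcf)"
    if score < 0:
        return "reference population (--vcf-ref)"
    return "neither"

print(plot_value("ihs", -2.6))    # 2.6
print(plot_value("xpehh", -2.6))  # -2.6
print(xpehh_direction(-2.6))      # reference population (--vcf-ref)
```
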

Daluser1 commented 2 years ago

Hi Prof. Szpiech,

Thank you very much for your reply and the very helpful information. I sincerely appreciate it! But after normalization I should use the "normxpehh" values, not the "xpehh" values, right?

Thank you very much! Best Regards

szpiech commented 2 years ago

For xpehh, I prefer the normalized scores, although it isn’t strictly necessary. For iHS, normalization (in allele frequency bins) is necessary, since without normalization raw iHS is correlated with allele frequency (xpehh doesn’t have this problem).

Zachary
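
To make "normalization in allele frequency bins" concrete, here is a minimal Python sketch of the general idea: each raw score is standardized against the mean and standard deviation of the scores whose allele frequency falls in the same bin. This is an illustration only, not norm's code. The function name `normalize_by_freq_bin`, the default of 100 bins, the use of the population standard deviation, and the handling of zero-variance bins are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev

def normalize_by_freq_bin(sites, n_bins=100):
    """Z-score raw scores within equal-width allele frequency bins.

    sites: list of (allele_freq, raw_score) with 0 <= allele_freq <= 1.
    Returns normalized scores in the same order as the input.
    """
    def bin_of(freq):
        # A frequency of exactly 1.0 goes into the last bin.
        return min(int(freq * n_bins), n_bins - 1)

    bins = defaultdict(list)
    for freq, score in sites:
        bins[bin_of(freq)].append(score)

    stats = {b: (mean(v), pstdev(v)) for b, v in bins.items()}

    normalized = []
    for freq, score in sites:
        mu, sd = stats[bin_of(freq)]
        # Assumption: a bin with zero variance yields 0.0 rather than an error.
        normalized.append((score - mu) / sd if sd > 0 else 0.0)
    return normalized

# Hypothetical (allele frequency, raw iHS) pairs, for illustration only.
toy = [(0.11, 1.0), (0.12, 3.0), (0.51, -0.5), (0.52, 0.5)]
print(normalize_by_freq_bin(toy, n_bins=10))
```

Standardizing within bins removes the dependence of raw iHS on allele frequency, which is why a single genome-wide mean and standard deviation would not be enough for iHS.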

Daluser1 commented 2 years ago

Thank you so so much for your reply.