Open spchavan10 opened 5 months ago
Hello,
This could potentially happen if there are <20 sites within a given frequency bin. Normalization within a given frequency bin occurs only if there are >=20 scores in that bin. If this is the case for you, you can try reducing the number of frequency bins you are using.
-Zachary
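To make the rule above concrete, here is a minimal Python sketch of frequency-binned standardization with a minimum-count threshold. It is an illustration only, not norm's actual source: the function name `normalize_by_bin`, the bin-edge convention, and the use of the population standard deviation are all assumptions. Only the threshold of 20 scores per bin comes from the reply above.

```python
from collections import defaultdict
from statistics import mean, pstdev

MIN_SITES = 20  # threshold quoted in the thread; everything else here is assumed


def normalize_by_bin(sites, n_bins=100, min_sites=MIN_SITES):
    """Standardize raw scores within allele-frequency bins.

    sites: iterable of (snp_id, frequency, raw_score) tuples.
    Returns {snp_id: standardized_score}; SNPs in bins holding fewer
    than min_sites scores are left out of the result.
    """
    bins = defaultdict(list)
    for snp_id, freq, score in sites:
        # Hypothetical binning rule: equal-width bins on [0, 1].
        idx = min(int(freq * n_bins), n_bins - 1)
        bins[idx].append((snp_id, score))

    normalized = {}
    for members in bins.values():
        if len(members) < min_sites:
            continue  # too few scores to estimate mean and variance
        scores = [s for _, s in members]
        mu, sd = mean(scores), pstdev(scores)
        if sd == 0:
            continue  # avoid dividing by zero in a degenerate bin
        for snp_id, s in members:
            normalized[snp_id] = (s - mu) / sd
    return normalized


if __name__ == "__main__":
    # 25 sites in one frequency class, 5 in another.
    demo = [(f"a{i}", 0.15, float(i)) for i in range(25)]
    demo += [(f"b{i}", 0.55, float(i)) for i in range(5)]
    print(len(normalize_by_bin(demo, n_bins=10)))  # sparse class dropped
    print(len(normalize_by_bin(demo, n_bins=1)))   # one pooled bin
```

In this toy setup, using fewer bins pools the sparse frequency class with the larger one, which is why reducing the bin count (or adding more sites) can recover the missing SNPs.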
On Mon, Apr 29, 2024 at 1:07 AM spchavan10 @.***> wrote:
Hello there, I'm an Animal Genetics student currently working on Bovine50kSNP genotype data. Why, after running norm on the .ihs.out file, is it not giving standardized iHS scores for all the SNPs in the file? For example, the chr1.ihs.out file contains 1894 SNPs, but chr1.ihs.out.100bins shows scores for only 1560 SNPs.
Thank you for the insights, but even if I reduce the frequency bins, the result stays exactly the same. For the calculation of iHS in Selscan, I used flags like --max-gap, --gap-scale, --pmap, --maf, --trunc-ok, and --cutoff to retrieve the maximum number of SNPs in the unstandardized output file. Actually, I want to incorporate these scores into DCMS by converting them into p-values. That is why I'm trying to get all the SNPs in the output file.
Where can I get the manual for the NORM function?
Hello,
Unfortunately, I think the only documentation I've written so far for norm is in the changelog and in --help. This will need to change.
Can you send your norm log file? I'll try to troubleshoot, but I have limited time before I go on leave at the end of the week.
Zachary
./norm --ihs --files chromo1.ihs.out
norm v1.3.0
You have provided 1 output files for joint normalization.
Opened chromo1.ihs.out
Total loci: 1894
Reading all data.
Calculating mean and variance per frequency bin:
bin num mean variance
0.01 0 -nan -nan
0.02 201 -0.942456 0.0331806
0.03 0 -nan -nan
0.04 153 0.202392 0.116415
0.05 0 -nan -nan
0.06 120 0.151216 0.0700112
0.07 0 -nan -nan
0.08 107 0.169515 0.0389571
0.09 0 -nan -nan
0.1 113 0.228969 0.0594271
0.11 0 -nan -nan
0.12 88 0.197439 0.0371951
0.13 77 0.155727 0.0376023
0.14 0 -nan -nan
0.15 62 0.121447 0.0296197
0.16 0 -nan -nan
0.17 58 0.108081 0.0347496
0.18 0 -nan -nan
0.19 49 0.153233 0.0225802
0.2 0 -nan -nan
0.21 62 0.0996912 0.0190187
0.22 0 -nan -nan
0.23 43 0.108986 0.0256443
0.24 0 -nan -nan
0.25 41 0.085957 0.0129141
0.26 31 0.0990113 0.0119576
0.27 0 -nan -nan
0.28 25 0.0350351 0.0224791
0.29 0 -nan -nan
0.3 17 0.0756617 0.0147399
0.31 0 -nan -nan
0.32 25 0.0648255 0.0219701
0.33 0 -nan -nan
0.34 17 0.0989446 0.0155861
0.35 0 -nan -nan
0.36 20 0.0618536 0.0244242
0.37 0 -nan -nan
0.38 16 0.0100937 0.0110296
0.39 11 0.034661 0.0255826
0.4 0 -nan -nan
0.41 20 0.0504288 0.0104644
0.42 0 -nan -nan
0.43 17 0.0261363 0.0178654
0.44 0 -nan -nan
0.45 16 -0.104928 0.0097222
0.46 0 -nan -nan
0.47 11 -0.0363621 0.0161006
0.48 0 -nan -nan
0.49 20 -0.011081 0.0213475
0.5 0 -nan -nan
0.51 13 -0.00641709 0.0121569
0.52 17 0.0351954 0.0321013
0.53 0 -nan -nan
0.54 24 0.0141019 0.0218387
0.55 0 -nan -nan
0.56 35 -0.0504989 0.0306169
0.57 0 -nan -nan
0.58 25 -0.041707 0.0193195
0.59 0 -nan -nan
0.6 11 -0.0530253 0.019896
0.61 0 -nan -nan
0.62 18 -0.0725664 0.0128914
0.63 14 -0.0723302 0.0110694
0.64 0 -nan -nan
0.65 16 -0.0353695 0.0145087
0.66 0 -nan -nan
0.67 25 -0.0646973 0.0264331
0.68 0 -nan -nan
0.69 16 -0.0806662 0.0161506
0.7 0 -nan -nan
0.71 23 -0.0510331 0.017164
0.72 0 -nan -nan
0.73 17 -0.110549 0.0167697
0.74 0 -nan -nan
0.75 21 -0.111374 0.0296077
0.76 20 -0.111592 0.0239563
0.77 0 -nan -nan
0.78 18 -0.226333 0.0147349
0.79 0 -nan -nan
0.8 21 -0.141272 0.0764032
0.81 0 -nan -nan
0.82 20 -0.166597 0.0491221
0.83 0 -nan -nan
0.84 15 -0.197985 0.06345
0.85 0 -nan -nan
0.86 13 -0.140425 0.030714
0.87 0 -nan -nan
0.88 30 -0.239987 0.029849
0.89 12 -0.186248 0.027718
0.9 0 -nan -nan
0.91 19 -0.183821 0.0668138
0.92 0 -nan -nan
0.93 13 -0.140662 0.0534552
0.94 0 -nan -nan
0.95 10 -0.238264 0.0849062
0.96 0 -nan -nan
0.97 3 -0.186794 0.0100847
0.98 0 -nan -nan
0.99 5 0.902356 0.0553932
1 0 -nan -nan
Normalizing chromo1.ihs.out
I'll mail map, vcf, ihs and norm output files.
As mentioned earlier, I'll be using these scores for DCMS estimation. That's why I'm trying to retrieve all the SNPs.
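For the p-value conversion mentioned above, one common option is a rank-based empirical p-value computed from the absolute standardized score. The sketch below is only a guess at what such a step could look like; the function name `empirical_p_values` and the rank/(n + 1) convention are assumptions, and whether this matches the intended DCMS workflow is not confirmed anywhere in this thread.

```python
def empirical_p_values(scores):
    """Rank-based two-sided p-values: larger |score| gives a smaller p.

    Uses rank / (n + 1) so that no p-value is exactly 0 or 1.
    Ties are broken by input order, which is a simplification.
    """
    n = len(scores)
    # Indices sorted from most extreme to least extreme score.
    order = sorted(range(n), key=lambda i: abs(scores[i]), reverse=True)
    p_values = [0.0] * n
    for rank, i in enumerate(order, start=1):
        p_values[i] = rank / (n + 1)
    return p_values


if __name__ == "__main__":
    print(empirical_p_values([0.1, -3.0, 1.5]))
```

Note that a SNP without a standardized score has nothing to rank, so any SNPs dropped during normalization would still be missing from the p-value list.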
Hello,
OK, this shows me that there are many frequency classes with < 20 sites (second column of that freq/mean/var table). But are you only normalizing one chromosome? Recommended usage is to provide all chromosomes at once for joint normalization, e.g. --files chromo*.ihs.out
Zachary
Oh, okay, I'll do it. Earlier, I tried using the loop command for this, but the results stayed the same.