Closed kroluk closed 6 years ago
Could you please show me the result of the following call?
snpset <- snpgdsLDpruning(GDSgenofile, method="corr", ld.threshold=1.0,
    maf=0.05, missing.rate=0.05)
Floating-point calculation cannot guarantee that the maximum |LD| is exactly 1, but it should be very close to 1.
Hi Xiuwen, thanks for the quick answer. Please find the result below.
Setting the threshold to 1.0 still prunes approximately 450 markers. The borderline threshold, above which nothing is pruned, is about 1.000000000000001, so it is indeed very close to 1.
snpset <- snpgdsLDpruning(GDSgenofile, method="corr", ld.threshold= 1.0, maf = 0.05, missing.rate = 0.05)
SNP pruning based on LD:
Excluding 0 SNP on non-autosomes
Excluding 171 SNPs (monomorphic: TRUE, MAF: 0.05, missing rate: 0.05)
Working space: 315 samples, 13,450 SNPs
    using 1 (CPU) core
    sliding window: 500,000 basepairs, Inf SNPs
    |LD| threshold: 1
    method: correlation
Chromosome 1: 95.46%, 820/859
Chromosome 2: 94.48%, 1,096/1,160
Chromosome 3: 92.53%, 471/509
Chromosome 4: 97.49%, 816/837
Chromosome 5: 97.80%, 889/909
Chromosome 6: 97.58%, 323/331
Chromosome 7: 94.66%, 691/730
Chromosome 8: 96.64%, 892/923
Chromosome 9: 98.45%, 191/194
Chromosome 10: 94.29%, 396/420
Chromosome 11: 95.14%, 470/494
Chromosome 12: 97.85%, 91/93
Chromosome 13: 94.54%, 849/898
Chromosome 14: 92.61%, 1,003/1,083
Chromosome 15: 97.13%, 338/348
Chromosome 16: 94.37%, 771/817
Chromosome 17: 95.69%, 911/952
Chromosome 18: 96.84%, 276/285
Chromosome 19: 95.94%, 875/912
Chromosome 20: 95.83%, 620/647
Chromosome 21: 96.82%, 213/220
13,002 markers are selected in total.
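For readers following along, here is a minimal base-R sketch of why a threshold of exactly 1 can still flag markers. It does not use SNPRelate. The genotype vectors, the `pearson` and `flagged` helpers, and the `abs_r` values are all made up for illustration, and the rule "flag when |r| >= threshold" is an assumption, since the comparison SNPRelate actually uses is not shown in this thread.

```r
# Made-up genotype-like data (0/1/2 allele counts); not taken from the thread.
set.seed(1)
g1 <- sample(0:2, 315, replace = TRUE)
g2 <- 2 - g1                      # perfectly (negatively) correlated with g1

# Pearson correlation written out by hand to make the arithmetic visible.
pearson <- function(x, y) {
  dx <- x - mean(x)
  dy <- y - mean(y)
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}

r <- pearson(g1, g2)
abs(r) - 1                        # zero or a tiny floating-point rounding error

# Hypothetical |r| values, including one a few machine epsilons above 1.
eps <- .Machine$double.eps
abs_r <- c(0.35, 0.92, 1, 1 + 4 * eps)

# ASSUMED rule for illustration only: a pair is flagged when |r| >= threshold.
flagged <- function(threshold) abs_r >= threshold

flagged(1.0)                      # pairs at or just above 1 are still flagged
flagged(1 + 1e-9)                 # a threshold slightly above 1 flags nothing
```

Under these assumptions, any threshold comfortably above 1 (1.1 in the original question, or 1 + 1e-9 here) disables LD-based pruning, while 1.0 itself does not.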
Hi,
I want to prune my SNP data based on MAF, missing rate and LD using snpgdsLDpruning(). Using ld.threshold = 0.2 removed a great many markers, so I started to play around. I found that if I don't want to remove any SNPs based on LD, I have to set ld.threshold = 1.1. I find this strange, since in theory LD > 1 is not possible. Or am I missing something here?
Thanks, Lukas
R output:
Matrix products: default
locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252  LC_MONETARY=German_Switzerland.1252
[4] LC_NUMERIC=C  LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats  graphics  grDevices  utils  datasets  methods  base

other attached packages:
[1] SNPRelate_1.16.0  gdsfmt_1.18.0

loaded via a namespace (and not attached):
[1] compiler_3.5.1  tools_3.5.1  rstudioapi_0.8  yaml_2.2.0  crayon_1.3.4