Open xtmgah opened 5 years ago
Thanks very much for the bug report.
The discrepancy in LD estimates appears to have been due to a bug processing unphased genotypes for variants with MAF > 50%.
Also, the "dstats" option is currently only supported for phased genotypes.
I've corrected the bug and added error messages accordingly. If you could confirm that the output looks OK, I'll go ahead and close this out.
Also: To calculate the same r2 statistic using PLINK, use --r2 rather than --ld. Running the PLINK command below should produce output equivalent to emeraLD:
plink-1.9 --vcf test.vcf.gz --r2 --ld-snps rs3780269 rs111207562 --ld-window-r2 0
Hello:
I am trying to compare the result using 1kg unphased between plink and emeraLD, and I got totally different result. Also, I found a lot of LD generated by emeraLD have R vale >1 or <-1; Can you tell me the difference of LD calculation between plink and emeraLD? Here is one example; Thanks.
$emeraLD -i test.vcf.gz --stdout --dstats --no-phase emeraLD v0.1 (c) 2018 corbin quick (corbinq@gmail.com)
reading from vcf file...
assuming unphased data (reporting diploid genotype LD)...
processed genotype data for 503 individuals...
calculating LD for 2 SNPs...
CHR POS1 POS2 R Rsq D Dprime
9 133830619 134457580 -15.71375 246.92204 -0.66269 -99.00000 done!! thanks for using emeraLD
plink1.9 --vcf test.vcf.gz --ld rs3780269 rs111207562
PLINK v1.90b3.44 64-bit (17 Nov 2016) https://www.cog-genomics.org/plink2 (C) 2005-2016 Shaun Purcell, Christopher Chang GNU General Public License v3 Logging to plink.log. Options in effect: --ld rs3780269 rs111207562 --vcf test.vcf.gz
257652 MB RAM detected; reserving 128826 MB for main workspace. --vcf: plink-temporary.bed + plink-temporary.bim + plink-temporary.fam written. 2 variants loaded from .bim file. 503 people (0 males, 0 females, 503 ambiguous) loaded from .fam. Ambiguous sex IDs written to plink.nosex . Using 1 thread (no multithreaded calculations invoked). Before main variant filters, 503 founders and 0 nonfounders present. Calculating allele frequencies... done. 2 variants and 503 people pass filters and QC. Note: No phenotypes present.
--ld rs3780269 rs111207562:
R-sq = 0.0145356 D' = 0.17131
Haplotype Frequency Expectation under LE
In phase alleles are AC/GT