thierrygosselin closed this issue 5 years ago
snpgdsLDpruning() uses a sliding window method to prune SNPs, but snpgdsLDMat() returns LD of all pairs of SNPs.
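To make that distinction concrete, here is a minimal sketch in plain Python. It is not SNPRelate's code: the toy genotypes, the positions, the 1000 bp window and the helper names `r2`, `all_pairs` and `window_pairs` are all made up for illustration. It only shows that a window-based method evaluates far fewer SNP pairs than a full LD matrix does.

```python
from itertools import combinations


def r2(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 dosages)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treated as 0 here
    return (sxy * sxy) / (sxx * syy)


def all_pairs(n_snps):
    """Every SNP pair, as a full LD matrix would cover."""
    return list(combinations(range(n_snps), 2))


def window_pairs(positions, max_bp):
    """Only pairs whose distance is at most max_bp (positions sorted ascending)."""
    pairs = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if positions[j] - positions[i] > max_bp:
                break  # later SNPs are even further away
            pairs.append((i, j))
    return pairs


# Hypothetical toy data: 6 SNPs (rows) genotyped in 6 samples (columns).
positions = [100, 200, 300, 50000, 50100, 900000]
genotypes = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],  # identical to SNP 0
    [2, 1, 0, 2, 1, 0],  # mirror image of SNP 0
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],  # identical to SNP 3
    [1, 0, 2, 1, 0, 2],
]

print(len(all_pairs(len(positions))))  # 15 pairs in the full matrix
print(window_pairs(positions, 1000))   # [(0, 1), (0, 2), (1, 2), (3, 4)]
print([r2(genotypes[i], genotypes[j]) for i, j in window_pairs(positions, 1000)])
```

In this sketch the full matrix summarizes 15 pairs while the window only ever looks at 4 of them, so a summary of the full matrix and the outcome of window-based pruning need not tell the same story.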
Okay, I understand this, but then how do you decide on the threshold for pruning your SNPs? Is there a way to visualize the LD accurately?
Or is the only solution to prune SNPs using thresholds from 0 to 1 in steps of 0.1 and look at the whitelisted and blacklisted SNPs?
I know it's kind of a no-brainer for human and model species: it's easier if you have a species with a reference genome and more difficult with de novo species...
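The threshold sweep mentioned above (0 to 1 in steps of 0.1) can be sketched roughly as follows. This is a simplified greedy pruning in plain Python, not SNPRelate's implementation. It assumes positions sorted in ascending order, a scan that starts at the first SNP, and r2 as the LD measure; the actual procedure and options in the package may differ. The toy data and the names `prune` and `sweep` are hypothetical.

```python
def r2(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 dosages)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treated as 0 here
    return (sxy * sxy) / (sxx * syy)


def prune(genotypes, positions, threshold, max_bp):
    """Keep a SNP unless its r2 with an already kept SNP within max_bp exceeds threshold."""
    kept = []
    for i in range(len(genotypes)):
        in_ld = any(
            positions[i] - positions[k] <= max_bp
            and r2(genotypes[i], genotypes[k]) > threshold
            for k in kept
        )
        if not in_ld:
            kept.append(i)
    return kept  # indices of retained (whitelisted) SNPs


def sweep(genotypes, positions, max_bp):
    """Number of SNPs retained at thresholds 0.0, 0.1, ..., 1.0."""
    thresholds = [i / 10 for i in range(11)]
    return [(t, len(prune(genotypes, positions, t, max_bp))) for t in thresholds]


# Hypothetical toy data: 6 SNPs (rows) genotyped in 6 samples (columns).
positions = [100, 200, 300, 50000, 50100, 900000]
genotypes = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 1, 0, 2, 1, 0],
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
    [1, 0, 2, 1, 0, 2],
]

for threshold, n_kept in sweep(genotypes, positions, max_bp=100000):
    print(threshold, n_kept)
```

Plotting the retained count against the threshold is one way to see where pruning starts to bite, instead of comparing whitelists and blacklists by hand at each value.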
With the test dataset sent by email:
1. Generating the LD matrix
2. LD pruning
Question 1:
With the threshold selected above, r2 = 0.4, we should not prune almost all markers. Based on the box plot and LD matrix, everything above 0.4 is an outlier; the core (the IQR) of the data is below 0.1.

Question 2:
Am I doing something wrong, or is there a bug in SNPRelate::snpgdsLDMat or SNPRelate::snpgdsLDpruning?