Yves-CHEN / DENTIST

DENTIST (Detecting Errors iN analyses of summary staTISTics) is a QC tool for summary-data-based analyses.
GNU Lesser General Public License v3.0

Question about duplicated variants #21

Closed: yanyul233 closed this issue 1 year ago

yanyul233 commented 1 year ago

Hi! Thanks for developing DENTIST! I have a question about the --dup-threshold option: how are duplicated variants used in the software? I could not find a description of this in the paper or the README, so I am asking here. Thanks in advance!

Yves-CHEN commented 1 year ago

Thanks very much for your interest. In the paper, we have "To improve computational stability, we do not include variants in near-perfect LD with variant i (e.g., r2 > 0.95) in the variant set to compute".

As stated above, we first determine which SNPs are treated as being in perfect LD with each other, using a pre-set threshold (this --dup-threshold) on the LD correlation (r^2). Only one SNP from each such group is kept for DENTIST QC, which provides an imputed z-score for that SNP. This imputed z-score is then assigned to all of the SNPs that are in perfect LD with the one used for QC.
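
For readers who want to see the idea concretely, here is a minimal Python sketch of the two steps described above: grouping variants whose r^2 exceeds a threshold, then copying the representative's imputed z-score to the rest of its group. This is not DENTIST's actual code. The function names, the greedy "first unassigned variant becomes the representative" rule, and the example numbers are all assumptions made for illustration, and the 0.95 passed in below is just the example value from the paper quote, not necessarily the default of --dup-threshold.

```python
def group_duplicates(r2, threshold):
    """Greedy grouping (an assumed rule, not necessarily DENTIST's).

    r2 is a symmetric list-of-lists of pairwise LD r^2 values.
    Returns rep, where rep[j] is the index of the variant that is
    kept for QC on behalf of variant j.
    """
    n = len(r2)
    rep = [-1] * n
    for i in range(n):
        if rep[i] != -1:
            continue  # already assigned to an earlier representative
        rep[i] = i    # variant i represents its own group
        for j in range(i + 1, n):
            if rep[j] == -1 and r2[i][j] > threshold:
                rep[j] = i
    return rep


def propagate_imputed_z(rep, imputed_z):
    """Assign each variant the imputed z-score of its representative.

    imputed_z maps a representative's index to its imputed z-score
    (hypothetical values here; in practice these come from the QC step).
    """
    return [imputed_z[r] for r in rep]


# Toy example: variants 0/1 and variants 2/3 are near-duplicates.
r2 = [
    [1.00, 0.99, 0.10, 0.05],
    [0.99, 1.00, 0.12, 0.06],
    [0.10, 0.12, 1.00, 0.97],
    [0.05, 0.06, 0.97, 1.00],
]
rep = group_duplicates(r2, threshold=0.95)

# Made-up imputed z-scores for the two representatives (variants 0 and 2).
z_all = propagate_imputed_z(rep, {0: 1.5, 2: -0.3})
print(rep)    # [0, 0, 2, 2]
print(z_all)  # [1.5, 1.5, -0.3, -0.3]
```

In this toy case only variants 0 and 2 would go through QC, and variants 1 and 3 inherit their imputed z-scores.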

yanyul233 commented 1 year ago

Hi @Yves-CHEN, thanks so much for the explanation. I'll close this issue.