After LD clumping the curated GWAS Catalog association set, the distribution of the quality control flags looks like this:
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+------+
|qualityControls |count |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+------+
|[Subsignificant p-value, Explained by a more significant variant in high LD (clumped)] |10928 |
|[Composite association, Subsignificant p-value, Incomplete genomic mapping, Variant inconsistency, No mapping in GnomAd, Variant not found in LD reference]|159 |
|[Subsignificant p-value, Variant not found in LD reference] |5486 |
|[Incomplete genomic mapping, Variant inconsistency, No mapping in GnomAd, Variant not found in LD reference] |30546 |
|[Variant not found in LD reference] |17338 |
|[] |302753|
|[Composite association, Subsignificant p-value, Variant inconsistency] |273 |
|[Subsignificant p-value, Palindrome alleles - cannot harmonize, Variant not found in LD reference] |710 |
|[Composite association, Incomplete genomic mapping, Variant inconsistency, No mapping in GnomAd, Variant not found in LD reference] |344 |
|[Subsignificant p-value, Palindrome alleles - cannot harmonize, Explained by a more significant variant in high LD (clumped)] |1539 |
|[Subsignificant p-value, Incomplete genomic mapping, Variant inconsistency, No mapping in GnomAd, Variant not found in LD reference] |8890 |
|[Palindrome alleles - cannot harmonize, Variant not found in LD reference] |2404 |
|[Subsignificant p-value, No mapping in GnomAd, Variant not found in LD reference] |215 |
|[Palindrome alleles - cannot harmonize] |44617 |
|[Composite association, Variant inconsistency, Explained by a more significant variant in high LD (clumped)] |323 |
|[Subsignificant p-value, Palindrome alleles - cannot harmonize] |9075 |
|[Composite association, Variant inconsistency] |525 |
|[Composite association, Subsignificant p-value, Variant inconsistency, Explained by a more significant variant in high LD (clumped)] |152 |
|[No mapping in GnomAd, Variant not found in LD reference] |754 |
|[Explained by a more significant variant in high LD (clumped)] |37906 |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+------+
Some upstream QC checks already invalidate associations, e.g. No mapping in GnomAd or Composite association. However, at later stages these associations still appear to be processed and receive further flags, e.g. Explained by a more significant variant in high LD (clumped) or Variant not found in LD reference. I don't think this makes much sense: once an association is flagged as invalid, it should be omitted from downstream processes and flags.
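To illustrate the proposed behaviour, here is a minimal sketch in plain Python (not the pipeline's actual code). The names `UPSTREAM_INVALIDATING`, `is_invalidated` and `add_downstream_flag` are hypothetical, and the set of flags treated as invalidating contains only the two examples named above; the real list would need to be confirmed.

```python
# Flags named above as examples of upstream invalidation. Treating exactly
# these two as invalidating is an assumption made for illustration only.
UPSTREAM_INVALIDATING = frozenset({
    "No mapping in GnomAd",
    "Composite association",
})


def is_invalidated(quality_controls):
    """Return True if any upstream invalidating flag is already present."""
    return any(flag in UPSTREAM_INVALIDATING for flag in quality_controls)


def add_downstream_flag(quality_controls, new_flag):
    """Append new_flag unless the association was invalidated upstream."""
    if is_invalidated(quality_controls):
        # Already invalid: skip downstream processing, keep flags unchanged.
        return list(quality_controls)
    return list(quality_controls) + [new_flag]
```

With a guard like this, an association already flagged as No mapping in GnomAd would not additionally receive Variant not found in LD reference, while unflagged associations (or those with only non-invalidating flags) would still go through LD-based annotation as before.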