Related work from TOPMed

A few notes:

The QC part of the pipeline is similar to the UKBB QC process.
A brief overview of the steps:
- LD pruning to remove correlated variants (via SNPRelate)
- Kinship estimation robust to pop. structure (via SNPRelate implementation of KING-robust)
- The estimates above are provided to PC-AiR for PCA that is robust to relatedness while still capturing pop. structure (via GENESIS)
  - It is interesting that PLINK offers "Relatedness Pruning" as an explicit step, whereas PC-AiR embeds it within its preprocessing (i.e. it runs PCA on a subset of unrelated samples and projects the remaining samples onto those PCs)
- Kinship estimation robust to population structure using PC-Relate (via GENESIS)
  - This requires the PCA vectors from the PC-AiR step
  - The scaling strategy for this involves operating first on groups of samples (+ all variants) and then operating on pairs of results from sample blocks. That's a strategy I had thought about before but hadn't seen implemented anywhere until now
- Association testing using kinship estimates as random effects and PCs as fixed effects (via GENESIS)
- Visualization beyond the usual QQ/Manhattan plots (via LocusZoom)
SNPRelate + GWASTools + GENESIS together make for a pretty comprehensive toolkit.
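
To make the sample-block scaling strategy mentioned above a bit more concrete, here is a minimal NumPy sketch of the general pattern: a first pass over each block of samples (with all variants), then a second pass over pairs of block results. This is not the PC-Relate estimator and not GENESIS code. A plain genetic relationship matrix (GRM) stands in for the pairwise statistic, and `standardize_block` and `blockwise_grm` are hypothetical names made up for illustration.

```python
import numpy as np


def standardize_block(genotypes, allele_freq):
    """Per-block stage: center and scale one block of samples over all variants.

    genotypes: (n_block_samples, n_variants) alternate-allele counts in {0, 1, 2}.
    allele_freq: (n_variants,) alternate-allele frequencies.
    """
    return (genotypes - 2.0 * allele_freq) / np.sqrt(
        2.0 * allele_freq * (1.0 - allele_freq)
    )


def blockwise_grm(genotypes, block_size):
    """Assemble a sample-by-sample GRM tile by tile (illustrative stand-in only)."""
    allele_freq = genotypes.mean(axis=0) / 2.0
    # Drop monomorphic variants so the scaling never divides by zero.
    keep = (allele_freq > 0.0) & (allele_freq < 1.0)
    genotypes, allele_freq = genotypes[:, keep], allele_freq[keep]
    n_samples, n_variants = genotypes.shape

    # Stage 1: each block of samples is processed independently.
    blocks = [
        (start, standardize_block(genotypes[start:start + block_size], allele_freq))
        for start in range(0, n_samples, block_size)
    ]

    # Stage 2: each pair of block results yields one tile of the output matrix.
    grm = np.empty((n_samples, n_samples))
    for a, (start_a, z_a) in enumerate(blocks):
        for start_b, z_b in blocks[a:]:
            tile = z_a @ z_b.T / n_variants
            grm[start_a:start_a + len(z_a), start_b:start_b + len(z_b)] = tile
            grm[start_b:start_b + len(z_b), start_a:start_a + len(z_a)] = tile.T
    return grm


# Simulated data: 20 samples, 200 variants, block size that does not divide evenly.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.3, 0.7, size=200)
genotypes = rng.binomial(2, freqs, size=(20, 200))
grm = blockwise_grm(genotypes, block_size=7)
```

The appeal of this layout is that neither stage needs all samples in memory at once: stage 1 touches one block at a time, stage 2 touches two, and the tiles in stage 2 are independent of each other, so they could be computed in parallel.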
