choishingwan / PRS-Tutorial

A tutorial on how to run basic polygenic risk score analysis
MIT License

subsetting target dataset prior to analysis #25

Closed EugeneEA closed 3 years ago

EugeneEA commented 3 years ago

Hi, I have the following question: shouldn't I first filter the target dataset down to the SNPs that are present in the base dataset, so that later steps work only with the overlapping subset of SNPs? This seems like a critical thing to do before pruning. Best, Eugene

choishingwan commented 3 years ago

Pruning is mainly used for QC and PCA calculation. After obtaining the quality control metrics, we use the full data set for the PRS calculation. At that point, we match the overlapping SNPs between the base and target data before clumping, so that as much data as possible is retained. Hope this helps.
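For readers who want to see what the overlap matching amounts to, here is a minimal Python sketch. It is an illustration, not the tutorial's own code. It assumes the base summary statistics are tab-delimited with a header column named `SNP`, and that the target variants come from a PLINK `.bim` file, where the variant ID is the second whitespace-separated field. The function name `overlapping_snps` and the toy data are hypothetical.

```python
import csv
import io


def overlapping_snps(base_lines, bim_lines, base_snp_column="SNP"):
    """Return target SNP IDs that also appear in the base summary statistics.

    base_lines: iterable of tab-delimited lines with a header row
                (the column name "SNP" is an assumption).
    bim_lines:  iterable of PLINK .bim lines (variant ID is field 2).
    """
    base_reader = csv.DictReader(base_lines, delimiter="\t")
    base_ids = {row[base_snp_column] for row in base_reader}

    shared = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        snp_id = fields[1]
        if snp_id in base_ids:
            shared.append(snp_id)  # keep the target file's order
    return shared


# Toy data standing in for real files (hypothetical values).
base_text = (
    "CHR\tBP\tSNP\tA1\tA2\tP\n"
    "1\t1000\trs1\tA\tG\t0.01\n"
    "1\t2000\trs2\tC\tT\t0.5\n"
)
bim_text = (
    "1\trs2\t0\t2000\tC\tT\n"
    "1\trs3\t0\t3000\tA\tG\n"
)

shared = overlapping_snps(io.StringIO(base_text), io.StringIO(bim_text))
print(shared)
```

With real data, open file handles can be passed in place of the `io.StringIO` objects. The resulting IDs could then be written one per line and given to PLINK's `--extract` option, if an explicit subset is wanted before clumping.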

EugeneEA commented 3 years ago

Thanks for the fast reply, it indeed helped