bulik / ldsc

LD Score Regression (LDSC)
GNU General Public License v3.0

HapMap3 variants #322

Open remomomo opened 3 years ago

remomomo commented 3 years ago

Hi,

I am comparing different PRS methods as part of a larger project in different biobanks.

We are trying to come to a common set of reference SNPs to use, and wanted to start by using the HapMap3 variants. As the original HapMap3 is in hg18, and many of the rsIDs in that data have changed over time, we wanted to make an updated version with the latest rsIDs and positions in hg19 and hg38 to facilitate harmonisation of variants across different biobanks.

However, I realised that simply mapping the locations of the old HapMap hg18 release to the newer genome builds hg19 and hg38 using liftOver causes us to lose some variants that seem to be present in other tools that use HapMap3 (PRScs, LDSC, GCTB).

Could you explain how you generated the HapMap3 reference SNPs used in ldsc, or point us towards a place where this is documented?

Best,

Remo

Al-Murphy commented 2 years ago

Hey @remomomo, has this been answered? I'm wondering what the genome build of the HapMap3 SNPs is. I'm looking for an hg19 version; do you know where I could find one (whether created using liftOver or not)?

remomomo commented 2 years ago

Hi, I haven’t heard back from them. We ended up performing the mapping with liftOver. You might be able to find something on the Pan-UKBB website; I think they computed LD scores for HapMap3 variants, although I’m unsure which genome build those are. https://pan-dev.ukbb.broadinstitute.org/downloads
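
For anyone taking the same liftOver route, here is a minimal sketch of how one might list the variants that drop out between builds, by comparing the rsIDs in the input and lifted BED files. It assumes 4-column BED records with the rsID in the name column; the function names and the toy coordinates are hypothetical and only for illustration, and this is not how the ldsc reference set was generated.

```python
def read_bed_ids(lines):
    """Collect the name column (4th field, assumed to hold the rsID) from BED lines."""
    ids = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Ignore records that have no name column.
        if len(fields) < 4:
            continue
        ids.add(fields[3])
    return ids


def lost_variants(original_lines, lifted_lines):
    """Return the sorted rsIDs present before the lift but absent afterwards."""
    return sorted(read_bed_ids(original_lines) - read_bed_ids(lifted_lines))


# Toy records (made-up coordinates): rs2 fails to map to the new build.
hg18 = ["chr1\t100\t101\trs1", "chr1\t200\t201\trs2", "chr2\t300\t301\trs3"]
hg19 = ["chr1\t150\t151\trs1", "chr2\t350\t351\trs3"]

print(lost_variants(hg18, hg19))
```

With real files, the lines could come from `open(path)` on the BED passed to liftOver and on the BED it produced. liftOver also writes the records it could not map to the unmapped file given as its last argument, which is another way to get the same list.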