privefl / bigsnpr

R package for the analysis of massive SNP arrays.
https://privefl.github.io/bigsnpr/

LD Metrics Using 1000 Genomes and Beta Values Returning NA #518

Open Taewoong-Ha opened 1 month ago

Taewoong-Ha commented 1 month ago

Hi,

I am currently building population-specific PRS using LDpred2, and I have a couple of questions:

  1. It is recommended to use at least 2,000 individuals to build LD matrices. I am using the 1000 Genomes Project populations (EUR, EAS, SAS, AMR, AFR) to build ancestry-specific LD matrices. I have seen some papers take a similar approach, but each population has only around 500 individuals on average. Is it okay to proceed with these, or should I use the LD matrices provided with LDpred2 (such as HM3 and HM3+) regardless of ancestry?

  2. I am using the LDpred2 grid model, but when the parameter "p" is low, all the beta values come out as NA, and consequently the PRS is also NA. I saw a similar issue where the answer was that this can happen when "p" is low. Is this really fine? Could the small sample size used to build the LD matrices be contributing to the problem? Would using return_sampling_betas = TRUE help resolve it?
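
To make the pattern in point 2 concrete, here is a minimal base-R sketch of how the all-NA models can be flagged and set aside. It does not call bigsnpr: `beta_grid` and `params` are made-up stand-ins, assuming the real output is a matrix with one column of effect sizes per row of the parameter grid, and the values below are purely illustrative.

```r
# Hypothetical stand-in for the effect-size matrix from the grid model:
# one row per variant, one column per parameter combination.
set.seed(1)
beta_grid <- matrix(rnorm(5 * 4), nrow = 5, ncol = 4)
beta_grid[, c(1, 3)] <- NA_real_  # pretend the two low-p models returned NA

# Hypothetical parameter grid, one row per column of beta_grid.
params <- data.frame(
  p  = c(1e-4, 1e-3, 1e-4, 1e-2),
  h2 = c(0.1, 0.1, 0.3, 0.3)
)

# Flag the models whose effect sizes are all missing.
params$all_na <- apply(beta_grid, 2, function(b) all(is.na(b)))

# Keep only the usable models before computing any scores.
beta_ok   <- beta_grid[, !params$all_na, drop = FALSE]
params_ok <- params[!params$all_na, ]
print(params)
```

This only shows which parameter combinations produced NA; it does not address whether the small reference panel is the cause.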

Thank you for your help!

privefl commented 1 month ago
Taewoong-Ha commented 1 month ago

Thank you for your assistance!

privefl commented 1 month ago
Taewoong-Ha commented 1 month ago