privefl / bigsnpr

R package for the analysis of massive SNP arrays.
https://privefl.github.io/bigsnpr/

Related sample #465

Closed by JuanJoMV 1 year ago

JuanJoMV commented 1 year ago

Dear Florian,

Thank you so much for your great software.

I am trying to compute polygenic scores for one phenotype in a sample of around 1600 participants from my lab. I am using LDpred2-auto.

This sample comes from a twin registry, so it includes both MZ and DZ twins. I assume I can compute the PGS as I would for a regular sample, but I would then have to account for the relatedness in the regression model used to calculate R2.

I was planning to do this with a linear mixed-effects model that includes family as a random effect. I guess it is also possible to include the relatedness matrix as a random effect, but I am not aware of any software that handles an n x n relatedness matrix in a regression model.

I would love to hear your thoughts about the best way to calculate PGS using LDpred2 in a twin sample.

Thank you so much in advance. With all good wishes, JuanJo

privefl commented 1 year ago

This is a bit out of scope here.

My take is that having related individuals might not matter too much: I'm not sure it biases the correlation, but it does make the SE look too small. Maybe a bootstrap would work here for getting the SE. Another easy solution would be to restrict the analysis to an unrelated subsample.
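For illustration, here is a minimal sketch of the two options mentioned above: a bootstrap that resamples whole families (so twins stay together), and a filter that keeps one person per family. It is plain Python on made-up data, not part of bigsnpr. The function names (`family_bootstrap_se`, `one_per_family`) and the simulated `pgs`, `pheno` and `family` vectors are hypothetical, and resampling at the family level is an assumption about how the bootstrap might be set up for twin data.

```python
import random
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def group_by_family(family):
    """Map each family ID to the list of row indices of its members."""
    members = defaultdict(list)
    for i, fam in enumerate(family):
        members[fam].append(i)
    return members


def family_bootstrap_se(pgs, pheno, family, n_boot=1000, seed=1):
    """Bootstrap SE of cor(pgs, pheno), resampling families with replacement."""
    rng = random.Random(seed)
    members = group_by_family(family)
    fams = list(members)
    reps = []
    for _ in range(n_boot):
        # Draw as many families as in the data; every member of a drawn family is kept.
        idx = [i for fam in rng.choices(fams, k=len(fams)) for i in members[fam]]
        reps.append(pearson([pgs[i] for i in idx], [pheno[i] for i in idx]))
    mean = sum(reps) / n_boot
    return sqrt(sum((r - mean) ** 2 for r in reps) / (n_boot - 1))


def one_per_family(family, seed=1):
    """Row indices of an unrelated subsample: one random member per family."""
    rng = random.Random(seed)
    return sorted(rng.choice(ix) for ix in group_by_family(family).values())


# Made-up twin data (assumption): 200 pairs with correlated PGS and phenotype.
sim = random.Random(42)
family, pgs, pheno = [], [], []
for fam in range(200):
    shared_g, shared_e = sim.gauss(0, 1), sim.gauss(0, 1)
    for _ in range(2):
        g = 0.8 * shared_g + 0.6 * sim.gauss(0, 1)
        family.append(fam)
        pgs.append(g)
        pheno.append(0.3 * g + 0.5 * shared_e + sim.gauss(0, 1))

r = pearson(pgs, pheno)
se = family_bootstrap_se(pgs, pheno, family, n_boot=500)
keep = one_per_family(family)
print(f"r = {r:.3f}, family-bootstrap SE = {se:.3f}, unrelated n = {len(keep)}")
```

With real data, `pgs` would be the LDpred2-auto scores, `pheno` the phenotype, and `family` the twin-pair identifier from the registry. The same resampling scheme could wrap a regression R2 instead of a plain correlation.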

JuanJoMV commented 1 year ago

Dear Florian,

Thank you so much for this! It is really helpful.

With all good wishes, JuanJo