Closed by lcstoshio 11 months ago
Hi Lucas -- I think your understanding is correct. The coefficients are the regression coefficients estimated by fitting the linear regression. I don't use R, but it looks like if you fit the linear regression with the 'lm' function, the 'coefficients' component of the returned 'lm' object is what you need.
Okay, thank you for the quick response, it helped a lot.
I just came across another question: the sample size of my target data is about 2000 individuals (300 cases and 1700 controls). Is there a right proportion in which I should split my data between validation and testing?
I don't know if the sample is too small to split, or whether I should use the automatic parameters instead (phi auto and --meta).
The case number does appear to be on the smaller side. I think auto+meta might be a better choice.
Got it, thank you so much for the help!!
Dear Tian Ge,
Sorry if it's a dumb question, but I am trying to run PRScsx and I am stuck on how to get the coefficients for each ancestry in the linear combination.
For context, I am running PRScsx with three ancestries (European, African and Native American) and computed the individual scores for each ancestry in PLINK, but I don't know how to proceed from here to get a single final score.
From what I read, I should fit a linear regression in the validation dataset and learn the coefficients from it: lm(y ~ PRS_EUR + PRS_AFR + PRS_AMR + covariates)
But I don't know which values in the regression output are the coefficients, or how to proceed from there (is it the "Estimate" column? I am running everything in R).
And after I get the coefficients, I should just do this, right? PRS <- coef_EUR * PRS_EUR + coef_AFR * PRS_AFR + coef_AMR * PRS_AMR, then fit lm(y ~ PRS + covariates) and calculate R2?
Thank you so much. Lucas
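For anyone landing here later, the workflow discussed in this thread can be sketched in R roughly as follows. This is only an illustration on simulated data, not code from the PRScsx authors: the covariates (age, sex), the 50/50 validation/testing split, the continuous phenotype, and all variable names other than PRS_EUR, PRS_AFR and PRS_AMR are assumptions made for the example. Reporting R2 as the gain over a covariates-only model is one common convention, also assumed here.

```r
# Simulated stand-in for the target data; every value here is made up.
set.seed(1)
n <- 2000
dat <- data.frame(
  PRS_EUR = rnorm(n),
  PRS_AFR = rnorm(n),
  PRS_AMR = rnorm(n),
  age     = rnorm(n, mean = 50, sd = 10),  # hypothetical covariate
  sex     = rbinom(n, 1, 0.5)              # hypothetical covariate
)
dat$y <- 0.30 * dat$PRS_EUR + 0.15 * dat$PRS_AFR + 0.10 * dat$PRS_AMR +
  0.02 * dat$age + rnorm(n)

# Split into a validation set (to learn the weights) and a testing set
# (to evaluate the combined score). The 50/50 split is an assumption.
idx  <- sample(n, n / 2)
val  <- dat[idx, ]
test <- dat[-idx, ]

# Step 1: fit the regression in the validation set.
fit_val <- lm(y ~ PRS_EUR + PRS_AFR + PRS_AMR + age + sex, data = val)

# coef() returns the same numbers shown in the "Estimate" column of
# summary(fit_val); keep only the three ancestry-specific weights.
w <- coef(fit_val)[c("PRS_EUR", "PRS_AFR", "PRS_AMR")]

# Step 2: build the single combined score in the testing set.
test$PRS <- w["PRS_EUR"] * test$PRS_EUR +
  w["PRS_AFR"] * test$PRS_AFR +
  w["PRS_AMR"] * test$PRS_AMR

# Step 3: evaluate the combined score in the testing set. Here R2 is
# reported as the gain over a covariates-only model (an assumed convention).
fit_full <- lm(y ~ PRS + age + sex, data = test)
fit_null <- lm(y ~ age + sex, data = test)
r2_incr  <- summary(fit_full)$r.squared - summary(fit_null)$r.squared

print(w)
print(r2_incr)
```

With real data, `dat` would be replaced by a data frame holding the PLINK scores, phenotype and covariates, and the weights would be learned only from the validation individuals so that the testing R2 is not inflated.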