Can we directly apply the PRS for prediction?

harryyiheyang commented 1 year ago

As I understand it, PRScsx, P+T, and other PRS methods estimate SNP weights from z-scores, which implies that the genotypes behind those weights are assumed to be standardized. However, I'm not sure whether the individual-level genotype data also need to be standardized when applying the estimated weights to predict a disease in practice.

I asked Dr. Tian Ge this question, and he replied: "The output posterior SNP weights have been transformed to per-allele effect sizes. Therefore it is unnecessary to standardize the genotypes in the target dataset when applying the SNP weights." I am not sure whether this also holds for OTTERS.

I would greatly appreciate it if you could provide some clarification on this matter.

daiqile96 commented 1 year ago

For all the PRS methods implemented in OTTERS (SDPR, PRScs, and P+T), the estimated eQTL effect sizes are "standardized", meaning they assume standardized genotypes. This differs from the original PRScs tool by Dr. Tian Ge, which by default provides per-allele effect sizes.
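For readers who want the relationship between the two scales written out, here is a short sketch. It assumes Hardy-Weinberg equilibrium and uses notation that is not from OTTERS or PRScs: $g_{ij}$ is the 0/1/2 dosage of SNP $j$ in sample $i$, $p_j$ is its allele frequency, $\tilde{w}_j$ is a standardized weight, and $w_j$ is the corresponding per-allele weight. It illustrates the algebra only and does not describe what either tool does internally.

```latex
% Standardized genotype under Hardy-Weinberg equilibrium (assumed)
\tilde{g}_{ij} = \frac{g_{ij} - 2p_j}{\sqrt{2p_j(1-p_j)}}

% Predicted value from standardized weights, rewritten in terms of raw dosages
\hat{y}_i = \sum_j \tilde{w}_j \, \tilde{g}_{ij}
          = \sum_j \frac{\tilde{w}_j}{\sqrt{2p_j(1-p_j)}} \, g_{ij}
            \;-\; \sum_j \frac{2p_j \, \tilde{w}_j}{\sqrt{2p_j(1-p_j)}}

% Hence the per-allele weight implied by a standardized weight
w_j = \frac{\tilde{w}_j}{\sqrt{2p_j(1-p_j)}}
```

The last sum in the second line is the same for every sample, so it only shifts all predictions by a constant. The per-SNP factor $1/\sqrt{2p_j(1-p_j)}$ depends on the allele frequency used, which is where a frequency mismatch between the training and target data can enter.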

I've compared using standardized versus non-standardized genotypes to impute gene expression in GTEx data, and I found that non-standardized genotypes can yield better imputation accuracy. One possible reason is that allele frequencies in the target dataset differ from those in the dataset used to generate the summary statistics, which can make standardization unreliable in real data.

Therefore, even though OTTERS assumes standardized genotypes, I recommend using the estimated eQTL effect sizes from OTTERS with non-standardized genotypes for gene expression prediction.
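To make the two options concrete, here is a minimal sketch in plain Python of applying the same weights to raw dosages and to dosages standardized within the target sample. The weights, dosages, and the helper names `standardize` and `predict` are all hypothetical and made up for illustration; this is not OTTERS code, and real weights would come from the OTTERS output files.

```python
import math


def standardize(dosages):
    """Center and scale one SNP's dosages using the target sample's own mean and SD."""
    n = len(dosages)
    mean = sum(dosages) / n
    sd = math.sqrt(sum((g - mean) ** 2 for g in dosages) / n)
    if sd == 0:
        # Monomorphic SNP in the target sample: it carries no information.
        return [0.0] * n
    return [(g - mean) / sd for g in dosages]


def predict(weights, genotypes):
    """Weighted sum over SNPs. `genotypes` is SNP-major: one list of dosages per SNP."""
    n_samples = len(genotypes[0])
    scores = [0.0] * n_samples
    for w, snp in zip(weights, genotypes):
        for i, g in enumerate(snp):
            scores[i] += w * g
    return scores


# Hypothetical toy data: 2 SNPs x 4 samples, dosages coded 0/1/2.
weights = [0.5, -0.2]
dosages = [[0, 1, 2, 1], [2, 2, 1, 0]]

# Option 1: apply the weights to non-standardized dosages.
raw_scores = predict(weights, dosages)

# Option 2: standardize each SNP within the target sample first.
std_scores = predict(weights, [standardize(snp) for snp in dosages])

print("raw:", raw_scores)
print("standardized:", std_scores)
```

The two score vectors are on different scales and weight the SNPs differently relative to one another, so they are not interchangeable. Which one predicts better on real data is an empirical question, as the GTEx comparison above suggests.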

harryyiheyang commented 1 year ago

Thank you for the clarification!

daiqile96 / OTTERS

Can we directly apply the PRS for prediction? #4