choishingwan / PRSice

A software package for calculating, applying, evaluating and plotting the results of polygenic risk scores
http://prsice.info
GNU General Public License v3.0

Phenotype data #317

Closed RafaFariasVarona closed 1 year ago

RafaFariasVarona commented 1 year ago

Hello, I'm writing to ask whether it is possible to use PRSice-2 without knowing the phenotype. In my case, I'd like to calculate COVID-19 polygenic risk scores. First, I obtained the base data (a GWAS associated with COVID-19) and the target data (1000G European samples). Then I followed the steps of your PRS tutorial (https://choishingwan.github.io/PRS-Tutorial/) and computed a PRS for each individual using PLINK. I also used PRSice-2, but I'm not sure about the results it outputs. PRSice-2 requires the phenotype, but I don't have that information available. I therefore used the PRS values computed by PLINK as the phenotype (the file EUR.covid below), and I obtained different PRS values. I used the following command in PRSice-2:

    Rscript PRSice.R \
        --prsice PRSice_linux \
        --base listasnp.txt \
        --target EUR.QC \
        --binary-target F \
        --pheno EUR.covid \
        --stat OR \
        --thread 3 \

I noticed that the PRS values estimated by PRSice-2 differ a lot from those computed by PLINK. I also ran PRSice-2 with "--no-regress" and obtained yet another set of scores. Which approach should I use: PLINK scores as the phenotype, or the "--no-regress" option?

Thank you very much for your help. Regards, Rafa Farias

choishingwan commented 1 year ago

This is not an appropriate use of our software. Based on your comment, I suggest you first read our paper (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7612115/) to better understand what a polygenic score is and what optimization is involved.
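
As a rough illustration of the underlying point, a polygenic score is computed only from an individual's genotypes and the GWAS effect sizes. The phenotype is used afterwards, to choose which SNPs to include (for example, the p-value threshold). Feeding PLINK scores back in as the phenotype is therefore circular. The sketch below is a toy example, not the implementation used by PRSice-2 or PLINK. The function name `polygenic_score`, the `average` switch, and all numbers are hypothetical.

```python
import math


def polygenic_score(dosages, odds_ratios, average=True):
    """Toy polygenic score for one individual.

    dosages      effect-allele counts per SNP (0, 1 or 2)
    odds_ratios  GWAS odds ratios for the same SNPs, in the same order
    average      assumption: divide the sum by the number of alleles
                 (2 per SNP); tools differ in whether they report a
                 sum or an average, which alone can make scores differ
    """
    if len(dosages) != len(odds_ratios):
        raise ValueError("dosages and odds_ratios must have equal length")
    # Odds ratios are converted to log-odds so that effects add up.
    weights = [math.log(o) for o in odds_ratios]
    total = sum(d * w for d, w in zip(dosages, weights))
    if average:
        return total / (2 * len(dosages))
    return total


# Hypothetical individual with three SNPs; no phenotype is needed.
summed = polygenic_score([2, 0, 1], [1.2, 0.9, 1.0], average=False)
averaged = polygenic_score([2, 0, 1], [1.2, 0.9, 1.0], average=True)
print(summed, averaged)
```

The two printed values rank individuals identically but sit on different scales, so scores from two tools should be compared only after checking that both use the same SNP set and the same scaling.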