choishingwan / PRS-Tutorial

A tutorial on how to run basic polygenic risk score analysis
MIT License

Question on PRSice-2 section results & documentation #22

Open james16292 opened 3 years ago

james16292 commented 3 years ago

Hi and first of all thank you for your excellent paper/tutorial and documentation on conducting PRS analyses.

I am having an issue replicating one of the results in the PRSice-2 section of the tutorial. I can replicate the best-fit P-value threshold of 0.13995, but not the phenotypic variance explained of 0.209902; the PRS.R2 I get in the EUR.summary file is 0.162601. As a sanity check I looked in the EUR.prsice file, and for the 0.3 P-value threshold the prs.r2 value is the one reported in the results of the PLINK section of the tutorial, so there doesn't seem to be anything majorly wrong with the analysis results.
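The sanity check described above can be sketched in Python. This is a minimal sketch, not part of the tutorial: it assumes the EUR.prsice file is a whitespace-delimited table with the header quoted further down in this thread (`Pheno Set Threshold R2 P Coefficient Standard.Error Num_SNP`) and that "best-fit" means the row with the largest R2. The helper names `read_table`, `best_fit` and `at_threshold` are hypothetical.

```python
import io


def read_table(handle):
    """Parse a whitespace-delimited table with a header row into a list of dicts."""
    header = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if len(fields) == len(header):  # skip blank or malformed lines
            rows.append(dict(zip(header, fields)))
    return rows


def best_fit(rows, r2_col="R2"):
    """Return the row with the largest R2 (assumed definition of 'best-fit')."""
    return max(rows, key=lambda r: float(r[r2_col]))


def at_threshold(rows, threshold, tol=1e-9):
    """Return all rows whose Threshold equals the requested value."""
    return [r for r in rows if abs(float(r["Threshold"]) - threshold) <= tol]


# Demo on the single .prsice row quoted later in this thread.
sample = io.StringIO(
    "Pheno Set Threshold R2 P Coefficient Standard.Error Num_SNP\n"
    "- Base 0.13995 0.166117 3.81121e-26 36115 3212.41 85982\n"
)
rows = read_table(sample)
best = best_fit(rows)
print(best["Threshold"], best["R2"])
```

On a real run, replacing `sample` with `open("EUR.prsice")` and calling `at_threshold(rows, 0.3)` would give the row to compare against the PLINK section.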

The way I did the analysis to get these results:

  1. I ran the base and target data QC sections exactly as described
  2. Since the PRSice-2 section requires a EUR.eigenvec file with population PCs, after the QC I ran only the plink commands to get the PCs from the PLINK section
  3. Then I went to the PRSice-2 section, merged the EUR.cov and EUR.eigenvec files, and ran the PRSice-2 command to generate the results reported above.
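For readers following the same steps, the merge in step 3 can be sketched in Python. This is only an illustration under stated assumptions: EUR.cov is assumed to have a header row starting with `FID IID`, EUR.eigenvec is assumed to have no header (FID, IID, then one column per PC), and the sample IDs and values below are made-up toy data. `merge_covariates` is a hypothetical helper, not a PRSice or PLINK function.

```python
import io


def merge_covariates(cov_handle, eigenvec_handle, n_pcs):
    """Join covariates and PCs on (FID, IID); returns rows including a header."""
    cov_header = cov_handle.readline().split()
    cov = {}
    for line in cov_handle:
        fields = line.split()
        if fields:
            cov[(fields[0], fields[1])] = fields[2:]

    pc_names = [f"PC{i}" for i in range(1, n_pcs + 1)]
    merged = [cov_header + pc_names]
    for line in eigenvec_handle:
        fields = line.split()
        if not fields:
            continue
        key = (fields[0], fields[1])
        if key in cov:  # keep only samples present in both files
            merged.append(list(key) + cov[key] + fields[2:2 + n_pcs])
    return merged


# Toy input (made-up IDs and values, for illustration only).
cov_file = io.StringIO("FID IID Sex\nF1 I1 1\nF2 I2 2\n")
eigenvec_file = io.StringIO("F1 I1 0.01 -0.02\nF2 I2 0.03 0.04\n")
for row in merge_covariates(cov_file, eigenvec_file, n_pcs=2):
    print(" ".join(row))
```

Samples missing from either file are dropped, so a mismatch between the QCed files and the PCs would silently change the sample set used in the regression.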

Did I not follow the tutorial's intended workflow correctly? If it was correctly done, could you run this on your end and confirm the different result from the one in the documentation?

Some further notes

I apologise in advance if I have misunderstood some part of the tutorial or process, thank you again for your time and help.

tienpm7723 commented 2 years ago

I have the same question. I ran PRSice on macOS and also got a best-fit P-value threshold of 0.13995 and a PRS.R2 of 0.162601 in the EUR.summary file; I don't know where the number 0.209902 comes from. I'd appreciate any help in understanding this part of the tutorial!

choishingwan commented 2 years ago

Didn't realize I forgot to reply to this. Did you use the QCed file downloaded in the PRSice-2 section, or did you use the file generated in the previous section? If I remember correctly, I have not updated the QCed file with the correct files (as I kinda forgot the password to that Google Drive...), so the R2 might differ a bit.

The most likely scenario is just that I have not updated the tutorial to reflect the correct number. Unfortunately, I haven't got time to do so lately, will put this on my to-do list and hopefully I will be able to address it in the near future.

I think I did address the --base-maf 0.05 issue with window, as it seems to be correct now.

For the last question on why I did that, it is just to demonstrate that we can sometimes do it with the program directly (and because I wrote PRSice). And yes, in this context, it does not do anything extra.

Hope this helps. Happy holidays.

Sam

tienpm7723 commented 2 years ago

Thanks for your help!

I have used both the QCed file from the PRSice-2 section and the file generated in the QC section. The results differ slightly, but I still don't know where the number 0.209902 mentioned at the end of the PRSice-2 section comes from.

My final result: QCed file:

Generated file:

choishingwan commented 2 years ago

You are right, I got 0.161237 for PRSice. Will update the document

michaelofrancis commented 1 year ago

Hi, I just did this tutorial today so I'm not an expert with the software but.... Using the provided QC file, I got this in EUR.summary:

    Phenotype  Set   Threshold  PRS.R2    Full.R2   Null.R2   Prevalence  Coefficient  Standard.Error  P            Num_SNP
    '-         Base  0.13995    0.214442  0.391467  0.225349  -           36115        3212.41         3.81121e-26  85982

Is this correct? I am getting different R2 values than what has been posted here.

Also this page still says 0.3 is the best-fit threshold https://choishingwan.github.io/PRS-Tutorial/prsice/

Thanks!

emmayu001 commented 1 year ago

I just re-ran the code and got:

    "Pheno  Set Threshold   R2  P   Coefficient Standard.Error  Num_SNP
      - Base    0.13995 0.166117    3.81121e-26 36115   3212.41 85982"
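One reading that is arithmetically consistent with the numbers posted in this thread, offered as an assumption rather than anything the maintainer has confirmed: the R2 in EUR.prsice looks like the incremental value Full.R2 − Null.R2, while the PRS.R2 in the EUR.summary row posted above looks like that increment divided by (1 − Null.R2). The sketch below only checks the arithmetic; the second half additionally assumes that the Null.R2 of 0.225349 also applies to the run at the top of the thread, which was not reported there.

```python
# Values copied from the EUR.summary and EUR.prsice rows posted in this thread.
full_r2 = 0.391467
null_r2 = 0.225349
summary_prs_r2 = 0.214442   # PRS.R2 column in EUR.summary
prsice_r2 = 0.166117        # R2 column in EUR.prsice

# Incremental R2: variance explained by the full model beyond the covariates.
incremental = full_r2 - null_r2
# Same increment rescaled by the variance left unexplained by the covariates.
rescaled = incremental / (1.0 - null_r2)

print(f"{incremental:.6f}")  # 0.166118 (EUR.prsice shows 0.166117)
print(f"{rescaled:.6f}")     # 0.214442 (matches PRS.R2 in EUR.summary)

# ASSUMPTION: the same Null.R2 applies to the first run in this thread,
# where PRS.R2 was 0.162601 and the tutorial documented 0.209902.
first_run_r2 = 0.162601
print(f"{first_run_r2 / (1.0 - null_r2):.6f}")  # 0.209902
```

If this reading holds, the differing values would reflect two ways of reporting the same fit rather than different analyses, but that would need confirmation against the PRSice-2 source or documentation.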

I'm also confused about where these two numbers on this page come from:

    "Which P-value threshold generates the "best-fit" PRS?
    0.3"

    "How much phenotypic variation does the "best-fit" PRS explain?
    0.161237"

michtrofimov commented 1 year ago

The problem with the P-value threshold is still present as of 06.06.2023 @choishingwan