JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0

Credibility of MTAG loci #61

Open mpx353 opened 5 years ago

mpx353 commented 5 years ago

Dear Omeed,

When running MTAG on the GWAS results of two related traits (UKB data, phenotypic correlation 0.6), with sample overlap, I picked up some loci that were not significant in the GWAS. However, I also noticed that for some variants that I did pick up in the GWAS, the mtag_pval is almost flat in the MTAG results. For example:

GWAS results (SNP | CHR | BP | EA | EAF | Pval | N | beta | SE)

rs12940610 | 17 | 64312463 | A | 0.422 | 7.40E-12 | 51399 | 0.0391154 | 0.00571079

Whereas in the MTAG results, the same SNP gives (Pval | N | beta | SE): 2.26E-03 | 51399 | 0.018 | 0.006
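As a quick sanity check on the numbers above, the p-values can be recomputed from beta and SE. This is a minimal sketch assuming a two-sided normal (Wald) test; the function name is illustrative and not part of MTAG. The MTAG beta and SE quoted here are rounded, so the recomputed value will only roughly match the reported 2.26E-03.

```python
import math

def wald_p_value(beta: float, se: float) -> float:
    """Two-sided p-value for z = beta / se under a standard normal null."""
    z = beta / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) and stays accurate for large |z|.
    return math.erfc(abs(z) / math.sqrt(2.0))

gwas_p = wald_p_value(0.0391154, 0.00571079)  # GWAS row for rs12940610
mtag_p = wald_p_value(0.018, 0.006)           # MTAG row (rounded beta and SE)
print(f"GWAS p = {gwas_p:.2e}, MTAG p = {mtag_p:.2e}")
```

The standard error is nearly unchanged between the two rows, so the loss of significance comes almost entirely from the MTAG beta being about half the GWAS beta.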

Previous threads here have discussed the role of the mean GWAS chi^2 to assess the reliability of MTAG results. I had a look at the log output, and indeed the mean GWAS chi^2 is below 1.1 for both traits.

Trait | N (max) | N (mean) | # SNPs used | GWAS mean chi^2 | MTAG mean chi^2 | GWAS equiv. (max) N

1 ..._MTAG_010518.txt | 51884 | 50766 | 7168585 | 1.085 | 1.085 | 51993

2 ..._MTAG_010518.txt | 51503 | 50394 | 7168585 | 1.049 | 1.055 | 57041

I was wondering whether you could give any advice on how to proceed, and whether you would still recommend using MTAG in this case. From the paper and the discussion here, it is clear that the mean chi^2 is an important indicator of the reliability of the MTAG analysis. Also, could you perhaps speculate how it can happen that two moderately related traits derived from the same cohort cannot be used for MTAG? Could this also indicate a problem with the original GWAS or phenotype? The GWAS used as input had been filtered such that MAF>0.1 and INFO score>0.3.

Many thanks, Stefan

paturley commented 5 years ago

Hi Stefan,

Apologies for the delay in replying. If a SNP is significant before applying MTAG but not afterwards, it is likely because the beta estimate for the correlated trait has an opposite sign or an estimate that is much smaller in magnitude.

Your case is sort of on the edge of what I would think is fine for MTAG. Have you run the maxFDR analysis? I like to do that to get a sense of how reliable the MTAG hits will be.

I don't understand your last question. Can you clarify?

mpx353 commented 5 years ago

Many thanks for your reply. I had not performed the maxFDR yet, doing this now.

My last question was about the mean chi^2 value. I was just wondering why my mean chi^2 values are low. For example, do opposite beta estimates between correlated traits contribute to a low mean chi^2 value?

Thanks!

paturley commented 5 years ago

The mean chi2 scales with the sample size and heritability of the traits you use. With just 50k samples in your data, your mean chi2 stats seem about right to me. The increase in mean chi2 for the MTAG summary stats depends on the sample overlap, the genetic correlation, and the phenotypic correlation. Since you have total overlap, I would expect to see a slight gain in the mean chi2 but not a large one, which it looks like is the case.

Best, Patrick
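The scaling described above can be made concrete with the expectation used in LD score regression: mean chi^2 is approximately 1 + N * h2 * (mean LD score) / M + N * a, where M is the number of SNPs and a captures confounding. This relation comes from the LD score regression literature rather than from this thread, and every input value in the sketch below is a hypothetical placeholder, not an estimate for the traits discussed here.

```python
def expected_mean_chi2(n, h2, m_snps, mean_ld_score, confounding=0.0):
    """Approximate expected mean chi^2 under the LD score regression model.

    n             : GWAS sample size
    h2            : SNP heritability of the trait
    m_snps        : number of SNPs the heritability is spread over
    mean_ld_score : average LD score of the SNPs
    confounding   : per-individual confounding term (0 means none)
    """
    return 1.0 + n * h2 * mean_ld_score / m_snps + n * confounding

# Hypothetical inputs, chosen only to show how the statistic scales with N.
base = expected_mean_chi2(n=50_000, h2=0.1, m_snps=1_000_000, mean_ld_score=20.0)
doubled = expected_mean_chi2(n=100_000, h2=0.1, m_snps=1_000_000, mean_ld_score=20.0)
print(f"N = 50k: {base:.3f}, N = 100k: {doubled:.3f}")
```

With no confounding, the excess over 1 is proportional to both N and h2, which is why a sample of about 50k can give a mean chi^2 only slightly above 1 for a modestly heritable trait.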

mpx353 commented 5 years ago

Thanks Patrick. I just got the max FDR results:

Maximum FDR
Max FDR of Trait 1: 0.0985271605346 at probs = [0. 0.5 0. 0.5]
Max FDR of Trait 2: 0.212188308953 at probs = [0. 0. 0.6 0.4]

Especially for trait 2, the FDR is very high, which may explain our results and probably reflects the fact that our sample size is low?

paturley commented 5 years ago

Yeah. You might consider calculating the maxFDR for each trait run separately as a baseline. The maxFDR will likely be pretty high then as well.

mpx353 commented 5 years ago

Just for your interest: Since the maxFDR was quite high, I was interested in the actual overlap of significant variants in the GWAS vs. MTAG. See below the results:

Trait 1 (maxFDR ~ 0.10):
No. of SNPs GWS in both GWAS and MTAG: 1484 (~0.017% of total number of SNPs)
No. of SNPs GWS in GWAS but not MTAG: 3
No. of SNPs GWS in MTAG but not GWAS: 31 (only ~2% increase with respect to GWAS findings)

Trait 2 (maxFDR ~ 0.21):
No. of SNPs GWS in both GWAS and MTAG: 98 (=0.001%)
No. of SNPs GWS in GWAS but not MTAG: 36
No. of SNPs GWS in MTAG but not GWAS: 146 (~9% increase with respect to GWAS findings)

For trait 2, there was a larger increase in significant variants than for trait 1, which is in line with the higher maxFDR for trait 2. However, the increase is lower than what I would have expected from the maxFDR results. Theoretically, I would have expected to find ~21% more variants with MTAG for trait 2, but I suppose that's why we call it the maxFDR and not the meanFDR?
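For anyone wanting to reproduce this kind of tally, here is a minimal sketch of the three-way comparison. It assumes the p-values have already been loaded into dictionaries keyed by SNP ID and that "GWS" means p < 5e-8; the function name and all SNP IDs other than rs12940610 are hypothetical.

```python
def compare_hits(gwas_p, mtag_p, threshold=5e-8):
    """Count SNPs significant in both analyses, in GWAS only, and in MTAG only.

    Only SNPs present in both inputs are compared.
    """
    shared = gwas_p.keys() & mtag_p.keys()
    gwas_hits = {snp for snp in shared if gwas_p[snp] < threshold}
    mtag_hits = {snp for snp in shared if mtag_p[snp] < threshold}
    return {
        "both": len(gwas_hits & mtag_hits),
        "gwas_only": len(gwas_hits - mtag_hits),
        "mtag_only": len(mtag_hits - gwas_hits),
    }

# rs12940610 uses the p-values quoted earlier in this thread; the rest are made up.
gwas = {"rs12940610": 7.40e-12, "snp_a": 1e-9, "snp_b": 3e-6, "snp_c": 0.2}
mtag = {"rs12940610": 2.26e-3, "snp_a": 4e-10, "snp_b": 2e-8, "snp_c": 0.3}
counts = compare_hits(gwas, mtag)
print(counts)
```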

Thanks again for your help with this issue!

paturley commented 5 years ago

No problem. Glad to help.

mpx353 commented 5 years ago

To finish/summarise this thread: do you think one could use MTAG in this case (provided one is aware of the potential limitations implied by the high FDR), or would you recommend not using MTAG in this case? I couldn't find a clear threshold for the FDR, but I suspect 5% would be a good value? Thanks for your advice.

paturley commented 5 years ago

I don't know that there is a clear threshold. It really depends on the application. Have you run maxFDR on each of the sets of summary statistics separately? Sometimes that's a good baseline to compare how good the MTAG summary statistics are relative to the GWAS summary stats.

Patrick

Xuemin-Wang commented 4 years ago

Hi Patrick,

You mentioned that if a SNP is significant before applying MTAG but not afterwards, it is likely because the beta estimate for the correlated trait has an opposite sign or an estimate that is much smaller in magnitude. Does this mean that negatively correlated traits cannot be included in an MTAG analysis? Would MTAG be unlikely to detect significant GWAS loci in that case?

Many thanks,

paturley commented 4 years ago

MTAG will work fine if traits have a negative genetic correlation as long as the assumptions of MTAG are met. More precisely, I should have said that SNPs may fall from significance if the other trait has an association that has an opposite sign than expected. So if the traits are negatively correlated in general, but some SNPs have estimates that have the same sign of association, a SNP that was previously statistically significant may lose statistical significance. But this is not the only reason that a SNP may lose statistical significance. Sometimes that occurs by pure chance due to sampling variation even if all assumptions are met.
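The sign argument can be illustrated with a toy calculation. This is not MTAG's estimator: it is a simplified linear combination of two z-scores that assumes independent estimation errors (no sample overlap) and a fixed, hypothetical weight whose sign stands in for the sign of the genetic correlation. MTAG derives its weights from estimated genetic and error covariance matrices.

```python
import math

def toy_combined_z(z_target, z_other, weight):
    """Toy combined z-score for the target trait.

    Assumes the two z-scores have independent unit-variance errors, so the
    linear combination is rescaled by sqrt(1 + weight^2) to keep unit variance.
    """
    return (z_target + weight * z_other) / math.sqrt(1.0 + weight ** 2)

weight = -0.5   # hypothetical: negative, mimicking a negative genetic correlation
z_target = 6.0  # hypothetical z-score of the SNP in the target-trait GWAS

# Other trait has the expected (opposite) sign: the combined signal strengthens.
z_expected_sign = toy_combined_z(z_target, -3.0, weight)
# Other trait has the unexpected (same) sign: the combined signal weakens.
z_unexpected_sign = toy_combined_z(z_target, 3.0, weight)

print(f"expected sign: {z_expected_sign:.2f}, unexpected sign: {z_unexpected_sign:.2f}")
```

In this toy setting the SNP only loses strength when the second trait's estimate goes against the direction implied by the overall correlation, matching the description above.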
