JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0

GWAS correlation and GWAS equivalent N #95

Open MarioGuCBMR opened 4 years ago

MarioGuCBMR commented 4 years ago

Hi, I am comparing two large GWAS from the same paper. Both are large meta-analyses of the same sample, but for two different traits. Each set of summary statistics has 27,000,000 SNPs, so I run out of memory. Hence, I decided to take a small subset of 180,000 SNPs, since it was the proportion of SNPs used in the original paper. I sampled these SNPs at random. In theory, this should work just fine, right? I am only testing whether MTAG works.

My results end up worsening some p-values. For the first trait I had 120 genome-wide significant SNPs and end up with around 60. For the second trait, the number of genome-wide significant SNPs improved a lot. I think that is to be expected, given the nature of the traits. However, the results contain some strange figures that make me think MTAG might not have run correctly:

Trait         # SNPs used  N (max)  N (mean)  GWAS mean chi^2  MTAG mean chi^2  GWAS equiv. (max) N
1  MTAG1.txt  153638       697693   498364    1.014            0.897            -5023474
2  MTAG2.txt  153638       806826   505661    1.782            1.599            618287

The first thing is the GWAS-equivalent (max) N, which ends up being negative!
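One way to read the negative value: the MTAG paper describes the GWAS-equivalent sample size as the original N scaled by the ratio of excess mean chi^2, that is (MTAG mean chi^2 - 1) / (GWAS mean chi^2 - 1). Assuming that is what this log column reports (an assumption, not checked against the mtag source), the sketch below plugs in the rounded values from the table. The function name `gwas_equivalent_n` is illustrative and not part of MTAG.

```python
def gwas_equivalent_n(n_max, gwas_mean_chi2, mtag_mean_chi2):
    """Scale N by the ratio of excess mean chi^2 (MTAG relative to GWAS).

    Assumed formula: N * (chi2_MTAG - 1) / (chi2_GWAS - 1).
    """
    return n_max * (mtag_mean_chi2 - 1.0) / (gwas_mean_chi2 - 1.0)


# Values copied from the summary table (the log rounds chi^2 to 3 decimals).
trait1 = gwas_equivalent_n(697693, 1.014, 0.897)
trait2 = gwas_equivalent_n(806826, 1.782, 1.599)

print(f"trait 1: {trait1:,.0f}")
print(f"trait 2: {trait2:,.0f}")
```

With these inputs, trait 2 comes out near 618,000, close to the logged 618287, and trait 1 comes out negative simply because its MTAG mean chi^2 is below 1. The remaining gap from the logged values is plausibly due to the three-decimal rounding in the table.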

Estimated Omega: [[-2.091e-07 5.282e-11] [ 5.282e-11 1.354e-06]]

(Correlation): [[nan nan] [nan 1.]]

The second is the correlation matrix for Omega just above, which is almost entirely NaN.

Estimated Sigma: [[1.619 0.258] [0.258 1.024]]

(Correlation): [[1. 0.201] [0.201 1. ]]
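The NaN entries are consistent with the negative first diagonal element of the estimated Omega. The sketch below assumes the printed correlations are obtained in the usual way, by dividing each covariance by the square roots of the two matching diagonal entries; that is an assumption about the log, and `cov_to_corr` is an illustrative helper, not an MTAG function. A non-positive diagonal entry has no real square root, so every entry that depends on it is reported as NaN.

```python
import math


def cov_to_corr(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix.

    Any entry that depends on a non-positive variance is returned as NaN.
    """
    k = len(cov)
    corr = [[float("nan")] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if cov[i][i] > 0 and cov[j][j] > 0:
                corr[i][j] = cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])
    return corr


# Matrices copied from the log output in this post.
omega = [[-2.091e-07, 5.282e-11], [5.282e-11, 1.354e-06]]
sigma = [[1.619, 0.258], [0.258, 1.024]]

omega_corr = cov_to_corr(omega)
sigma_corr = cov_to_corr(sigma)
print(omega_corr)
print(sigma_corr)
```

Under that assumption, only the second diagonal entry of the Omega correlation is defined, which matches the pattern in the log, while Sigma gives an off-diagonal value of about 0.20.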

MTAG weight factors: (average across SNPs) [0.842 0.835]

For both traits these are the headers that I used:

chr  pos      snpid       a1  a2  freq    beta    se      pval       n
1    2957600  rs12409277  C   T   0.1904  0.0136  0.0023  3.321e-09  630061
1    9329289  rs2071931   T   C   0.2146  0.0165  0.0021  1.23e-14   630042
1    9346583  rs72642703  G   A   0.8747  0.0183  0.0029  5.377e-10  485486

And these are the commands:

python mtag.py --sumstats MTAG2.txt,MTAG2.txt --out ./WHR_BMI_results --n_min 0.0 --use_beta_se --beta_name beta --se_name se --stream_stdout

I hope you know how to interpret these results! When I increase the number of SNPs to 500,000, MTAG raises an error while calculating the standardized betas, saying that some SEs are 0, when they are not.

paturley commented 4 years ago

What happens if you calculate the z-score for your summary statistics and use the Z/N specification?
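A minimal sketch of the z-score step, assuming whitespace-delimited input with the `beta` and `se` columns shown in the original post and computing z = beta / se. The helper name `add_z_column` and the output column name `z` are illustrative; how MTAG should be pointed at the new column is covered by its own documentation and is not shown here. The explicit check for non-positive SE values may also help locate the rows behind the "SE is 0" error.

```python
def add_z_column(lines, beta_col="beta", se_col="se"):
    """Yield whitespace-split rows with a trailing z = beta / se column."""
    rows = iter(lines)
    header = next(rows).split()
    b, s = header.index(beta_col), header.index(se_col)
    yield header + ["z"]
    for line_no, line in enumerate(rows, start=2):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        beta, se = float(fields[b]), float(fields[s])
        if se <= 0:
            # Report the offending row instead of dividing by zero.
            raise ValueError(f"line {line_no}: non-positive SE {fields[s]!r}")
        yield fields + [f"{beta / se:.6g}"]


# The three example rows from the original post.
sample = """chr pos snpid a1 a2 freq beta se pval n
1 2957600 rs12409277 C T 0.1904 0.0136 0.0023 3.321e-09 630061
1 9329289 rs2071931 T C 0.2146 0.0165 0.0021 1.23e-14 630042
1 9346583 rs72642703 G A 0.8747 0.0183 0.0029 5.377e-10 485486"""

for row in add_z_column(sample.splitlines()):
    print("\t".join(row))
```

For a real file, the same generator can be fed an open file handle, and the rows written out tab-separated.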

On Mon, May 25, 2020 at 1:15 PM MarioGuCBMR wrote:
