Open kys21207 opened 3 years ago
Hey. I'm not totally sure what is driving the discrepancy, but there are a number of possibilities. MTAG's SNP filters are a little different from LDSC's, so the correlation could differ between the two sets of SNPs. Also, MTAG reports the correlation of marginal SNP effects, whereas LDSC reports the correlation of conditional SNP effects. It's also possible that these estimates aren't statistically that different. MTAG doesn't report a standard error on its correlation estimate (because it's not a genetic correlation estimation method), but the 95% CI on your LDSC estimate (0.4729 ± 1.96 × 0.0446) extends down to ~0.39. My guess is that the interval on the MTAG correlation estimate would be approximately as wide, which means it could be as large as ~0.36.
All that said, these two estimates are not meant to be comparable, so I'm not sure that you need to worry about it.
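The interval arithmetic in this reply can be checked with a few lines of Python. This is only a rough sketch: MTAG does not report a standard error for its correlation, so the code borrows the LDSC standard error for the MTAG estimate, which is an assumption made purely for illustration. The helper name `wald_ci` is hypothetical.

```python
def wald_ci(estimate, se, z=1.96):
    """Return a symmetric Wald-style interval: estimate +/- z * se."""
    half_width = z * se
    return estimate - half_width, estimate + half_width

# Values copied from the LDSC and MTAG output in this thread.
ldsc_rg, ldsc_se = 0.4729, 0.0446
mtag_corr = 0.271

ldsc_lo, ldsc_hi = wald_ci(ldsc_rg, ldsc_se)

# ASSUMPTION: MTAG reports no standard error, so we reuse the LDSC one
# to get a rough idea of how wide the MTAG interval might be.
mtag_lo, mtag_hi = wald_ci(mtag_corr, ldsc_se)

print(f"LDSC rg 95% CI:          [{ldsc_lo:.3f}, {ldsc_hi:.3f}]")
print(f"MTAG corr (borrowed SE): [{mtag_lo:.3f}, {mtag_hi:.3f}]")
```

Under that borrowed-width assumption the two intervals come close to each other, which is the point of the comparison. It is not a formal test, since both estimates are computed from the same summary statistics.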
On Fri, Dec 11, 2020 at 11:44 AM Kijoung Song notifications@github.com wrote:
Hi, I found a large discrepancy in the genetic correlation and the mean chi^2 between LDSC and MTAG: LDSC = 0.47 vs. MTAG = 0.27. Do you have any idea what could cause this? Please help.
LDSC:
Heritability of phenotype 1
Total Observed scale h2: 0.7912 (0.0769)
Lambda GC: 1.1364
Mean Chi^2: 1.2258
Intercept: 0.9805 (0.0145)
Ratio < 0 (usually indicates GC correction).
Heritability of phenotype 2/2
Total Observed scale h2: 0.3884 (0.0352)
Lambda GC: 1.1779
Mean Chi^2: 1.2648
Intercept: 0.9797 (0.0121)
Ratio < 0 (usually indicates GC correction).
Genetic Covariance
Total Observed scale gencov: 0.2621 (0.0281)
Mean z1*z2: 0.1988
Intercept: 0.0757 (0.0073)
Genetic Correlation
Genetic Correlation: 0.4729 (0.0446)
Z-score: 10.606
P: 2.794e-26
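As a sanity check on the LDSC block above, the reported genetic correlation can be recovered from the genetic covariance and the two heritability estimates using the standard definition rg = gencov / sqrt(h2_1 * h2_2). This is a minimal sketch with the numbers copied from the output; the variable names are illustrative only.

```python
import math

# Point estimates copied from the LDSC output in this post.
h2_trait1 = 0.7912   # Total Observed scale h2, phenotype 1
h2_trait2 = 0.3884   # Total Observed scale h2, phenotype 2
gencov = 0.2621      # Total Observed scale gencov

# Genetic correlation = genetic covariance scaled by the geometric
# mean of the two heritabilities.
rg = gencov / math.sqrt(h2_trait1 * h2_trait2)

print(f"rg recomputed from gencov and h2: {rg:.4f}")
```

The recomputed value agrees with the reported 0.4729 up to rounding of the inputs, so the LDSC numbers are internally consistent.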
MTAG:
2020/12/11/02:44:52 PM Beginning MTAG analysis...
2020/12/11/02:44:52 PM MTAG will use the Z column for analyses.
2020/12/11/02:45:17 PM Read in Trait 1 summary statistics (13725152 SNPs) from IRM1/data1.txt ...
2020/12/11/02:45:17 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2020/12/11/02:45:17 PM Munging Trait 1 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
2020/12/11/02:45:17 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2020/12/11/02:45:17 PM Interpreting column names as follows:
2020/12/11/02:45:17 PM snpid: Variant ID (e.g., rs number)
n: Sample size
a1: a1, interpreted as ref allele for signed sumstat.
pval: p-Value
a2: a2, interpreted as non-ref allele for signed sumstat.
z: Directional summary statistic as specified by --signed-sumstats.
2020/12/11/02:45:18 PM Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
2020/12/11/02:45:44 PM Read 13725152 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 0 SNPs with out-of-bounds p-values.
Removed 0 variants that were not SNPs. Note: strand ambiguous SNPs were not dropped.
13725152 SNPs remain.
2020/12/11/02:45:58 PM Removed 32650 SNPs with duplicated rs numbers (13692502 SNPs remain).
2020/12/11/02:46:00 PM Removed 0 SNPs with N < 0.0 (13692502 SNPs remain).
2020/12/11/02:48:25 PM Median value of SIGNED_SUMSTAT was -0.0031139881202, which seems sensible.
2020/12/11/02:48:25 PM Dropping snps with null values
2020/12/11/02:48:26 PM Metadata:
2020/12/11/02:48:28 PM Mean chi^2 = 1.317
2020/12/11/02:48:28 PM Lambda GC = 1.064
2020/12/11/02:48:28 PM Max chi^2 = 1344.936
2020/12/11/02:48:28 PM 22929 Genome-wide significant SNPs (some may have been removed by filtering).
2020/12/11/02:48:28 PM Conversion finished at Fri Dec 11 14:48:28 2020
2020/12/11/02:48:28 PM Total time elapsed: 3.0m:11.55s
2020/12/11/02:48:54 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2020/12/11/02:48:54 PM Munging of Trait 1 complete. SNPs remaining: 13725152
2020/12/11/02:48:54 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2020/12/11/02:49:23 PM Trait 1: Dropped 32650 SNPs for duplicate values in the "snp_name" column
2020/12/11/02:49:52 PM Read in Trait 2 summary statistics (17041690 SNPs) from IRM1/data2.txt ...
2020/12/11/02:49:52 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2020/12/11/02:49:52 PM Munging Trait 2 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
2020/12/11/02:49:52 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2020/12/11/02:49:52 PM Interpreting column names as follows:
2020/12/11/02:49:52 PM snpid: Variant ID (e.g., rs number)
n: Sample size
a1: a1, interpreted as ref allele for signed sumstat.
pval: p-Value
a2: a2, interpreted as non-ref allele for signed sumstat.
z: Directional summary statistic as specified by --signed-sumstats.
2020/12/11/02:49:52 PM Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
2020/12/11/02:50:20 PM Read 17041690 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 0 SNPs with out-of-bounds p-values.
Removed 0 variants that were not SNPs. Note: strand ambiguous SNPs were not dropped.
17041690 SNPs remain.
2020/12/11/02:50:36 PM Removed 45009 SNPs with duplicated rs numbers (16996681 SNPs remain).
2020/12/11/02:50:38 PM Removed 0 SNPs with N < 0.0 (16996681 SNPs remain).
2020/12/11/02:53:18 PM Median value of SIGNED_SUMSTAT was 0.00641381960098, which seems sensible.
2020/12/11/02:53:19 PM Dropping snps with null values
2020/12/11/02:53:20 PM Metadata:
2020/12/11/02:53:22 PM Mean chi^2 = 1.319
2020/12/11/02:53:23 PM Lambda GC = 1.067
2020/12/11/02:53:23 PM Max chi^2 = 1335.503
2020/12/11/02:53:23 PM 26559 Genome-wide significant SNPs (some may have been removed by filtering).
2020/12/11/02:53:23 PM Conversion finished at Fri Dec 11 14:53:23 2020
2020/12/11/02:53:23 PM Total time elapsed: 3.0m:31.33s
2020/12/11/02:53:53 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2020/12/11/02:53:53 PM Munging of Trait 2 complete. SNPs remaining: 17041690
2020/12/11/02:53:53 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2020/12/11/02:54:29 PM Trait 2: Dropped 45009 SNPs for duplicate values in the "snp_name" column
2020/12/11/02:54:39 PM Dropped 2123572 SNPs due to strand ambiguity, 11568930 SNPs remain in intersection after merging trait1
2020/12/11/02:55:13 PM Dropped 14210 SNPs due to inconsistent allele pairs from phenotype 2. 11554679 SNPs remain.
2020/12/11/02:55:22 PM Flipped the signs of of 1 SNPs to make them consistent with the effect allele orderings of the first trait.
2020/12/11/02:55:30 PM Dropped 0 SNPs due to strand ambiguity, 11554679 SNPs remain in intersection after merging trait2
2020/12/11/02:55:30 PM ... Merge of GWAS summary statistics complete. Number of SNPs: 11554679
2020/12/11/02:56:08 PM Using 11554679 SNPs to estimate Omega (0 SNPs excluded due to strand ambiguity)
2020/12/11/02:56:08 PM Estimating sigma..
2020/12/11/02:57:51 PM Checking for positive definiteness ..
2020/12/11/02:57:51 PM Sigma hat: [[0.968 0.071] [0.071 0.974]]
2020/12/11/02:57:52 PM Beginning estimation of Omega ...
2020/12/11/02:57:53 PM Using GMM estimator of Omega ..
2020/12/11/02:57:54 PM Checking for positive definiteness ..
2020/12/11/02:57:54 PM Completed estimation of Omega ...
2020/12/11/02:57:54 PM Beginning MTAG calculations...
2020/12/11/02:59:19 PM ... Completed MTAG calculations.
2020/12/11/02:59:19 PM Writing Phenotype 1 to file ...
2020/12/11/03:00:46 PM Writing Phenotype 2 to file ...
2020/12/11/03:02:14 PM Summary of MTAG results:
Trait ... GWAS equiv. (max) N
1 IRM1/data1.txt ... 15521
2 IRM1/data2.txt ... 35316
[2 rows x 7 columns]
Estimated Omega: [[2.370e-05 4.629e-06] [4.629e-06 1.235e-05]]
(Correlation): [[1. 0.271] [0.271 1. ]]
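For reference, the "(Correlation)" matrix printed by MTAG is just the estimated Omega rescaled by its diagonal, which can be verified by hand. The sketch below uses only the standard library; the helper name `cov_to_corr` is hypothetical and not part of MTAG.

```python
import math

# Estimated Omega copied from the MTAG log above.
omega = [
    [2.370e-05, 4.629e-06],
    [4.629e-06, 1.235e-05],
]

def cov_to_corr(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix."""
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

corr = cov_to_corr(omega)
print(f"Off-diagonal correlation implied by Omega: {corr[0][1]:.3f}")
```

The off-diagonal entry rounds to the 0.271 that MTAG reports, so the MTAG figure is the correlation implied by Omega, which is a different quantity from the LDSC rg computed from the h2 and gencov estimates.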