JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0

Increase in Statistical Power #153

Open pjordab opened 2 years ago

pjordab commented 2 years ago

Hi users and developers,

After running MTAG, the mean chi^2 of TRAIT_1 increased from 1.088 to 1.417.

Based on the ratio of excess mean chi^2, (mean chi^2 MTAG - 1)/(mean chi^2 GWAS - 1), my GWAS equiv. (max) N should increase to about 474% of the original (from 224727 to 1064899); however, it only increases from 224727 to 310368 (138%).
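The two ratios being compared can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation described in the post, using the numbers from the post and the log; the variable names are illustrative and are not taken from MTAG itself.

```python
# Values taken from the post and the attached log for TRAIT_1.
n_gwas = 224727            # original GWAS N
chi2_gwas = 1.088          # mean chi^2 reported during munging
chi2_mtag = 1.417          # "MTAG mean chi^2" from the summary table
n_equiv_reported = 310368  # "GWAS equiv. (max) N" from the summary table

# Ratio of excess mean chi^2 (signal above the null expectation of 1).
expected_ratio = (chi2_mtag - 1) / (chi2_gwas - 1)
expected_n = n_gwas * expected_ratio

# Ratio implied by the N that MTAG actually reported.
reported_ratio = n_equiv_reported / n_gwas

print(f"expected ratio: {expected_ratio:.2f}, expected N: {expected_n:.0f}")
print(f"reported ratio: {reported_ratio:.2f}")
```

The expected ratio comes out near 4.74 and the reported ratio near 1.38, which is the discrepancy the question is about.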

Which of these two measures of the increase in statistical power is more appropriate? I mainly need it to calculate the expected increase in R2 and to use the MTAG sumstats in further analyses.

Thanks in advance for your help. The log file is attached below.

2021/12/16/03:47:14 PM
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
<>
<> MTAG: Multi-trait Analysis of GWAS
<> Version: 1.0.8
<> (C) 2017 Omeed Maghzian, Raymond Walters, and Patrick Turley
<> Harvard University Department of Economics / Broad Institute of MIT and Harvard
<> GNU General Public License v3
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
<> Note: It is recommended to run your own QC on the input before using this program.
<> Software-related correspondence: maghzian@nber.org
<> All other correspondence: paturley@broadinstitute.org
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Calling ./mtag.py \
--stream-stdout \
--sumstats TRAIT_1,TRAIT_2,TRAIT_3,TRAIT_4 \
--fdr \
--out ./ANALYSIS

2021/12/16/03:47:14 PM Beginning MTAG analysis...
2021/12/16/03:47:14 PM MTAG will use the Z column for analyses.
2021/12/16/03:47:30 PM Read in Trait 1 summary statistics (8557337 SNPs) from TRAIT_1 ...
2021/12/16/03:47:30 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:47:30 PM Munging Trait 1 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
2021/12/16/03:47:30 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:47:30 PM Interpreting column names as follows:
2021/12/16/03:47:30 PM snpid: Variant ID (e.g., rs number)
n: Sample size
a1: a1, interpreted as ref allele for signed sumstat.
pval: p-Value
a2: a2, interpreted as non-ref allele for signed sumstat.
z: Directional summary statistic as specified by --signed-sumstats.

2021/12/16/03:47:30 PM Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
2021/12/16/03:47:43 PM Read 8557337 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 0 SNPs with out-of-bounds p-values.
Removed 551192 variants that were not SNPs.
Note: strand ambiguous SNPs were not dropped.
8006145 SNPs remain.
2021/12/16/03:47:48 PM Removed 0 SNPs with duplicated rs numbers (8006145 SNPs remain).
2021/12/16/03:47:49 PM Removed 0 SNPs with N < 149818.257102 (8006145 SNPs remain).
2021/12/16/03:49:12 PM Median value of SIGNED_SUMSTAT was -0.00122699386503, which seems sensible.
2021/12/16/03:49:12 PM Dropping snps with null values
2021/12/16/03:49:13 PM Metadata:
2021/12/16/03:49:13 PM Mean chi^2 = 1.088
2021/12/16/03:49:14 PM Lambda GC = 1.007
2021/12/16/03:49:14 PM Max chi^2 = 458.046
2021/12/16/03:49:14 PM 2610 Genome-wide significant SNPs (some may have been removed by filtering).
2021/12/16/03:49:14 PM Conversion finished at Thu Dec 16 15:49:14 2021
2021/12/16/03:49:14 PM Total time elapsed: 1.0m:44.03s
2021/12/16/03:49:27 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:49:27 PM Munging of Trait 1 complete. SNPs remaining: 8006145
2021/12/16/03:49:27 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

2021/12/16/03:49:53 PM Read in Trait 2 summary statistics (6967088 SNPs) from TRAIT_2 ...
2021/12/16/03:49:53 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:49:53 PM Munging Trait 2 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
2021/12/16/03:49:53 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:49:53 PM Interpreting column names as follows:
2021/12/16/03:49:53 PM snpid: Variant ID (e.g., rs number)
n: Sample size
a1: a1, interpreted as ref allele for signed sumstat.
pval: p-Value
a2: a2, interpreted as non-ref allele for signed sumstat.
z: Directional summary statistic as specified by --signed-sumstats.

2021/12/16/03:49:53 PM Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
2021/12/16/03:50:03 PM Read 6967088 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 0 SNPs with out-of-bounds p-values.
Removed 0 variants that were not SNPs.
Note: strand ambiguous SNPs were not dropped.
6967088 SNPs remain.
2021/12/16/03:50:08 PM Removed 0 SNPs with duplicated rs numbers (6967088 SNPs remain).
2021/12/16/03:50:09 PM Removed 0 SNPs with N < 120050.470727 (6967088 SNPs remain).
2021/12/16/03:51:23 PM Median value of SIGNED_SUMSTAT was 0.00666666666667, which seems sensible.
2021/12/16/03:51:23 PM Dropping snps with null values
2021/12/16/03:51:24 PM Metadata:
2021/12/16/03:51:24 PM Mean chi^2 = 1.143
2021/12/16/03:51:24 PM Lambda GC = 1.114
2021/12/16/03:51:24 PM Max chi^2 = 83.718
2021/12/16/03:51:25 PM 298 Genome-wide significant SNPs (some may have been removed by filtering).
2021/12/16/03:51:25 PM Conversion finished at Thu Dec 16 15:51:25 2021
2021/12/16/03:51:25 PM Total time elapsed: 1.0m:31.77s
2021/12/16/03:51:36 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:51:36 PM Munging of Trait 2 complete. SNPs remaining: 6967088
2021/12/16/03:51:36 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

2021/12/16/03:52:00 PM Read in Trait 3 summary statistics (7087911 SNPs) from TRAIT_3 ...
2021/12/16/03:52:00 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:52:00 PM Munging Trait 3 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
2021/12/16/03:52:00 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:52:00 PM Interpreting column names as follows:
2021/12/16/03:52:00 PM snpid: Variant ID (e.g., rs number)
n: Sample size
a1: a1, interpreted as ref allele for signed sumstat.
pval: p-Value
a2: a2, interpreted as non-ref allele for signed sumstat.
z: Directional summary statistic as specified by --signed-sumstats.

2021/12/16/03:52:01 PM Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
2021/12/16/03:52:11 PM Read 7087911 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 0 SNPs with out-of-bounds p-values.
Removed 0 variants that were not SNPs.
Note: strand ambiguous SNPs were not dropped.
7087911 SNPs remain.
2021/12/16/03:52:17 PM Removed 30 SNPs with duplicated rs numbers (7087881 SNPs remain).
2021/12/16/03:52:18 PM Removed 0 SNPs with N < 497212.0 (7087881 SNPs remain).
2021/12/16/03:53:30 PM Median value of SIGNED_SUMSTAT was 0.00967742, which seems sensible.
2021/12/16/03:53:31 PM Dropping snps with null values
2021/12/16/03:53:31 PM Metadata:
2021/12/16/03:53:32 PM Mean chi^2 = 2.838
2021/12/16/03:53:32 PM Lambda GC = 1.916
2021/12/16/03:53:32 PM Max chi^2 = 626.746
2021/12/16/03:53:32 PM 77637 Genome-wide significant SNPs (some may have been removed by filtering).
2021/12/16/03:53:32 PM Conversion finished at Thu Dec 16 15:53:32 2021
2021/12/16/03:53:32 PM Total time elapsed: 1.0m:31.65s
2021/12/16/03:53:44 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:53:44 PM Munging of Trait 3 complete. SNPs remaining: 7087911
2021/12/16/03:53:44 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

2021/12/16/03:53:56 PM Trait 3: Dropped 30 SNPs for duplicate values in the "snp_name" column
2021/12/16/03:54:10 PM Read in Trait 4 summary statistics (8037281 SNPs) from TRAIT_4 ...
2021/12/16/03:54:10 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:54:10 PM Munging Trait 4 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
2021/12/16/03:54:10 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:54:10 PM Interpreting column names as follows:
2021/12/16/03:54:10 PM snpid: Variant ID (e.g., rs number)
n: Sample size
a1: a1, interpreted as ref allele for signed sumstat.
pval: p-Value
a2: a2, interpreted as non-ref allele for signed sumstat.
z: Directional summary statistic as specified by --signed-sumstats.

2021/12/16/03:54:10 PM Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
2021/12/16/03:54:14 PM WARNING: 1 SNPs had P outside of (0,1]. The P column may be mislabeled.
2021/12/16/03:54:22 PM Read 8037281 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 1 SNPs with out-of-bounds p-values.
Removed 0 variants that were not SNPs.
Note: strand ambiguous SNPs were not dropped.
8037280 SNPs remain.
2021/12/16/03:54:27 PM Removed 0 SNPs with duplicated rs numbers (8037280 SNPs remain).
2021/12/16/03:54:28 PM Removed 118477 SNPs with N < 38884.0 (7918803 SNPs remain).
2021/12/16/03:55:51 PM Median value of SIGNED_SUMSTAT was 0.00467653936087, which seems sensible.
2021/12/16/03:55:51 PM Dropping snps with null values
2021/12/16/03:55:52 PM Metadata:
2021/12/16/03:55:53 PM Mean chi^2 = 1.241
2021/12/16/03:55:53 PM Lambda GC = 1.146
2021/12/16/03:55:53 PM Max chi^2 = 1237.216
2021/12/16/03:55:53 PM 5884 Genome-wide significant SNPs (some may have been removed by filtering).
2021/12/16/03:55:53 PM Conversion finished at Thu Dec 16 15:55:53 2021
2021/12/16/03:55:53 PM Total time elapsed: 1.0m:42.96s
2021/12/16/03:56:05 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/03:56:05 PM Munging of Trait 4 complete. SNPs remaining: 7918803
2021/12/16/03:56:05 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

2021/12/16/03:56:24 PM Dropped 1227834 SNPs due to strand ambiguity, 6778311 SNPs remain in intersection after merging trait1
2021/12/16/03:56:44 PM Flipped the signs of of 2904828 SNPs to make them consistent with the effect allele orderings of the first trait.
2021/12/16/03:56:48 PM Dropped 0 SNPs due to strand ambiguity, 5812916 SNPs remain in intersection after merging trait2
2021/12/16/03:57:08 PM Flipped the signs of of 2808878 SNPs to make them consistent with the effect allele orderings of the first trait.
2021/12/16/03:57:12 PM Dropped 0 SNPs due to strand ambiguity, 5619935 SNPs remain in intersection after merging trait3
2021/12/16/03:57:36 PM Flipped the signs of of 171 SNPs to make them consistent with the effect allele orderings of the first trait.
2021/12/16/03:57:41 PM Dropped 0 SNPs due to strand ambiguity, 5589583 SNPs remain in intersection after merging trait4
2021/12/16/03:57:41 PM ... Merge of GWAS summary statistics complete. Number of SNPs: 5589583
2021/12/16/03:58:04 PM Using 5589583 SNPs to estimate Omega (0 SNPs excluded due to strand ambiguity)
2021/12/16/03:58:04 PM Estimating sigma..
2021/12/16/04:00:59 PM Checking for positive definiteness ..
2021/12/16/04:00:59 PM Sigma hat:
[[0.852 0.056 0.027 0.023]
 [0.056 1.006 0.02 0.001]
 [0.027 0.02 1.079 0.018]
 [0.023 0.001 0.018 1.149]]
2021/12/16/04:00:59 PM Mean chi^2 of SNPs used to estimate Omega is low for some SNPsMTAG may not perform well in this situation.
2021/12/16/04:00:59 PM Beginning estimation of Omega ...
2021/12/16/04:01:00 PM Using GMM estimator of Omega ..
2021/12/16/04:01:02 PM Checking for positive definiteness ..
2021/12/16/04:01:02 PM Completed estimation of Omega ...
2021/12/16/04:01:02 PM Beginning MTAG calculations...
2021/12/16/04:01:23 PM ... Completed MTAG calculations.
2021/12/16/04:01:23 PM Writing Phenotype 1 to file ...
2021/12/16/04:02:26 PM Writing Phenotype 2 to file ...
2021/12/16/04:03:27 PM Writing Phenotype 3 to file ...
2021/12/16/04:04:24 PM Writing Phenotype 4 to file ...
2021/12/16/04:05:22 PM Summary of MTAG results:

        Trait  # SNPs used  ...  MTAG mean chi^2  GWAS equiv. (max) N
1  ...TRAIT_1      5589583  ...            1.417               310368
2  ...TRAIT_2      5589583  ...            1.266               344208
3  ...TRAIT_3      5589583  ...            2.695               755133
4  ...TRAIT_4      5589583  ...            1.097                64413

[4 rows x 7 columns]

Estimated Omega:
[[1.147e-06 6.176e-07 5.720e-07 2.991e-07]
 [6.176e-07 7.785e-07 3.509e-07 8.516e-08]
 [5.720e-07 3.509e-07 2.453e-06 7.109e-10]
 [2.991e-07 8.516e-08 7.109e-10 1.807e-06]]

(Correlation):
[[1.000e+00 6.535e-01 3.410e-01 2.078e-01]
 [6.535e-01 1.000e+00 2.539e-01 7.180e-02]
 [3.410e-01 2.539e-01 1.000e+00 3.377e-04]
 [2.078e-01 7.180e-02 3.377e-04 1.000e+00]]

Estimated Sigma:
[[0.852 0.056 0.027 0.023]
 [0.056 1.006 0.02 0.001]
 [0.027 0.02 1.079 0.018]
 [0.023 0.001 0.018 1.149]]

(Correlation):
[[1. 0.06 0.028 0.023]
 [0.06 1. 0.019 0.001]
 [0.028 0.019 1. 0.016]
 [0.023 0.001 0.016 1. ]]

MTAG weight factors: (average across SNPs)
[1.297 1.264 1.051 1.431]

2021/12/16/04:05:22 PM
2021/12/16/04:05:22 PM MTAG results saved to file.
2021/12/16/04:05:22 PM Beginning maxFDR calculations. Depending on the number of grid points specified, this might take some time...
2021/12/16/04:05:22 PM T=4
2021/12/16/04:24:09 PM Number of gridpoints to search: 342148
2021/12/16/04:24:09 PM Performing grid search using 1 cores.
2021/12/16/04:32:40 PM Grid search: 10.0 percent finished for . Time: 8.502 min
2021/12/16/04:41:19 PM Grid search: 20.0 percent finished for . Time: 17.152 min
2021/12/16/04:50:14 PM Grid search: 30.0 percent finished for . Time: 26.067 min
2021/12/16/04:59:03 PM Grid search: 40.0 percent finished for . Time: 34.891 min
2021/12/16/05:07:44 PM Grid search: 50.0 percent finished for . Time: 43.582 min
2021/12/16/05:16:29 PM Grid search: 60.0 percent finished for . Time: 52.322 min
2021/12/16/05:25:15 PM Grid search: 70.0 percent finished for . Time: 61.094 min
2021/12/16/05:33:54 PM Grid search: 80.0 percent finished for . Time: 69.746 min
2021/12/16/05:43:58 PM Grid search: 90.0 percent finished for . Time: 79.806 min
2021/12/16/05:54:06 PM Grid search: 100.0 percent finished for . Time: 89.944 min
2021/12/16/05:54:17 PM Saved calculations of fdr over grid points in ./ANALYSIS_fdr_mat.txt
2021/12/16/05:54:17 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/05:54:17 PM grid point indices for max FDR for each trait: [ 11674 16146 70667 197260]
2021/12/16/05:54:17 PM Maximum FDR
2021/12/16/05:54:17 PM Max FDR of Trait 1: 0.0164459283169 at probs = [0. 0. 0. 0. 0. 0. 0. 0.2 0. 0. 0.2 0.1 0. 0.5 0. 0. ]
2021/12/16/05:54:17 PM Max FDR of Trait 2: 0.373400675419 at probs = [0. 0. 0. 0. 0. 0. 0.1 0.1 0. 0. 0. 0.3 0.5 0. 0. 0. ]
2021/12/16/05:54:17 PM Max FDR of Trait 3: 7.00435326062e-06 at probs = [0. 0. 0. 0.3 0. 0. 0.1 0. 0. 0. 0.1 0.1 0.3 0. 0. 0.1]
2021/12/16/05:54:17 PM Max FDR of Trait 4: 0.0945359459771 at probs = [0. 0.3 0. 0.1 0. 0. 0. 0. 0.1 0. 0. 0.1 0.3 0. 0. 0.1]
2021/12/16/05:54:17 PM <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
2021/12/16/05:54:17 PM Completed FDR calculations.
2021/12/16/05:54:17 PM MTAG complete. Time elapsed: 2.0h:7.0m:3.19360303879s

paturley commented 2 years ago

The summary table in your log file seems to be abbreviated: it omits the GWAS N and the original chi2 statistics that are used in the GWAS-equiv N calculation, so it's difficult to see what is going on.

pjordab commented 2 years ago

It is the complete log file; I can send you the original one by mail if you prefer. I only modified the sumstats names.

The GWAS n is: 224727
GWAS equiv. (max) N: 310368
Trait 1 chi2: 1.088
MTAG trait 1 chi2: 1.417

All are in the log file except the GWAS original N.

Thanks again!!

xchangchop commented 2 years ago

Hi pjordab, I ran into exactly the same issue. The tutorial says "The GWAS mean chi^2 column is adjusted by the diagonal terms of Sigma." So 1.088 is the unadjusted chi2; the adjusted one is omitted from the results table. I just opened a new issue to report this. Perhaps you have already figured out how to see the full table?

-Xiao
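The effect of such an adjustment can be checked with a rough calculation. The sketch below assumes that "adjusted by the diagonal terms of Sigma" means dividing the GWAS mean chi^2 by the matching diagonal element of the estimated Sigma (0.852 for TRAIT_1), and that the GWAS-equivalent N scales with the ratio of excess mean chi^2 as in the original post. Neither assumption is confirmed in this thread, and the variable names are illustrative only.

```python
# Values for TRAIT_1 taken from the log and the posts above.
n_gwas = 224727
chi2_gwas_raw = 1.088      # unadjusted mean chi^2 from the munging step
chi2_mtag = 1.417          # "MTAG mean chi^2" from the summary table
sigma_11 = 0.852           # first diagonal element of "Estimated Sigma"
n_equiv_reported = 310368  # "GWAS equiv. (max) N" from the summary table

# ASSUMPTION: the adjustment is a division by the Sigma diagonal term.
chi2_gwas_adj = chi2_gwas_raw / sigma_11

# ASSUMPTION: N scales with the ratio of excess mean chi^2.
n_equiv_adj = n_gwas * (chi2_mtag - 1) / (chi2_gwas_adj - 1)

# Baseline mean chi^2 that the reported N would imply under the same formula.
chi2_gwas_implied = 1 + (chi2_mtag - 1) * n_gwas / n_equiv_reported

print(f"adjusted GWAS mean chi^2: {chi2_gwas_adj:.3f}")
print(f"GWAS-equivalent N with adjustment: {n_equiv_adj:.0f}")
print(f"baseline chi^2 implied by reported N: {chi2_gwas_implied:.3f}")
```

Under these assumptions the adjusted baseline is about 1.277 and the resulting N is roughly 338,000, much closer to the reported 310368 than to 1064899. The remaining gap might come from the summary table using only the 5589583 SNPs in the merged intersection rather than the 8006145 SNPs present at the munging step, but that is a guess that would need the full summary table to verify.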