JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0

ValueError: cannot convert float NaN to integer #117

Open Kai6662 opened 4 years ago

Kai6662 commented 4 years ago

Hi,

I got an error message when the MTAG analysis was almost finished. How can I fix it? Could you help me? Thank you.

2020/11/10/03:17:41 PM ... Completed MTAG calculations.
2020/11/10/03:17:45 PM Writing Phenotype 1 to file ...
2020/11/10/03:19:16 PM Writing Phenotype 2 to file ...
2020/11/10/03:20:43 PM Writing Phenotype 3 to file ...
2020/11/10/03:22:09 PM Writing Phenotype 4 to file ...
2020/11/10/03:23:35 PM Writing Phenotype 5 to file ...
2020/11/10/03:25:03 PM Writing Phenotype 6 to file ...
2020/11/10/03:26:28 PM Writing Phenotype 7 to file ...
2020/11/10/03:27:54 PM Writing Phenotype 8 to file ...
2020/11/10/03:29:20 PM Writing Phenotype 9 to file ...
2020/11/10/03:30:46 PM Writing Phenotype 10 to file ...
2020/11/10/03:32:14 PM Writing Phenotype 11 to file ...
2020/11/10/03:33:40 PM Writing Phenotype 12 to file ...
2020/11/10/03:35:06 PM Writing Phenotype 13 to file ...
2020/11/10/03:36:34 PM cannot convert float NaN to integer
Traceback (most recent call last):
  File "/hpc/dhl_ec/kcui/gwas/tools/mtag/mtag.py", line 1567, in <module>
    mtag(args)
  File "/hpc/dhl_ec/kcui/gwas/tools/mtag/mtag.py", line 1452, in mtag
    write_summary(args, Zs, N_raw, Fs, mtag_betas, mtag_se, mtag_factor)
  File "/hpc/dhl_ec/kcui/gwas/tools/mtag/mtag.py", line 938, in write_summary
    summary_df.loc[p+1, 'GWAS equiv. (max) N'] = int(summary_df.loc[p+1, 'N (max)']*(summary_df.loc[p+1, 'MTAG mean chi^2'] - 1) / (summary_df.loc[p+1, 'GWAS mean chi^2'] - 1))
ValueError: cannot convert float NaN to integer
2020/11/10/03:36:34 PM Analysis terminated from error at Tue Nov 10 15:36:34 2020
2020/11/10/03:36:34 PM Total time elapsed: 2.0h:41.0m:42.05s

Best regards, Kai
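The traceback ends in an `int(...)` call on a ratio of summary-table values. Below is a minimal sketch of that failure mode, not the actual MTAG code: the helper name `gwas_equiv_n` and its guard are hypothetical, and only the formula is taken from the traceback line. It shows that `int()` raises this exact `ValueError` when its argument is NaN, and one possible way to guard against it.

```python
import math


def gwas_equiv_n(n_max, mtag_mean_chi2, gwas_mean_chi2):
    """Hypothetical helper mirroring the expression in the traceback.

    Returns None instead of raising when the ratio is undefined.
    """
    denom = gwas_mean_chi2 - 1
    if denom == 0:
        # A GWAS mean chi^2 of exactly 1 makes the ratio undefined.
        return None
    ratio = n_max * (mtag_mean_chi2 - 1) / denom
    if not math.isfinite(ratio):
        # NaN or infinite inputs cannot be converted to an integer.
        return None
    return int(ratio)


# Reproduce the error message from the log with a bare NaN.
try:
    int(float("nan"))
except ValueError as err:
    message = str(err)

print(message)
print(gwas_equiv_n(1000, 1.5, 1.25))
print(gwas_equiv_n(1500, float("nan"), 1.05))
```

Note that when the GWAS mean chi^2 is below 1, the denominator is negative, so the ratio is finite but no longer meaningful as a sample size; the guard above does not cover that case.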

paturley commented 4 years ago

Hello Kai,

It looks like there may be a problem with your Phenotype 13 or 14. Can you verify that there are no NaN values in any of the columns of the GWAS files? If that isn't the problem, it may be that the mean chi2 of the GWAS is exactly 1. This value should be reported in the complete log file.

Best, Patrick
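Both checks suggested above can be scripted. The following is a rough sketch using only the standard library; the function name `summarize_sumstats`, the tab delimiter, and the column names `beta` and `se` are assumptions about the input file and are not part of MTAG. It counts missing-looking entries per column and computes the mean chi^2 as the mean of (beta/se)^2.

```python
import csv
import io
import math


def summarize_sumstats(handle, beta_col="beta", se_col="se"):
    """Count missing values per column and compute mean chi^2 from BETA/SE.

    Column names and the tab delimiter are assumptions about the file.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    fields = reader.fieldnames or []
    missing = {name: 0 for name in fields}
    chi2 = []
    for row in reader:
        for name in fields:
            value = row.get(name)
            if value is None or value.strip().lower() in ("", "na", "nan"):
                missing[name] += 1
        try:
            z = float(row[beta_col]) / float(row[se_col])
        except (KeyError, TypeError, ValueError, ZeroDivisionError):
            continue  # skip rows without a usable BETA/SE pair
        if math.isfinite(z):
            chi2.append(z * z)
    mean_chi2 = sum(chi2) / len(chi2) if chi2 else float("nan")
    return missing, mean_chi2


# Tiny made-up example; real files would be opened with open(path, newline="").
example = io.StringIO(
    "snpid\tbeta\tse\n"
    "rs1\t0.2\t0.1\n"
    "rs2\tNA\t0.1\n"
    "rs3\t-0.1\t0.1\n"
)
missing, mean_chi2 = summarize_sumstats(example)
print(missing, mean_chi2)
```

A mean chi^2 at or below 1 from this check would point to the second explanation rather than to missing values.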


xiaofeiyu1992 commented 3 years ago

Hi Patrick,

I received a similar error. I tried four highly correlated traits (let's call them A, B, C, and D). MTAG worked fine with two traits, such as A with B, C, or D. But it failed when I tried three traits, such as A, B, C or A, B, D. I checked, and there are no NaN values in the files. The problem may originate from the low mean chi2 statistics. If so, can I still apply MTAG, and can the results be trusted? My total sample size is around 1500.

Best, Xiaofei

The log file is below:

Calling ./mtag.py \
--force \
--stream-stdout \
--n-min 0.0 \
--ld-ref-panel ./LDscores/ \
--use-beta-se \
--sumstats A_trait.txt,B_trait.txt,C_trait.txt \
--out ./Ucirt_SL_WT

Beginning MTAG analysis...
MTAG will use the provided BETA/SE columns for analyses.
Read in Trait 1 summary statistics (51438 SNPs) from A_trait ...
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Munging Trait 1 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Interpreting column names as follows:
snpid: Variant ID (e.g., rs number)
n: Sample size
a1: a1, interpreted as ref allele for signed sumstat.
pval: p-Value
beta: Directional summary statistic as specified by --signed-sumstats.
a2: a2, interpreted as non-ref allele for signed sumstat.
se: Standard errors of BETA coefficients

Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
Read 51438 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 0 SNPs with out-of-bounds p-values.
Removed 0 variants that were not SNPs. Note: strand ambiguous SNPs were not dropped.
51438 SNPs remain.
Removed 0 SNPs with duplicated rs numbers (51438 SNPs remain).
Removed 0 SNPs with N < 0.0 (51438 SNPs remain).
Median value of SIGNED_SUMSTAT was 0.0001110725, which seems sensible.
Dropping snps with null values

Metadata:
Mean chi^2 = 0.992
WARNING: mean chi^2 may be too small.
Lambda GC = 0.958
Max chi^2 = 21.469
0 Genome-wide significant SNPs (some may have been removed by filtering).

<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Munging of Trait 1 complete. SNPs remaining: 51438
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Warning: The mean chi2 statistic of trait 1 is less 1.02 - MTAG estimates may be unstable.
Read in Trait 2 summary statistics (51438 SNPs) from B_trait.txt...
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Munging Trait 2 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Interpreting column names as follows:
snpid: Variant ID (e.g., rs number)
n: Sample size
a1: a1, interpreted as ref allele for signed sumstat.
pval: p-Value
beta: Directional summary statistic as specified by --signed-sumstats.
a2: a2, interpreted as non-ref allele for signed sumstat.
se: Standard errors of BETA coefficients

Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
Read 51438 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 0 SNPs with out-of-bounds p-values.
Removed 0 variants that were not SNPs. Note: strand ambiguous SNPs were not dropped.
51438 SNPs remain.
Removed 0 SNPs with duplicated rs numbers (51438 SNPs remain).
Removed 0 SNPs with N < 0.0 (51438 SNPs remain).
Median value of SIGNED_SUMSTAT was -1.45e-05, which seems sensible.
Dropping snps with null values

Metadata:
Mean chi^2 = 0.996
WARNING: mean chi^2 may be too small.
Lambda GC = 0.967
Max chi^2 = 23.928
0 Genome-wide significant SNPs (some may have been removed by filtering).

<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Munging of Trait 2 complete. SNPs remaining: 51438
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Warning: The mean chi2 statistic of trait 2 is less 1.02 - MTAG estimates may be unstable.
Read in Trait 3 summary statistics (51438 SNPs) from ,C_trait.txt...
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Munging Trait 3 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Interpreting column names as follows:
snpid: Variant ID (e.g., rs number)
n: Sample size
a1: a1, interpreted as ref allele for signed sumstat.
pval: p-Value
beta: Directional summary statistic as specified by --signed-sumstats.
a2: a2, interpreted as non-ref allele for signed sumstat.
se: Standard errors of BETA coefficients

Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
Read 51438 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 0 SNPs with out-of-bounds p-values.
Removed 0 variants that were not SNPs. Note: strand ambiguous SNPs were not dropped.
51438 SNPs remain.
Removed 0 SNPs with duplicated rs numbers (51438 SNPs remain).
Removed 0 SNPs with N < 0.0 (51438 SNPs remain).
Median value of SIGNED_SUMSTAT was -7.365e-05, which seems sensible.
Dropping snps with null values

Metadata:
Mean chi^2 = 0.988
WARNING: mean chi^2 may be too small.
Lambda GC = 1.0
Max chi^2 = 19.564
0 Genome-wide significant SNPs (some may have been removed by filtering).

<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Munging of Trait 3 complete. SNPs remaining: 51438
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Warning: The mean chi2 statistic of trait 3 is less 1.02 - MTAG estimates may be unstable.
Dropped 851 SNPs due to strand ambiguity, 50587 SNPs remain in intersection after merging trait1
Dropped 0 SNPs due to strand ambiguity, 50587 SNPs remain in intersection after merging trait2
Dropped 0 SNPs due to strand ambiguity, 50587 SNPs remain in intersection after merging trait3
... Merge of GWAS summary statistics complete. Number of SNPs: 50587
Using 50587 SNPs to estimate Omega (0 SNPs excluded due to strand ambiguity)
Estimating sigma..
Checking for positive definiteness ..
Sigma hat:
[[1.096 0.343 0.333]
 [0.343 1.121 0.995]
 [0.333 0.995 1.136]]
Mean chi^2 of SNPs used to estimate Omega is low for some SNPsMTAG may not perform well in this situation.
Beginning estimation of Omega ...
Using GMM estimator of Omega ..
Checking for positive definiteness ..
matrix is not positive definite, performing adjustment..
Warning: max number of iterations reached in adjustment procedure. Sigma matrix used is still non-positive-definite.
Completed estimation of Omega ...
Beginning MTAG calculations...
... Completed MTAG calculations.
Writing Phenotype 1 to file ...
Writing Phenotype 2 to file ...
Writing Phenotype 3 to file ...
cannot convert float NaN to integer
Traceback (most recent call last):
  File "./mtag/mtag.py", line 1567, in <module>
    mtag(args)
  File "./mtag/mtag.py", line 1452, in mtag
    write_summary(args, Zs, N_raw, Fs, mtag_betas, mtag_se, mtag_factor)
  File "./mtag/mtag.py", line 938, in write_summary
    summary_df.loc[p+1, 'GWAS equiv. (max) N'] = int(summary_df.loc[p+1, 'N (max)']*(summary_df.loc[p+1, 'MTAG mean chi^2'] - 1) / (summary_df.loc[p+1, 'GWAS mean chi^2'] - 1))
ValueError: cannot convert float NaN to integer

paturley commented 3 years ago

Sorry for the delayed response here. Had a grant due late last week.

Your traits 2 and 3 in that log have a really high LD score intercept. Any idea why that might be? Are the phenotypes they represent really highly correlated? Does it work when you do 3 traits but only include either trait 2 or trait 3 but not both?

It's also a bit strange that you have so few SNPs. Any thoughts on why your summary statistics have so few SNPs?
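The cross-trait value in question is the 0.995 entry between traits 2 and 3 in the "Sigma hat" matrix printed in the log. As a rough sketch, assuming the matrix entries exactly as printed there, the snippet below rescales Sigma hat to a correlation matrix and tests positive definiteness with a plain Cholesky factorization. The helper names are hypothetical, and this is not how MTAG itself performs its check.

```python
import math

# Sigma hat as printed in the log above.
sigma_hat = [
    [1.096, 0.343, 0.333],
    [0.343, 1.121, 0.995],
    [0.333, 0.995, 1.136],
]


def implied_correlations(matrix):
    """Rescale a covariance-like matrix so that its diagonal becomes 1."""
    n = len(matrix)
    return [
        [matrix[i][j] / math.sqrt(matrix[i][i] * matrix[j][j]) for j in range(n)]
        for i in range(n)
    ]


def is_positive_definite(matrix):
    """Try a Cholesky factorization; it succeeds only for positive definite input."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0:
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


corr = implied_correlations(sigma_hat)
print(round(corr[1][2], 3))             # traits 2 and 3, about 0.88
print(is_positive_definite(sigma_hat))  # True for the printed matrix
```

The printed Sigma hat passes this check, which is consistent with the log raising the non-positive-definite warning only during the Omega estimation step. Omega is not printed in the log, so it cannot be checked the same way here.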


xiaofeiyu1992 commented 3 years ago

Hi Paturley,

Thanks for your reply. All three traits are highly correlated, both phenotypically and genetically. The small number of SNPs is due to the genotyping method, which is a SNP array in my case. However, because the mean chi2 statistic is less than one, MTAG can't even produce valid standard errors in many cases. This method may not suit my dataset because of its low overall statistical power.

Best, Xiaofei


paturley commented 3 years ago

Yeah, you might be right. MTAG requires being able to get a good estimate of the heritability and genetic correlations, which it can't easily do if there is very high phenotypic correlation (along with perfect sample overlap) and if you have very few SNPs.


<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Warning: The mean chi2 statistic of trait 2 is less 1.02 - MTAG estimates may be unstable. Read in Trait 3 summary statistics (51438 SNPs) from ,C_trait.txt...

<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Munging Trait 3 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><

<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Interpreting column names as follows: snpid: Variant ID (e.g., rs number) n: Sample size a1: a1, interpreted as ref allele for signed sumstat. pval: p-Value beta: Directional summary statistic as specified by --signed-sumstats. a2: a2, interpreted as non-ref allele for signed sumstat. se: Standard errors of BETA coefficients

Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time. Read 51438 SNPs from --sumstats file. Removed 0 SNPs with missing values. Removed 0 SNPs with INFO <= None. Removed 0 SNPs with MAF <= 0.01. Removed 0 SNPs with SE <0 or NaN values. Removed 0 SNPs with out-of-bounds p-values. Removed 0 variants that were not SNPs. Note: strand ambiguous SNPs were not dropped. 51438 SNPs remain. Removed 0 SNPs with duplicated rs numbers (51438 SNPs remain). Removed 0 SNPs with N < 0.0 (51438 SNPs remain). Median value of SIGNED_SUMSTAT was -7.365e-05, which seems sensible. Dropping snps with null values

Metadata: Mean chi^2 = 0.988 WARNING: mean chi^2 may be too small. Lambda GC = 1.0 Max chi^2 = 19.564 0 Genome-wide significant SNPs (some may have been removed by filtering).

<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Munging of Trait 3 complete. SNPs remaining: 51438

<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Warning: The mean chi2 statistic of trait 3 is less 1.02 - MTAG estimates may be unstable.
Dropped 851 SNPs due to strand ambiguity, 50587 SNPs remain in intersection after merging trait1
Dropped 0 SNPs due to strand ambiguity, 50587 SNPs remain in intersection after merging trait2
Dropped 0 SNPs due to strand ambiguity, 50587 SNPs remain in intersection after merging trait3
... Merge of GWAS summary statistics complete. Number of SNPs: 50587
Using 50587 SNPs to estimate Omega (0 SNPs excluded due to strand ambiguity)
Estimating sigma..
Checking for positive definiteness ..
Sigma hat:
[[1.096 0.343 0.333]
 [0.343 1.121 0.995]
 [0.333 0.995 1.136]]
Mean chi^2 of SNPs used to estimate Omega is low for some SNPsMTAG may not perform well in this situation.
Beginning estimation of Omega ...
Using GMM estimator of Omega ..
Checking for positive definiteness ..
matrix is not positive definite, performing adjustment..
Warning: max number of iterations reached in adjustment procedure. Sigma matrix used is still non-positive-definite.
Completed estimation of Omega ...
Beginning MTAG calculations...
... Completed MTAG calculations.
Writing Phenotype 1 to file ...
Writing Phenotype 2 to file ...
Writing Phenotype 3 to file ...
cannot convert float NaN to integer
Traceback (most recent call last):
  File "./mtag/mtag.py", line 1567, in mtag(args)
  File "./mtag/mtag.py", line 1452, in mtag
    write_summary(args, Zs, N_raw, Fs, mtag_betas, mtag_se, mtag_factor)
  File "./mtag/mtag.py", line 938, in write_summary
    summary_df.loc[p+1, 'GWAS equiv. (max) N'] = int(summary_df.loc[p+1, 'N (max)']*(summary_df.loc[p+1, 'MTAG mean chi^2'] - 1) / (summary_df.loc[p+1, 'GWAS mean chi^2'] - 1))
ValueError: cannot convert float NaN to integer
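For readers following the traceback, the failing line computes N (max) * (MTAG mean chi^2 - 1) / (GWAS mean chi^2 - 1) and passes the result to `int()`. The sketch below reproduces that failure mode in isolation. The helper `gwas_equiv_n` and its inputs are hypothetical and are not part of mtag.py. It only shows that a NaN input (presumably what MTAG produces after the non-positive-definite warning) or a GWAS mean chi^2 of exactly 1 makes the conversion impossible.

```python
import math


def gwas_equiv_n(n_max, mtag_mean_chi2, gwas_mean_chi2):
    """Hypothetical guarded version of the ratio shown in the traceback.

    Returns None instead of raising when the result is not a finite number.
    """
    denom = gwas_mean_chi2 - 1.0
    if denom == 0.0:
        # A GWAS mean chi^2 of exactly 1 would divide by zero.
        return None
    value = n_max * (mtag_mean_chi2 - 1.0) / denom
    if not math.isfinite(value):
        # NaN or infinity cannot be converted to an integer.
        return None
    return int(value)


# Plain Python raises the same message seen in the log.
try:
    int(float("nan"))
except ValueError as err:
    print(err)

# A NaN MTAG mean chi^2 is reported as None rather than crashing.
print(gwas_equiv_n(10000, float("nan"), 0.992))
```

Note that in the log above all three GWAS mean chi^2 values are below 1, so even with finite inputs the denominator would be negative and the resulting "equivalent N" would not be meaningful.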


JHahaWang commented 2 years ago

Hi Paturley, I think you indicated in your previous reply that too few SNPs or too small a sample size will affect the accuracy of MTAG's estimates. I want to use MTAG to analyze hundreds or thousands of SNPs. The results can be output, but can their accuracy be guaranteed?

paturley commented 2 years ago

I've not tested the robustness of MTAG with fewer than hundreds of thousands of SNPs, so I'm not sure what the threshold is before MTAG results are reliable. One thing I might recommend is sending your SNPs through LDSC and looking at the standard error on the genetic correlation estimate. If the SE is small there, then you are likely OK.
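One way to judge whether the LDSC standard error is "small" is to look at the interval it implies around the genetic correlation estimate. The sketch below is only an illustration: the function name `rg_confidence_interval`, the example numbers, and the use of a normal-approximation 95% interval are assumptions. Neither MTAG nor this thread defines a specific threshold.

```python
def rg_confidence_interval(rg, se, z=1.96):
    """Return an approximate (low, high) interval for a genetic correlation.

    rg and se are assumed to be copied by hand from an LDSC run;
    z=1.96 gives a normal-approximation 95% interval.
    """
    if se < 0:
        raise ValueError("standard error must be non-negative")
    return (rg - z * se, rg + z * se)


# Hypothetical values, not taken from any real analysis.
low, high = rg_confidence_interval(0.85, 0.05)
print(f"rg interval: [{low:.3f}, {high:.3f}]")
```

A wide interval, or one that extends well beyond the range from -1 to 1, would suggest the estimate is too noisy to rely on.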


JHahaWang commented 2 years ago

Thanks for your reply. I will give it a try as you suggested.