Status: Open. shreya2031 opened this issue 7 months ago.
Have you checked how the effect allele is defined in your summary data? The effect alleles may be mismatched between the GWAS summary data and the LD reference. In the ma format, A1 is the effect allele (see https://yanglab.westlake.edu.cn/software/gcta/#COJO).
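One way to test this is to compare, SNP by SNP, the frequency reported in the summary file with the frequency of the same allele in the reference panel. The sketch below is illustrative only. The function names, the 0.2 tolerance, and the example values are assumptions rather than GCTA's internal check, and strand flips (A/G reported as T/C) are not handled.

```python
from collections import Counter


def classify_frequency(gwas_a1, gwas_a2, gwas_freq,
                       ref_a1, ref_a2, ref_freq, max_diff=0.2):
    """Classify one SNP by comparing GWAS and reference allele frequencies.

    gwas_freq is assumed to be the frequency of gwas_a1 (the effect allele),
    and ref_freq the frequency of ref_a1. max_diff is a hypothetical tolerance.
    """
    g1, g2 = gwas_a1.upper(), gwas_a2.upper()
    r1, r2 = ref_a1.upper(), ref_a2.upper()

    # The two datasets must describe the same pair of alleles.
    if {g1, g2} != {r1, r2}:
        return "allele_mismatch"

    # Express the reference frequency in terms of the GWAS A1 allele.
    aligned_ref = ref_freq if g1 == r1 else 1.0 - ref_freq

    if abs(gwas_freq - aligned_ref) <= max_diff:
        return "match"
    # If the GWAS frequency fits the *other* allele, the freq column
    # (or the A1/A2 columns) was probably defined the other way round.
    if abs(gwas_freq - (1.0 - aligned_ref)) <= max_diff:
        return "swapped"
    return "discordant"


def summarize(rows):
    """Count classifications over an iterable of 6-tuples (see above)."""
    return Counter(classify_frequency(*row) for row in rows)


# Made-up rows for illustration: (A1, A2, freq, ref A1, ref A2, ref freq)
example_rows = [
    ("A", "G", 0.30, "A", "G", 0.32),  # consistent
    ("A", "G", 0.10, "A", "G", 0.85),  # fits 1 - freq, so likely swapped
    ("A", "G", 0.50, "A", "G", 0.90),  # fits neither orientation
    ("A", "G", 0.30, "A", "C", 0.30),  # different allele pair
]
print(summarize(example_rows))
```

If most SNPs come out as "swapped", the frequency or allele columns were probably defined relative to the non-effect allele. If most are "discordant", the cause is more likely elsewhere, for example a difference between the GWAS sample and the reference sample.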
Hi @anglixue,
I have been trying to run mtCOJO with GCTA v1.94.1 on one trait (GWAS summary statistics: https://figshare.com/articles/dataset/scz2022/19426775?file=34517828) while adjusting for another trait (GWAS summary statistics: https://conservancy.umn.edu/handle/11299/241912, filename: GSCAN_CigDay_2022_GWAS_SUMMARY_STATS_EUR.txt.gz). This is the error message I received:
Error: there are too many SNPs that have large difference in allele frequency. Please check the GWAS summary data. An error occurs, please check the options or data
This is the command I used:
./gcta --bfile /home/1000G/ALL.chr1-22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes --mtcojo-file data_list_scz_cigday.txt --ref-ld-chr /home/gcta/eur_w_ld_chr/ --w-ld-chr /home/gcta/eur_w_ld_chr/ --out mtcojo_scz_cigday
This is the log file:
Analysis started at 16:32:44 PST on Mon Nov 27 2023.
Hostname: tscc-4-60.sdsc.edu

Accepted options:
--bfile /home/1000G/ALL.chr1-22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes
--mtcojo-file data_list_scz_cigday.txt
--ref-ld-chr /home/gcta/eur_w_ld_chr/
--w-ld-chr /home/gcta/eur_w_ld_chr/
--out mtcojo_scz_cigday

Reading PLINK FAM file from [/home/1000G/ALL.chr1-22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.fam].
2504 individuals to be included from [/home/1000G/ALL.chr1-22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.fam].
Reading PLINK BIM file from [/home/1000G/ALL.chr1-22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.bim].
80845844 SNPs to be included from [/home/1000G/ALL.chr1-22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.bim].

Reading GWAS summary data from [data_list_scz_cigday.txt] ...
7341181 SNPs in common between the target trait and the covariate trait(s).
Filtering out SNPs with multiple alleles or missing value ...
864 SNPs have missing value or mismatched alleles. These SNPs have been saved in [mtcojo_scz_cigday.badsnps].
7340317 SNPs are retained after filtering.
There are 3888 genome-wide significant SNPs with p < 5.0e-08.

Reading PLINK BED file from [/home/1000G/ALL.chr1-22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.bed] in SNP-major format ...
Genotype data for 2504 individuals and 3888 SNPs to be included from [/home/1000G/ALL.chr1-22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.bed].
Calculating allele frequencies ...
Checking the difference in allele frequency between the GWAS summary datasets and the LD reference sample...
5478219 SNP(s) have large difference of allele frequency between the GWAS summary data and the reference sample. These SNPs have been saved in [mtcojo_scz_cigday.freq.badsnps].
Error: there are too many SNPs that have large difference in allele frequency. Please check the GWAS summary data.
An error occurs, please check the options or data
Could you please help me solve this issue?
Thanks!
Shreya
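To put the numbers in the log into perspective, the share of flagged SNPs can be computed directly from the two counts it reports. This is a minimal sketch: the helper name is made up, and using the retained-SNP count as the denominator is an assumption about how the proportion should be read, not a statement about GCTA's internal threshold.

```python
import re

# Two lines copied from the log above.
LOG_EXCERPT = """\
7340317 SNPs are retained after filtering.
5478219 SNP(s) have large difference of allele frequency between the GWAS summary data and the reference sample.
"""


def flagged_fraction(log_text):
    """Return flagged SNPs / retained SNPs, or None if either count is missing."""
    retained = re.search(r"(\d+) SNPs are retained after filtering", log_text)
    flagged = re.search(
        r"(\d+) SNP\(s\) have large difference of allele frequency", log_text
    )
    if retained is None or flagged is None:
        return None
    return int(flagged.group(1)) / int(retained.group(1))


fraction = flagged_fraction(LOG_EXCERPT)
print(f"{fraction:.1%} of retained SNPs were flagged")
```

A share this large (roughly three quarters of the retained SNPs) points to a systematic cause, such as the frequency column referring to the other allele or the reference sample differing from the GWAS sample, rather than a handful of problematic SNPs.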