EBISPOT / gwas-summary-statistics-standard

Documentation on new GWAS Summary Statistics Standard

Example sumstats format #24

Closed project-defiant closed 3 months ago

project-defiant commented 3 months ago

Hello, thank you for the great work on the GWAS Catalog. I have a question about the example GWAS-SSF file provided in this repository: https://github.com/EBISPOT/gwas-summary-statistics-standard/blob/master/examples/0000123.tsv

chromosome  base_pair_location  effect_allele  other_allele  beta        standard_error  effect_allele_frequency  p_value  variant_id       rsid         ref_allele
1           869388              A              G             -0.016619   0.00806496      0.997221                 0.1      1_869388_A_G     #NA          EA
1           205813916           G              C             -0.0089589  0.00331941      0.983589                 9.7E-03  1_205813916_G_C  rs74143855   EA
2           70478797            T              TG            0.0187528   0.00167685      0.934121                 3.5E-30  2_70478797_T_TG  rs142640435  EA
7           8458030             TC             T             -0.0184003  0.00101051      0.78451                  5.7E-76  7_8458030_TC_T   rs774624811  EA
23          24173186            A              C             0.00387762  0.08757958      0.627178                 2.3E-08  23_24173186_C_A  rs5949233    OA

The last column, ref_allele, has some unusual values such as EA and OA. Is this intentional, or is the column name a typo? I cannot find this column among the mandatory or encouraged fields in the article describing the format.

ljwh2 commented 3 months ago

Hi @project-defiant, thanks for pointing this out. The ref_allele column is defined in the specification (https://github.com/EBISPOT/gwas-summary-statistics-standard/blob/master/gwas-ssf_v1.0.0.pdf) and in the GWAS Catalog submission documentation (https://www.ebi.ac.uk/gwas/docs/summary-statistics-format), but we overlooked it in the manuscript. We will update the manuscript as soon as possible.
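
For readers who want to use the column in practice, here is a minimal Python sketch of how the ref_allele flag might be read. It rests on an assumption that should be checked against the specification linked above: that EA means the effect allele is the reference allele and OA means the other allele is. The helper name `reference_allele` is hypothetical, and the two rows are copied from the example file quoted in this issue (the first and last rows, trimmed to the relevant columns).

```python
import csv
import io

# Two rows from the example file, restricted to the columns used here.
HEADER = ["chromosome", "base_pair_location", "effect_allele",
          "other_allele", "variant_id", "ref_allele"]
ROWS = [
    ["1", "869388", "A", "G", "1_869388_A_G", "EA"],
    ["23", "24173186", "A", "C", "23_24173186_C_A", "OA"],
]
SAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"


def reference_allele(row):
    """Return the reference allele implied by the ref_allele flag.

    Assumption (verify against the spec): EA = effect allele is the
    reference, OA = other allele is the reference. Any other value is
    treated as unknown and yields None.
    """
    flag = row["ref_allele"]
    if flag == "EA":
        return row["effect_allele"]
    if flag == "OA":
        return row["other_allele"]
    return None


records = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
refs = [reference_allele(r) for r in records]

for record, ref in zip(records, refs):
    print(record["variant_id"], "reference allele:", ref)
```

In these two rows the derived allele matches the third field of variant_id (A in 1_869388_A_G, C in 23_24173186_C_A). That is an observation about these example rows only, not a statement of how variant_id is defined.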

project-defiant commented 3 months ago

I see, thank you for explaining it!