EBISPOT / goci

GWAS Catalog Ontology and Curation Infrastructure
Apache License 2.0

Bioinformatics task - calculating number of significant variants #1219

Closed ljwh2 closed 6 months ago

ljwh2 commented 6 months ago

I need to know how many significant associations are in some summary statistics files on the public FTP. They aren't in the standard format, so it's a little fiddly to do. There are ~450 files across 3 publications and one submission.

I need to know the total number of associations with p<1E-5.

PMID 35240056 has this header for 171 studies: variant_id p_value chromosome base_pair_location effect_allele frequency_duplication frequency_deletion beta

PMID 36203093 has this header for 4 studies: Locus_id p_value

PMID 36779085 has this header for 78 studies: SNP CHR BP P start end

The one without a PMID is https://www.ebi.ac.uk/gwas/deposition/submission/655c68cf788f000001be4398, which covers GCST90297568-GCST90297771. It has this header for 204 studies: chromosome base_pair_location effect_allele other_allele beta standard_error effect_allele_frequency p_value rs_id ci_lower ci_upper n
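
For reference, here is a rough sketch of how the count could be done across all four header layouts. It is not necessarily how the task was actually carried out. It assumes the files are whitespace- or tab-delimited with no empty fields, that the p-value column is named either p_value or P as in the headers above, and that files may be gzipped. The function names count_significant and count_file are made up for illustration.

```python
import gzip
from typing import Iterable

# P-value column names taken from the four headers listed above.
P_COLUMNS = ("p_value", "P")


def count_significant(lines: Iterable[str], threshold: float = 1e-5) -> int:
    """Count data rows whose p-value is strictly below `threshold`.

    Assumes the first line is a header and that fields are separated by
    whitespace (tabs or spaces) with no empty fields.
    """
    rows = iter(lines)
    header = next(rows).split()
    p_idx = None
    for i, name in enumerate(header):
        if name in P_COLUMNS:
            p_idx = i
            break
    if p_idx is None:
        raise ValueError(f"no p-value column found in header: {header}")

    count = 0
    for line in rows:
        fields = line.split()
        try:
            if float(fields[p_idx]) < threshold:
                count += 1
        except (IndexError, ValueError):
            # Skip short rows and non-numeric p-values such as "NA".
            continue
    return count


def count_file(path: str, threshold: float = 1e-5) -> int:
    """Open a plain or gzipped summary statistics file and count hits."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_significant(handle, threshold)
```

Summing count_file over every file in the four groups would give the total. The comparison is strict, so a row with p exactly equal to 1E-5 is not counted.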

jiyue1214 commented 6 months ago

Finished. Please see the attached file.