EBISPOT / xgwas-curator-tasks

An internal repo for GWAS curators to track issues

Split studies and summary statistics for PMID:30104761 #26

Open eks-ebi opened 11 months ago

eks-ebi commented 11 months ago

This publication was curated some time ago, and split into 4 studies because associations were only reported in the paper for these 4 studies. However, summary statistics were later obtained manually from Open Targets and added.

Three studies correctly have only the one set of summary statistics matching the trait analysed in that study:

However, for the remaining study (GCST008370 = coronary artery disease), the FTP folder contains up to 1,403 summary statistics files for many different traits which were also analysed as part of the project. The plan was to split these into separate entries once the deposition app was in production, but this has not yet been done.

Tasks:

earlEBI commented 6 months ago

The original GCSTs will also need to be kept, which will require developer time well before a data release is due to start.

earlEBI commented 6 months ago

@ljwh2 I can't find a key anywhere in the paper that explains which trait each sumstats filename relates to. I did find this file online. Do you think I can use it for mapping filenames to traits and for the case/control sample numbers? https://github.com/opentargets/genetics-sumstat-data/blob/master/extras/prepare_uk_biobank_gwas_catalog/SAIGE/1_stream_to_gcs/manifest/SAIGE_UKBiobank_URL_Nov2017_v3.tsv

Our sumstats files: http://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST008001-GCST009000/GCST008370/?C=N;O=D
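If the manifest is used, the mapping could be sketched roughly as below. This is only a sketch under assumptions: the manifest column names are not known from this ticket, so they are passed in as parameters, and the demo rows, URL, and file names are made up for illustration. It indexes manifest rows by the file name at the end of each download URL, then lists any FTP file names that have no manifest entry.

```python
import csv
import io
import posixpath
from urllib.parse import urlparse


def load_manifest(handle, url_col, trait_col, cases_col, controls_col):
    """Index manifest rows by the file name at the end of each download URL.

    The four column names are hypothetical parameters; check them against
    the real manifest header before use.
    """
    index = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        name = posixpath.basename(urlparse(row[url_col]).path)
        index[name] = {
            "trait": row[trait_col],
            "cases": int(row[cases_col]),
            "controls": int(row[controls_col]),
        }
    return index


def unmatched(ftp_names, index):
    """Return FTP file names that have no entry in the manifest index."""
    return sorted(name for name in ftp_names if name not in index)


# Made-up demo data, NOT taken from the real manifest.
demo_manifest = io.StringIO(
    "url\ttrait\tcases\tcontrols\n"
    "https://example.org/data/example_trait_A.vcf.gz\tExample trait A\t10\t90\n"
)
demo_index = load_manifest(demo_manifest, "url", "trait", "cases", "controls")
demo_missing = unmatched(
    ["example_trait_A.vcf.gz", "example_trait_B.vcf.gz"], demo_index
)
```

Anything returned by `unmatched` would need to be resolved by hand before the new studies are submitted.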

earlEBI commented 4 months ago

This PMID is currently closed for submission because it was previously curated through the Curation App. The status needs to be changed to 'open for submission' well before a data release (DR) starts, so that I can submit the new studies through the submission app.

earlEBI commented 4 months ago

To do:

I think I'll keep all the original files (.gz and .tbi) in the GCST008370 directory for provenance. They will keep their raw names so they can be told apart from the newly submitted files.

earlEBI commented 3 months ago

This ticket can be closed @ljwh2

ljwh2 commented 1 month ago

We previously agreed not to reformat these files because the effect allele was not clear. The author has now confirmed (via gwas-info) that ALT=effect, so we could format them to GWAS-SSF and harmonise. Re-opening the ticket.
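For reference, the allele assignment could be sketched as below. The output keys are the standard GWAS-SSF field names. The input column names (`CHROM`, `POS`, `REF`, `ALT`, `beta`, `sebeta`, `af`, `pval`) are an assumption about the raw files, as is `af` being the ALT allele frequency, so both should be checked against the real headers. The example row is made up.

```python
def to_gwas_ssf(row):
    """Map one raw sumstats row (a dict of strings) to GWAS-SSF fields.

    ALT is treated as the effect allele, as confirmed by the author.
    Input column names are assumed, not verified against the real files.
    """
    return {
        "chromosome": row["CHROM"],
        "base_pair_location": int(row["POS"]),
        "effect_allele": row["ALT"],   # ALT = effect allele
        "other_allele": row["REF"],
        "beta": float(row["beta"]),
        "standard_error": float(row["sebeta"]),
        # Assumes 'af' is the frequency of the ALT allele.
        "effect_allele_frequency": float(row["af"]),
        "p_value": float(row["pval"]),
    }


# Made-up example row, for illustration only.
example_row = {
    "CHROM": "1", "POS": "12345", "REF": "A", "ALT": "G",
    "beta": "0.5", "sebeta": "0.25", "af": "0.125", "pval": "0.05",
}
example_ssf = to_gwas_ssf(example_row)
```

Because ALT is the effect allele, beta and the allele frequency carry over unchanged; no sign flip is needed.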