EBISPOT / gwas-sumstats-harmoniser

GWAS Summary Statistics Data Harmonisation

pipeline reads build from metadata #49

Closed jdhayhurst closed 1 year ago

jdhayhurst commented 1 year ago

Enable the pipeline to read the genome assembly from a metadata file

jdhayhurst commented 1 year ago

The build will be parsed from the metadata file instead of from the filename. Example files to follow.

jdhayhurst commented 1 year ago

Example metadata file: GCST90012345.tsv.gz-meta.yaml

GWASCatalogAPI: "https://www.ebi.ac.uk/gwas/rest/api/studies/GCST90012345"
GWASID: GCST90012345
authorNotes: "author provided text"
dataFileMd5sum: 0b6d30452027a49fad0c86b77ab4477c
dataFileName: GCST90012345.tsv.gz
fileType: "GWAS-SFF v0.1"
genomeAssembly: GRCh37
genotypingTechnology: 
  - "Genome-wide genotyping array"
sampleAncestry: 
  - European
  - "East Asian"
sampleSize: 18096
traitDescription: 
  - "myocardial fractal dimension slice 6"
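
As a rough illustration of reading the build from a file like the one above, here is a minimal Python sketch. It is not the pipeline's actual code: the function names `read_genome_assembly` and `build_from_metadata` and the `ASSEMBLY_TO_BUILD` mapping are hypothetical, and the line scan only handles simple top-level `key: value` entries rather than full YAML. A real implementation would more likely use a proper YAML parser.

```python
from pathlib import Path

# Hypothetical mapping from assembly name to a build number string;
# the values the pipeline actually expects are an assumption here.
ASSEMBLY_TO_BUILD = {"GRCh37": "37", "GRCh38": "38"}


def read_genome_assembly(meta_path):
    """Return the top-level genomeAssembly value from a -meta.yaml file, or None.

    Simplified scan, not a YAML parser: it ignores indented lines,
    list items and comment lines, and does not handle inline comments.
    """
    for line in Path(meta_path).read_text(encoding="utf-8").splitlines():
        # Skip nested content, list items and comments; only top-level keys matter.
        if line.startswith((" ", "\t", "-", "#")):
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip() == "genomeAssembly":
            # Drop surrounding whitespace and optional quotes.
            value = value.strip().strip("\"'")
            return value or None
    return None


def build_from_metadata(meta_path):
    """Translate the genomeAssembly field into a build string (hypothetical helper)."""
    assembly = read_genome_assembly(meta_path)
    if assembly not in ASSEMBLY_TO_BUILD:
        raise ValueError(f"Unsupported or missing genomeAssembly: {assembly!r}")
    return ASSEMBLY_TO_BUILD[assembly]
```

With the example file above, `read_genome_assembly("GCST90012345.tsv.gz-meta.yaml")` would pick up the `genomeAssembly: GRCh37` line. Raising on a missing or unrecognised assembly, rather than falling back to the filename, is a design choice of this sketch only.
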

jiyue1214 commented 1 year ago

@jiyue1214 Make the original YAML file available in the QC step.