EBISPOT / goci

GWAS Catalog Ontology and Curation Infrastructure
Apache License 2.0

New version of studies download #1112

Closed ljwh2 closed 10 months ago

ljwh2 commented 1 year ago

As a user, I want access to cohort and summary statistics location data via the studies downloads, available from this page: https://www.ebi.ac.uk/gwas/docs/file-downloads

Note: cohort data is entered by curators and submitters via the submission template and is imported into the curation app, e.g. on this page: https://www.ebi.ac.uk/gwas/curation/studies/106775325

There should be two new versions: All studies v1.0.2.1 and All studies v1.0.3.1.

All studies v1.0.2.1

Same as All studies v1.0.2, with three additional columns:

- “COHORT”, containing cohort identifiers. For consistency with other fields, the separator should be a comma, but it could be a pipe (this is how it is entered in the submission template) if a comma is difficult to implement.
- “FULL SUMMARY STATISTICS”, indicating whether the study has summary statistics. This can be obtained from the fullpvalueset flag.
- “SUMMARY STATS LOCATION”, containing the FTP path to the summary statistics.
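A minimal sketch of how the cohort separator and the summary statistics flag described above might be handled. The function names, the example cohort identifiers, and the "yes"/"no" output values are assumptions made for illustration; they are not taken from the goci code base.

```python
def normalise_cohorts(raw: str) -> str:
    """Convert pipe-separated cohort identifiers (as entered in the
    submission template) into a comma-separated string."""
    parts = [part.strip() for part in raw.split("|")]
    # Drop empty fragments so stray pipes do not produce empty entries.
    return ",".join(part for part in parts if part)


def full_sumstats_label(full_pvalue_set: bool) -> str:
    """Map the fullpvalueset flag to a column value.
    The "yes"/"no" wording is an assumption."""
    return "yes" if full_pvalue_set else "no"


# Hypothetical cohort identifiers, for illustration only.
print(normalise_cohorts("UKB|FINNGEN"))
print(full_sumstats_label(True))
```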

All studies v1.0.3.1

Same as All studies v1.0.3, with two additional columns: “COHORT” and “FULL SUMMARY STATISTICS”. This file already has the “SUMMARY STATS LOCATION” column, but it is not currently populated; it should be populated. (Note: the “submission date” field is also not currently populated. If easy to do, it could be populated as well.)

The new fields need to be added to Solr, if not already present, and included in the DR.

ala-ebi commented 12 months ago

- empty cohort: leave blank
- empty ftplink: NA
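The empty-value rule above could be sketched as follows. The helper names and the example path are hypothetical, and the input types (a list of cohort identifiers, an optional FTP link) are assumptions about how the values arrive.

```python
from typing import Optional, Sequence


def cohort_cell(cohorts: Optional[Sequence[str]]) -> str:
    """Return the COHORT cell: comma-separated, or blank when empty."""
    if not cohorts:
        return ""
    return ",".join(cohorts)


def ftp_link_cell(link: Optional[str]) -> str:
    """Return the SUMMARY STATS LOCATION cell: the link, or NA when empty."""
    if link is None or not link.strip():
        return "NA"
    return link.strip()


# Hypothetical values, for illustration only.
row = [cohort_cell([]), ftp_link_cell(None)]
print("\t".join(row))
```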

ala-ebi commented 11 months ago

All studies v1.0.3.1: instead of fixing "SUMMARY STATS LOCATION" and adding two fields, I removed the existing "SUMMARY STATS LOCATION" from the middle and added it to the end of the file so it's consistent with v1.0.2.1 and we can reuse the same code block.
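As a rough illustration of the approach described above (not the actual goci implementation), the header for both new versions could be built by one shared function that strips the three columns from wherever they appear and appends them at the end. The base headers shown are shortened, illustrative placeholders rather than the real file headers.

```python
from typing import List

# Columns that both new versions end with, in the same order.
TRAILING_COLUMNS = ["COHORT", "FULL SUMMARY STATISTICS", "SUMMARY STATS LOCATION"]


def extend_header(base_header: List[str]) -> List[str]:
    """Remove any trailing-set column found in the base header and
    append the full trailing set at the end."""
    kept = [col for col in base_header if col not in TRAILING_COLUMNS]
    return kept + TRAILING_COLUMNS


# Shortened placeholder headers, for illustration only.
v1_0_2 = ["PUBMEDID", "STUDY ACCESSION"]
v1_0_3 = ["PUBMEDID", "SUMMARY STATS LOCATION", "STUDY ACCESSION"]

print(extend_header(v1_0_2))
print(extend_header(v1_0_3))
```

With this layout, the same row-writing code can serve both files, because the last three columns are always in the same positions relative to the end of the row.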

ala-ebi commented 11 months ago

@ljwh2 or @earlEBI, please update the documentation accordingly: https://www.ebi.ac.uk/gwas/docs/fileheaders

ljwh2 commented 11 months ago

Final list of files is here: https://docs.google.com/spreadsheets/d/1lMkVVueCb_-_Csdg57lLBJG5hdXeDLNESfvUjDqv7NA/edit#gid=0

ala-ebi commented 11 months ago

Unpublished studies' cohorts are not stored in STUDY_EXTENSION and should be handled separately.