EHDEN / ETL-UK-Biobank

ETL UK-Biobank
https://ehden.github.io/ETL-UK-Biobank/

Load UKB baseline data dictionary in database #20

Closed MaximMoinat closed 3 years ago

MaximMoinat commented 3 years ago

The UKB baseline fields are well documented, and the data dictionary can be retrieved programmatically as txt or xml files (https://biobank.ctsu.ox.ac.uk/crystal/schema.cgi).

It could be beneficial to write a script that loads the data dictionary into its own schema in the target database. The dictionary can then be used to populate, for example, care sites, and to look up descriptions of fields and values. Because it is retrieved directly from the published schemas, this information is easy to refresh (e.g. when more assessment centers are added or their names are updated).
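
A rough sketch of what such a loader could look like, not a finished implementation. It assumes each schema file is tab-separated text with a header row, and it uses an in-memory SQLite database as a stand-in for the target database. The function name `load_dictionary_table`, the table name `ukb_dict_encoding_values`, the column names and the sample rows are all made up for illustration; the real files would be downloaded from the schema page linked above.

```python
import csv
import io
import sqlite3


def load_dictionary_table(conn, table_name, tsv_text):
    """Load one tab-separated schema file (header row first) into a table.

    Hypothetical helper: every column is stored as TEXT, and the table is
    dropped and recreated so that a refresh replaces stale rows.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    header = next(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn.execute(f'DROP TABLE IF EXISTS "{table_name}"')
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')

    # Skip blank or malformed lines that do not match the header width.
    rows = [row for row in reader if len(row) == len(header)]
    conn.executemany(
        f'INSERT INTO "{table_name}" VALUES ({placeholders})', rows
    )
    conn.commit()
    return len(rows)


# Illustrative sample only; not real UKB dictionary content.
sample = (
    "encoding_id\tvalue\tmeaning\n"
    "1\t100\tExample Centre A\n"
    "1\t200\tExample Centre B\n"
)

conn = sqlite3.connect(":memory:")
n_loaded = load_dictionary_table(conn, "ukb_dict_encoding_values", sample)
```

Dropping and recreating the table on every run keeps the refresh step simple: re-download the schema files and run the loader again.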

MaximMoinat commented 3 years ago

Linked to #60

MaximMoinat commented 3 years ago

The UKB vocabulary from Athena has been added to the requirements, and the baseline mapping uses it for source_concept_id.