rmgpanw / ukbwranglr

R package for UK Biobank data wrangling.

https://rmgpanw.github.io/ukbwranglr/

ukbwranglr

Overview

The goal of ukbwranglr is to facilitate analysing UK Biobank data, including:

- Reading a selection of UK Biobank variables into R.
- Summarising repeated continuous variable measurements.[^1]
- Extracting phenotypic outcomes of interest from clinical events data.[^2]

Installation

You can install the development version of ukbwranglr with:

```r
# install.packages("devtools")
devtools::install_github("rmgpanw/ukbwranglr")
```

Basic workflow

The basic workflow is as follows:

1. Create a data dictionary for your main UK Biobank dataset with make_data_dict().
2. Read selected variables into R with read_ukb().
3. Summarise continuous variables with summarise_numerical_variables().
4. Tidy clinical events data with tidy_clinical_events() or make_clinical_events_db(), and extract outcomes of interest with extract_phenotypes().
5. Analyse.

Please see vignette('ukbwranglr') for further details.
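
To make the summarising and phenotype-extraction steps more concrete, here is a minimal base R sketch of the two ideas described in the footnotes: averaging repeated BMI measurements, and flagging participants with a hypertension code in clinical events data. It does not call ukbwranglr and is not how the package implements these steps. The data frames, column names (`bmi_visit_0`, `source`, `code`, `date`) and codes (`HTN1`, `HTN2`, `OTHER1`) are all made up for illustration.

```r
# Hypothetical toy data: one row per participant, with BMI recorded at up to
# three visits. Column names are invented for this sketch.
ukb_main <- data.frame(
  eid = c(1, 2, 3),
  bmi_visit_0 = c(24.0, 31.0, NA),
  bmi_visit_1 = c(26.0, NA, NA),
  bmi_visit_2 = c(NA, 29.0, NA)
)
bmi_cols <- c("bmi_visit_0", "bmi_visit_1", "bmi_visit_2")

# Summarise repeated measurements: row-wise mean, ignoring missing visits.
ukb_main$bmi_mean <- rowMeans(ukb_main[, bmi_cols], na.rm = TRUE)
# rowMeans() returns NaN when every visit is missing; store that as NA.
ukb_main$bmi_mean[is.nan(ukb_main$bmi_mean)] <- NA

# Hypothetical clinical events in long format: one row per recorded code.
# The codes are placeholders, not real clinical codes.
clinical_events <- data.frame(
  eid = c(1, 1, 2, 3),
  source = c("primary_care", "hospital", "hospital", "primary_care"),
  code = c("HTN1", "HTN2", "OTHER1", "HTN1"),
  date = as.Date(c("2010-05-01", "2009-02-10", "2008-03-20", "2011-07-04"))
)

# Placeholder code list defining the phenotype of interest.
hypertension_codes <- c("HTN1", "HTN2")

# Keep matching events, then take the earliest event per participant.
htn_events <- clinical_events[clinical_events$code %in% hypertension_codes, ]
htn_events <- htn_events[order(htn_events$eid, htn_events$date), ]
first_htn <- htn_events[!duplicated(htn_events$eid), c("eid", "date")]
names(first_htn)[names(first_htn) == "date"] <- "hypertension_first_date"

# Join back to the main dataset; participants without a match get NA.
result <- merge(
  ukb_main[, c("eid", "bmi_mean")],
  first_htn,
  by = "eid",
  all.x = TRUE
)
result$hypertension <- !is.na(result$hypertension_first_date)

print(result)
```

The sketch keeps one row per participant in `result`, so the summarised measurement and the derived outcome can be analysed together. With real UK Biobank data, the package functions listed in the workflow are intended to handle these steps.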

[^1]: For example, calculating a mean/minimum/maximum body mass index (BMI) from repeated BMI measurements.

[^2]: For example, identifying participants with a diagnosis of hypertension from linked primary and secondary health care records.