Oxford International Health and Tropical Medicine Hackathon 2024

This repository contains instructions, data and code for the University of Oxford MSc in International Health and Tropical Medicine Hackathon 2024. The Hackathon 2024 event is part of the MSc course’s lecture series on Open Science and Reproducible Research in R.

Motivation

Hackathon 2024 caps the students’ introduction to Open Science and Reproducible Research in R with a real global health project in which they serve as researchers/data scientists. The exercise gives the students a platform to apply the skills in working with data using R that they have been learning and practising for about the past 6 weeks, while also exposing them to a collaborative team environment.

Format

Hackathon 2024 is structured as a problem-based learning (PBL) exercise, a format the students already know from similar approaches used in other lectures. Briefly, a PBL exercise presents the problem first, rather than teaching the relevant material and then having students apply that knowledge to solve the problem. Whilst the previous lectures in the Open Science and Reproducible Research in R series have provided foundational skills in R, the PBL approach for this hackathon will challenge the students to explore and learn more of the extensive functionality R offers in order to solve the problem/s they have been given. This PBL is group-orientated and simulates a collaborative research/data science working environment facilitated through the use of git and GitHub.

Through this approach, the students are expected to:

  1. Examine and define the problem.

  2. Explore what they already know about underlying issues related to it.

  3. Determine what they need to learn and where they can acquire the information and tools necessary to solve the problem.

  4. Evaluate possible ways to solve the problem.

  5. Solve the problem.

  6. Report on their findings.

The case

The case study and the hackathon rules are presented at https://oxford-ihtm.io/ihtm-hackathon-2024/case_study.html.

Repository/project structure

This R project has the following structure:

ihtm-hackathon-2024
    |-- data/
    |-- docs/
      |-- case_study.html
    |-- outputs/
    |-- packages.R
    |-- R/
    |-- reports/
      |-- case_study.Rmd
      |-- sudan_health_nutrition.Rmd
    |-- sudan_health_nutrition_1.R
    |-- sudan_health_nutrition_2.R
    |-- sudan_health_nutrition_3.R
    |-- sudan_health_nutrition_4.R
    |-- sudan_health_nutrition_5.R
    |-- sudan_health_nutrition_6.R
    |-- sudan_health_nutrition.R

Reproducibility

This project is built on R version 4.3.2.
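
As a quick check before starting, the running R version can be compared against the version stated above. This is a minimal sketch, not part of the project's scripts; the variable names `built_on` and `current` are illustrative only.

```r
# R version the project states it was built on
built_on <- package_version("4.3.2")

# getRversion() returns the running R version as a comparable object
current <- getRversion()

if (current < built_on) {
  warning(
    "Running R ", as.character(current),
    ", which is older than R ", as.character(built_on)
  )
} else {
  message(
    "Running R ", as.character(current),
    " (project built on R ", as.character(built_on), ")"
  )
}
```

The native pipe `|>` and the `\(x)` function shorthand used in the package check further on need R 4.1.0 or later, so an older installation would fail on that code.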

To work on this project, please follow these steps:

  1. Clone this project onto your local machine. Instructions on how this is done can be found here.

  2. In your local clone of the project, create a new branch from the main branch. Name this branch in a way that uniquely identifies it as your personal branch (e.g., give it your name). Please avoid blank spaces in branch names. If you need a separator, use a - or a _.

  3. Install all declared R package dependencies found in packages.R.

    • Check if the packages listed in packages.R are already installed using the following code:
    installed.packages() |>
      (\(x) x[ , 1])() |>
      (\(x) x[c("name_of_package1", "name_of_package2", "name_of_package3")])()

    Please run this code directly in your R console rather than adding it to the R scripts in this project.

    Please make sure to replace the placeholder text name_of_package1 etc with the actual names of the packages listed in the packages.R file.

    This code will show which of the packages listed in packages.R are already installed on your computer. Packages listed in packages.R that are not shown in the output of the code above (they appear as NA) are the packages that are not yet installed on your computer.

    • Install the packages that are not yet installed using the following code:
    install.packages(c("name_of_package1", "name_of_package2", "name_of_package3"))

    Please run this code directly in your R console rather than adding it to the R scripts in this project.

    Please make sure to replace the placeholder text name_of_package1 etc with the actual names of the packages listed in the packages.R file that are not yet installed on your computer.
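
The check and install steps above can also be combined so that only the missing packages are installed. This is a minimal sketch to run in the R console; `missing_packages()` is a hypothetical helper, not a function in this project, and the package names are the same placeholders used above.

```r
# Hypothetical helper: return the packages in `required` that are not
# found among the installed packages
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

# Placeholder names; replace with the packages listed in packages.R
required <- c("name_of_package1", "name_of_package2", "name_of_package3")

to_install <- missing_packages(required)
print(to_install)

# Uncomment to install only what is missing
# if (length(to_install) > 0) install.packages(to_install)
```

The install line is left commented out so that nothing is installed until the placeholder names have been replaced.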

Once all R package dependencies have been installed, you should be able to work on this project on your own branch and make changes/contributions as directed by the project lead.
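
The branch-naming advice in step 2 can be sketched as a small helper that replaces blank spaces with a separator. `make_branch_name()` is a hypothetical function for illustration, and the name "jane doe" is an example. The git command is only printed, not run.

```r
# Hypothetical helper: turn a person's name into a branch name
# that contains no blank spaces
make_branch_name <- function(name, sep = "-") {
  name <- trimws(name)              # drop leading/trailing spaces
  gsub("[[:space:]]+", sep, name)   # replace runs of spaces with `sep`
}

branch <- make_branch_name("jane doe")

# Print (do not run) a git command that creates the branch from main
cat(sprintf("git checkout -b %s main\n", branch))
```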

Running the workflow

Running the entire workflow

To run the entire workflow, issue the following command in the R console:

source("sudan_health_nutrition.R")

Running specific sections of the workflow

The project workflow is currently divided into 6 discrete processes implemented in 6 different R scripts, labelled sudan_health_nutrition_1.R to sudan_health_nutrition_6.R.

To run any of these, issue the following commands in the R console:

# Set up the workflow environment ----

## Load packages in packages.R and project-specific functions in R folder ----
suppressPackageStartupMessages(source("packages.R"))
for (f in list.files(here::here("R"), full.names = TRUE)) source(f)

## Read data ----
maternal <- read.csv("data/maternal_health.csv")
child <- read.csv("data/child_health.csv")
cmam <- read.csv("data/cmam_routine_data.csv")

### Retrieve and read Sudan map data ----
sudan_map_spec <- download_sudan_maps(download_url = "https://data.humdata.org/dataset/a66a4b6c-92de-4507-9546-aa1900474180/resource/e5ef3cc7-f105-4565-8d73-e08bb756f1c1/download/sdn_adm_cbs_nic_ssa_20200831.gdb.zip")
sudan_map_url <- "https://github.com/spatialworks/sudan/raw/master/data-raw/maps/sudan.gpkg"

sudan0 <- st_read(dsn = sudan_map_spec$dsn, layer = sudan_map_spec$layers[1])
sudan1 <- st_read(dsn = sudan_map_url, layer = "state")
sudan2 <- st_read(dsn = sudan_map_url, layer = "locality")

## Run the specific workflow ----
source("sudan_health_nutrition_1.R")
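
To run more than one section in a single call, the script names can be built from their step numbers. This is a minimal sketch that assumes the setup commands above have already been run; `workflow_scripts()` is a hypothetical helper, not part of this project, and whether a given script depends on earlier ones is not stated here.

```r
# Hypothetical helper: build the script file names for the chosen steps
workflow_scripts <- function(steps) {
  steps <- as.integer(steps)
  # The workflow is divided into 6 scripts, numbered 1 to 6
  stopifnot(all(steps >= 1 & steps <= 6))
  sprintf("sudan_health_nutrition_%d.R", steps)
}

# Example: run steps 1 and 3, skipping any script that cannot be found
for (script in workflow_scripts(c(1, 3))) {
  if (file.exists(script)) {
    source(script)
  } else {
    message("Skipping missing script: ", script)
  }
}
```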

Reproducing the final report for this project

To reproduce the final HTML report for this project, run the following command in the R console:

rmarkdown::render(
  "reports/sudan_health_nutrition.Rmd", 
  output_dir = "docs", 
  knit_root_dir = here::here()
)

This will render the R Markdown file found in the reports directory and produce an HTML report called “sudan_health_nutrition.html” in the docs directory.
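
After rendering, it may help to confirm that the report landed where expected. This is a minimal sketch that derives the output path from the input file name; `expected_report()` is a hypothetical helper, not part of this project.

```r
# Hypothetical helper: path of the HTML file that rendering `input`
# into `output_dir` is expected to produce
expected_report <- function(input, output_dir = "docs") {
  file.path(
    output_dir,
    paste0(tools::file_path_sans_ext(basename(input)), ".html")
  )
}

report <- expected_report("reports/sudan_health_nutrition.Rmd")

if (file.exists(report)) {
  message("Report found at: ", report)
  # Uncomment to open the report in a browser
  # utils::browseURL(report)
} else {
  message("Report not found at: ", report, "; render it first")
}
```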

Authors

License

Unless otherwise specified, data used in this repository are licensed under a CC0 1.0 Universal license.

All code in this repository is licensed under the GNU General Public License version 3 (GPL-3).