opensafely / ics-research

This is the code and configuration for our paper, Inhaled corticosteroid use and risk COVID-19 related death among 966,461 patients with COPD or asthma
https://opensafely.org

Inhaled corticosteroids and risk of COVID-19 death

This is the code and configuration for our paper, Risk of COVID-19-related death among patients with chronic obstructive pulmonary disease or asthma prescribed inhaled corticosteroids: an observational cohort study using the OpenSAFELY platform

About the OpenSAFELY framework

The OpenSAFELY framework is a new secure analytics platform for electronic health records research in the NHS.

Instead of requesting access to slices of patient data and transporting them elsewhere for analysis, the framework supports developing analytics against dummy data, and then running them against the real data within the same infrastructure where the data is stored. Read more at OpenSAFELY.org.

The framework is under fast, active development to support rapid analytics relating to COVID-19; we're currently seeking funding to make it easier for outside collaborators to work with our system. You can read our current roadmap here.

More Information for developers and epidemiologists interested in the code

For statisticians

This repository contains everything needed to:

You can use it as a template when you create a new GitHub repository. When you do so, you should also add two Secrets to the settings for your repo:

The entrypoint of your model must be called model.do and it must live in the analysis/ folder.

Your model must start by importing the dataset, which will be called input.csv and be in the same folder.

For portability, the recommended way of starting your model is:

import delimited "`c(pwd)'/output/input.csv"

Defining covariates

At the moment, this involves writing some simple Python code.

This must live in a file at analysis/study_definition.py. Until more documentation is written, refer to the sample one provided here for inspiration.

Generating dummy data

On Windows

You'll want to install a couple of things:

You need to obtain the "database URL", which includes a username and password. When running outside the secure environment, obtain a URL that gives you access to the publicly-available dummy dataset.

Now double-click run.exe. It will use your covariate definitions in analysis/study_definition.py to generate a data file at analysis/input.csv.

You can now use Stata as you usually would, with your code entrypoint in analysis/model.do.
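Before switching to Stata, it can help to confirm that the dummy data file was actually generated and has the columns you expect. The following is a minimal sketch using only the Python standard library; the function name summarise_input is hypothetical and not part of this repository, and the columns you see will depend on your own study definition.

```python
import csv
from pathlib import Path


def summarise_input(path):
    """Return (column_names, row_count) for a CSV file with a header row.

    Hypothetical helper for checking the generated dummy data;
    it is not part of the OpenSAFELY tooling.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; generate the dummy data first")
    with path.open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)  # first row holds the covariate names
        if header is None:
            raise ValueError(f"{path} is empty")
        row_count = sum(1 for _ in reader)  # count the remaining data rows
    return header, row_count
```

Calling summarise_input("analysis/input.csv") from the repository root returns the column names and the number of data rows, assuming the file was written to the location described above.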

Using docker and the command line

Python 3.8 is assumed:

Using plain python

Running the model

There are three ways to run your model:

For the last option, you will need to provide docker with credentials to access the Docker version of Stata (it's password-protected as it includes licensed software).

We use the GitHub Docker package repository, so you'll need to add a Personal Access Token with permission to read packages. Visit your personal GitHub "settings" page, go to Developer > Developer Settings > Personal Access Tokens, and add a token there (any name will do) with the permission read:packages. Take a note of the token (you only get a chance to see it once!).

Now run

docker login docker.pkg.github.com -u <YourGithubUsername> --password <PersonalAccessToken>

You can check this worked by running

docker pull docker.pkg.github.com/ebmdatalab/stata-docker-runner/stata-mp:latest
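Passing the token with --password places it on the command line, where it may end up in shell history. As an alternative, docker login also accepts the token on standard input via its --password-stdin flag. The sketch below shows one way to do that from Python; the helper names build_login_command and docker_login are hypothetical and not part of this repository, and it assumes the docker CLI is on your PATH.

```python
import subprocess


def build_login_command(username, registry="docker.pkg.github.com"):
    """Build a `docker login` argument list that reads the token from stdin."""
    return ["docker", "login", "-u", username, "--password-stdin", registry]


def docker_login(username, token):
    """Run `docker login`, sending the token on stdin instead of as an argument.

    Raises subprocess.CalledProcessError if the login fails.
    """
    return subprocess.run(
        build_login_command(username),
        input=token.encode(),  # the token never appears in the argument list
        check=True,
    )
```

After a successful docker_login call, the docker pull check shown above should work as before.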

For developers

Run tests

Note: until we make this cleaner, if you change the database schema, be sure to run docker rm stata-docker_sql_1 before restarting.