dkw-aau / medic_transformer

Experimental code for using transformer models on patient event sequences

Medic-BERT

This repository contains the experimental code for the Medic-BERT transformer model for predicting the length of hospitalization stays (LOS) for patients based on sequences of medical events.

Given patients' EHR data represented as sequences of medical events, the program trains models for LOS prediction.

Installation

The software is written in Python 3.10.

Dependencies can be installed through the requirements.txt file.
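For example, using pip from the repository root (any virtual environment tool works):

pip install -r requirements.txt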

Placing data files

Two structured EHR data files should be placed in the Data folder. Currently, CSV and Parquet formats are supported.

patients.csv specifies the patient cohort and should contain the following fields:

sequence_id patient_id age sex hosp_start los

data.csv specifies the patients' EHR data, modelled as individual tokenized medical events, and should contain the following fields:

token token_orig event_time event_type event_value sequence_id

Example data files can be found in the Data folder.

Input files can be provided in either CSV or Parquet format.
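The following sketch shows how the two input files could be prepared with pandas. The column names come from this README; all values, time formats, and units are hypothetical and only illustrate the expected structure.

import pandas as pd
from pathlib import Path

data_dir = Path("Data")
data_dir.mkdir(exist_ok=True)

# patients.csv: one row per hospitalization sequence in the cohort.
patients = pd.DataFrame(
    {
        "sequence_id": [1, 2],
        "patient_id": [101, 102],
        "age": [67, 54],
        "sex": ["F", "M"],
        "hosp_start": ["2021-03-01 08:15:00", "2021-03-02 14:30:00"],
        "los": [3.5, 1.2],  # length of stay, e.g. in days (assumption)
    }
)
patients.to_csv(data_dir / "patients.csv", index=False)

# data.csv: one row per tokenized medical event, linked via sequence_id.
events = pd.DataFrame(
    {
        "token": ["lab_high", "med_abc"],
        "token_orig": ["Leukocytes 12.4", "Drug ABC 500mg"],
        "event_time": ["2021-03-01 09:00:00", "2021-03-01 10:30:00"],
        "event_type": ["lab", "medication"],
        "event_value": [12.4, 500],
        "sequence_id": [1, 1],
    }
)
events.to_csv(data_dir / "data.csv", index=False)
# Parquet output is also possible (requires pyarrow or fastparquet), e.g.:
# events.to_parquet(data_dir / "data.parquet", index=False)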

Running the software

main.py is the main entry for training and evaluating models. The behavior of the software depends on the configuration of the config.ini file.
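Assuming the data files are in place and config.ini has been configured, a run can then be started from the repository root. This invocation is a sketch; any additional command-line arguments depend on your setup, since the behavior is controlled by config.ini:

python main.py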

The parameters and their settings are described below. The most frequently changed settings are listed first, followed by those that are rarely changed:

Often changed

Rarely changed

If you use Medic-BERT in a scientific publication, we would appreciate a citation:

@inproceedings{hansen2023patient,
  title={Patient Event Sequences for Predicting Hospitalization Length of Stay},
  author={Hansen, Emil Riis and Nielsen, Thomas Dyhre and Mulvad, Thomas and Strausholm, Mads Nibe and Sagi, Tomer and Hose, Katja},
  booktitle={International Conference on Artificial Intelligence in Medicine},
  pages={51--56},
  year={2023},
  organization={Springer}
}