
MDACE: MIMIC Documents Annotated with Code Evidence (ACL 2023)

This is the official repository for the dataset paper:
Title: MDACE: MIMIC Documents Annotated with Code Evidence
Authors: Hua Cheng, Rana Jafari, April Russell, Russell Klopfer, Edmond Lu, Benjamin Striner, Matthew R. Gormley
URL: https://aclanthology.org/2023.acl-long.416/
arXiv: https://arxiv.org/abs/2307.03859

MIMIC Documents Annotated with Code Evidence is the first publicly available dataset for evidence/rationale extraction on an extreme multi-label classification task over long medical documents. It is built on a subset of the MIMIC-III clinical records, and was annotated by professional medical coders. MDACE consists of 302 Inpatient charts with 3,934 evidence spans and 52 Profee charts with 5,563 evidence spans.

Usage

This repo includes basic preprocessing and tokenization as well as an implementation of the metrics reported in the paper.

  1. After gaining access to MIMIC-III, download NOTEEVENTS.csv to this directory (unzip if necessary).

  2. Inject text into our JSON files and write them to with_text/gold/

    python3 inject-note-text.py --noteevents NOTEEVENTS.csv \
                                --data-dir data/ \
                                --out-dir with_text/gold
  3. Train your model and generate predictions in the same format (see the sketch after this list for the expected output layout).

  4. Inject text into predicted JSON files and write them to with_text/predictions/

    python3 inject-note-text.py --noteevents NOTEEVENTS.csv \
                                --data-dir predictions/ \
                                --out-dir with_text/predictions
  5. Compute metrics for predictions given a split file and note category.

    • Supervised Attention

      python3 evaluate-predictions.py --split-file splits/Inpatient/MDace-ev-test.csv \
                                      --note-category "Discharge summary" \
                                      --gold-dir with_text/gold/Inpatient/ICD-9/1.0 \
                                      --predictions-dir with_text/predictions/supervised_attn/Inpatient/ICD-9/Discharge \
                                      --trim-annotations \
                                      --md-out results/supervised_attn/Inpatient/ICD-9/Discharge/results.md
    • Supervised Attention (All Codeable Docs)

      python3 evaluate-predictions.py --split-file splits/Inpatient/MDace-ev-test.csv \
                                      --note-category "Discharge summary" \
                                      --note-category "Radiology" \
                                      --note-category "Physician" \
                                      --gold-dir with_text/gold/Inpatient/ICD-9/1.0 \
                                      --predictions-dir with_text/predictions/supervised_attn/Inpatient/ICD-9/Codeable \
                                      --merge-adjacent \
                                      --trim-annotations \
                                      --md-out results/supervised_attn/Inpatient/ICD-9/Codeable/results.md
    • Unsupervised Attention

      python3 evaluate-predictions.py --split-file splits/Inpatient/MDace-ev-test.csv \
                                      --note-category "Discharge summary" \
                                      --gold-dir with_text/gold/Inpatient/ICD-9/1.0 \
                                      --predictions-dir with_text/predictions/unsupervised_attn/Inpatient/ICD-9/Discharge \
                                      --trim-annotations \
                                      --md-out results/unsupervised_attn/Inpatient/ICD-9/Discharge/results.md
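
The prediction files consumed in step 5 use the same JSON layout as the gold files (see Format below). As a minimal sketch of what step 3 might emit, the snippet below serializes predicted spans for one chart; predict_spans is a hypothetical placeholder for your model, and the "text" field is assumed to be the one added by inject-note-text.py.

    import json

    # Hypothetical stand-in for your model: returns (begin, end, code) tuples
    # of predicted evidence spans for one note's text.
    def predict_spans(note_text):
        raise NotImplementedError

    def write_predictions(chart, out_path):
        """Write predicted evidence spans for one chart (admission) in the
        same JSON layout as the gold files (see Format below)."""
        pred = {"hadm_id": chart["hadm_id"], "notes": []}
        for note in chart["notes"]:
            spans = predict_spans(note["text"])  # "text" assumed injected in step 2
            pred["notes"].append({
                "note_id": note["note_id"],
                "category": note["category"],
                # Mirror the gold annotation fields (code_system etc.) as needed.
                "annotations": [{"begin": b, "end": e, "code": c} for b, e, c in spans],
            })
        with open(out_path, "w") as f:
            json.dump(pred, f, indent=4)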

Splits

The splits directory contains the lists of MIMIC-III HADM IDs that were used to run our experiments.
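
A minimal sketch for loading one of these split files, assuming each row holds a single HADM ID (inspect the CSV first in case of a header row or extra columns):

    import csv

    # Load the HADM IDs for one split; skips any header row or non-numeric cells.
    def load_split(path):
        with open(path) as f:
            return {int(row[0]) for row in csv.reader(f)
                    if row and row[0].strip().isdigit()}

    test_ids = load_split("splits/Inpatient/MDace-ev-test.csv")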

Data

In addition to the MDACE dataset (data/), we also provide predictions from a few experiments in the paper (predictions/).

Directory Structure

MDACE
├── data
│   ├── <Inpatient or Profee>
│   │   ├── <ICD-10 or ICD-9>
│   │   │   └── <version>
│   │   │       └── *.json
└── predictions
    ├── <experiment name>
    │   └── <Inpatient or Profee>
    │       └── <ICD-10 or ICD-9>
    │           └── *.json
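
Given the fixed depth of this layout, chart files can be enumerated with a glob; a minimal sketch:

    from pathlib import Path

    # Chart files sit at a fixed depth:
    # data/<Inpatient or Profee>/<ICD-10 or ICD-9>/<version>/*.json
    for chart_file in sorted(Path("data").glob("*/*/*/*.json")):
        print(chart_file)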

Format

One file per chart (aka admission)

<hadmid>.json

{
    "hadm_id": 380123,
    "comment": "EGD report not present- unable to validate the PCS codes",
    "notes": [
        {
            "note_id": 1234,
            "category": "Discharge Summary",
            "description": "Report",
            "annotations": <List of Annotations>
        },
        ...

    ]
}

<List of Annotations>

[
    {
        "begin": 0,
        "end": 3,
        "code": "I10",
        "code_system": "ICD-10-CM",
        "description": "Essential (primary) hypertension",
        "type": "Human"
    },
    ...
]
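
Since begin and end are character offsets into the note text, evidence spans can be recovered directly once text has been injected (steps 2 and 4 above). A minimal sketch, assuming the injected files store the note body in a "text" field and using the example hadm_id above as a hypothetical path:

    import json

    # 380123 is the hadm_id from the example above; files are named <hadmid>.json.
    with open("with_text/gold/Inpatient/ICD-9/1.0/380123.json") as f:
        chart = json.load(f)

    for note in chart["notes"]:
        for ann in note.get("annotations", []):
            # begin/end are character offsets into the injected note text
            span = note["text"][ann["begin"]:ann["end"]]
            print(ann["code"], repr(span))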

Code Mapping Types

Cite this Work

@inproceedings{cheng-etal-2023-mdace,
    title = "{MDACE}: {MIMIC} Documents Annotated with Code Evidence",
    author = "Cheng, Hua  and
      Jafari, Rana  and
      Russell, April  and
      Klopfer, Russell  and
      Lu, Edmond  and
      Striner, Benjamin  and
      Gormley, Matthew",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-long.416",
    pages = "7534--7550",
    abstract = "We introduce a dataset for evidence/rationale extraction on an extreme multi-label classification task over long medical documents. One such task is Computer-Assisted Coding (CAC) which has improved significantly in recent years, thanks to advances in machine learning technologies. Yet simply predicting a set of final codes for a patient encounter is insufficient as CAC systems are required to provide supporting textual evidence to justify the billing codes. A model able to produce accurate and reliable supporting evidence for each code would be a tremendous benefit. However, a human annotated code evidence corpus is extremely difficult to create because it requires specialized knowledge. In this paper, we introduce MDACE, the first publicly available code evidence dataset, which is built on a subset of the MIMIC-III clinical records. The dataset {--} annotated by professional medical coders {--} consists of 302 Inpatient charts with 3,934 evidence spans and 52 Profee charts with 5,563 evidence spans. We implemented several evidence extraction methods based on the EffectiveCAN model (Liu et al., 2021) to establish baseline performance on this dataset. MDACE can be used to evaluate code evidence extraction methods for CAC systems, as well as the accuracy and interpretability of deep learning models for multi-label classification. We believe that the release of MDACE will greatly improve the understanding and application of deep learning technologies for medical coding and document classification.",
}