coderxio / medication-diversification

More realistic synthetic medication data.

Medication Diversification Tool

The Medication Diversification Tool (MDT) leverages publicly available, government-maintained datasets to enhance Synthea’s Synthetic Patient Generator. The synthetic health data generated by Synthea can be used by researchers, software developers, policymakers, and clinicians to develop healthcare solutions. In its current state, the process for generating medications in Synthea is manual and limited to a small selection of medications in individual modules. The goal of the MDT is to create more diverse synthetic patient medication orders that accurately reflect the heterogeneity of medications being prescribed in the US population.

The MDT automates the process for finding relevant medication codes and calculating a distribution of medications, using medication classification dictionaries from RxClass and population-level prescription data from the Medical Expenditure Panel Survey (MEPS). The medication distributions can be tailored to specific patient demographics (e.g., age, gender, state of residence) and combined with Synthea data to generate medication records for a sample patient population.

Developer quickstart

  1. Clone the repo.
    git clone https://github.com/coderxio/medication-diversification.git
    cd medication-diversification
  2. Create and activate a venv.
    python -m venv venv
    source venv/bin/activate

    Or on Windows (using Git Bash):

    py -m venv venv
    source venv/Scripts/activate

    If you are using VSCode on Windows and get the error "Activate.ps1 is not digitally signed. You cannot run this script on the current system.", you may need to temporarily change the PowerShell execution policy to allow scripts to run. In that case, run Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process and then repeat step 2.

  3. Install MDT as an editable package (note the . after -e).
    pip install -e .
  4. Change to a new directory outside of the medication-diversification/ project folder to test out MDT.
    cd ..
    mkdir mdt-test
    cd mdt-test
  5. Initialize MDT. This only needs to be done once. This will create a data/ directory and load the MDT.db database.
    mdt init
  6. Create a new module. This will create a <<module_name>>/ directory which is empty except for an initial settings.yaml file.
    mdt module -n <<module_name>> create
  7. Edit the settings.yaml file in the newly created <<module_name>>/ directory, following the directions in this README.
  8. Build the module.
    mdt module -n <<module_name>> build

    This will create:

    • A <<module_name>>.json file which is the Synthea module itself
    • A lookup_tables/ directory with all transition table CSVs
    • A log/ directory with helpful output logs and debugging CSVs

      Repeat steps 7 and 8 until MDT is producing medications that align with what you would expect. Use the log <<timestamp>>.txt files in the log/ directory as a quick and easy way to validate the output of the module with a clinical subject matter expert.
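A quick way to confirm that a build produced the artifacts listed above is a small check script. The sketch below is not part of MDT: the helper name missing_build_outputs is hypothetical, and it assumes the three outputs sit directly inside the module directory and that the JSON file is named after that directory.

```python
from pathlib import Path


def missing_build_outputs(module_dir):
    """Return the expected build outputs that are missing from module_dir."""
    module_dir = Path(module_dir).resolve()
    expected = [
        f"{module_dir.name}.json",  # the Synthea module itself (assumed name)
        "lookup_tables",            # transition table CSVs
        "log",                      # output logs and debugging CSVs
    ]
    return [name for name in expected if not (module_dir / name).exists()]
```

An empty list means all three outputs exist; anything else names what the build did not produce.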

To create a new module, start at step 6.

User-defined settings

Examples of pre-built module settings files are available in the docs/examples folder.

Module settings

Setting | Type | Description
name | string | (optional) The name of your module. Defaults to the snake_case name of the module folder. Also used as the assign_to_attribute property by default.
assign_to_attribute | string | (optional) The name of the "attribute" to assign this state to. Defaults to <<module_name>>.
reason | string | (optional) Either an "attribute" or a "State_Name" referencing a previous ConditionOnset state.
chronic | boolean | (optional) If true, a medication is considered a chronic medication for a chronic condition. This will cause Synthea to reissue the same medication as a new prescription AND discontinue the old prescription at each wellness encounter. Defaults to false.
as_needed | boolean | (optional) If true, the medication may be taken as needed instead of on a specific schedule. Defaults to false.
refills | integer | (optional) The number of refills to allow. Defaults to 0.

RxClass settings

NOTE: At least one RxClass include or RXCUI include is required to run MDT.

Setting | Type | Description
include | list of objects | class_id / relationship pairs of RxClass classes to include. See RxClass for valid options.
exclude | list of objects | class_id / relationship pairs of RxClass classes to exclude. See RxClass for valid options.

Examples:

NOTE: All YAML keys in the default generated settings file must be present, even if the key's value is empty. A future version of MDT will set appropriate default values when a key is omitted.

rxclass:
  include:
    - class_id: R01AD
      relationship: ATC
  exclude:  # Required key, read as an empty array
    # -
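
Because a missing key currently breaks the build, it can help to check a settings file before running MDT. The following is a minimal sketch, not MDT code: it assumes the settings have already been loaded into a dict (for example with PyYAML's yaml.safe_load), the helper name missing_keys is hypothetical, and REQUIRED_KEYS lists only the keys shown in this README. The generated settings.yaml remains the authoritative list.

```python
# Illustrative subset only: the keys that appear in this README's examples.
REQUIRED_KEYS = {
    "rxclass": ("include", "exclude"),
    "rxcui": ("include", "exclude"),
}


def missing_keys(settings, required=REQUIRED_KEYS):
    """List required keys that are absent.

    A key that is present with an empty (None) value counts as present.
    """
    missing = []
    for section, keys in required.items():
        if section not in settings:
            missing.append(section)
            continue
        block = settings[section] or {}  # an empty section loads as None
        for key in keys:
            if key not in block:
                missing.append(f"{section}.{key}")
    return missing
```

For a settings dict that has rxcui.include but no rxcui.exclude, the function returns ["rxcui.exclude"].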

Corticosteroid medications

rxclass:
  include:
    - class_id: R01AD
      relationship: ATC
  exclude:

Medications that may treat hypothyroidism

rxclass:
  include:
    - class_id: D007037
      relationship: may_treat
  exclude:

HMG CoA reductase inhibitor medications AND medications that may prevent stroke

rxclass:
  include:
    - class_id: C10AA
      relationship: ATC
    - class_id: D020521
      relationship: may_prevent
  exclude:

Medications that may prevent stroke EXCLUDING P2Y12 platelet inhibitors

rxclass:
  include:
    - class_id: D020521
      relationship: may_prevent
  exclude:
    - class_id: N0000182142
      relationship: has_EPC
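
The include and exclude examples above read as set operations. The sketch below illustrates that reading under an explicit assumption: multiple include entries are combined as a union, and the union of all exclude entries is then subtracted. MDT's actual resolution against RxClass may differ. The function name select_rxcuis and the members_by_class lookup (a mapping from class to member identifiers) are hypothetical.

```python
def select_rxcuis(members_by_class, include, exclude):
    """Union of all included classes minus the union of all excluded classes.

    members_by_class maps (class_id, relationship) to a set of identifiers.
    It is a hypothetical stand-in for a real RxClass lookup.
    """
    def members(entries):
        found = set()
        for entry in entries or []:  # an empty YAML key loads as None
            key = (entry["class_id"], entry["relationship"])
            found |= members_by_class.get(key, set())
        return found

    return members(include) - members(exclude)
```

Called with the include and exclude lists from the last example, the result is every member of the may_prevent class that is not also a member of the has_EPC class.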

RXCUI settings

NOTE: At least one RxClass include or RXCUI include is required to run MDT. RXCUIs in the include and exclude sections must be surrounded by single quotation marks.

Setting | Type | Description
include | list of strings | RXCUIs to include. See ingredients section of RxNav for valid options.
exclude | list of strings | RXCUIs to exclude. See ingredients section of RxNav for valid options.
ingredient_tty_filter | string | (optional) IN to only return single ingredient products or MIN to only return multiple ingredient products.
dose_form_filter | list of strings | (optional) A list of dose forms or dose form group names to filter products by. See this RxNorm dose form reference for valid options.

Examples:

Prednisone medications

rxcui:
  include:
    - '8640'
  exclude:

Albuterol AND levalbuterol medications

rxcui:
  include:
    - '435'
    - '237159'
  exclude:

Fluticasone / salmeterol (TTY = MIN, multiple ingredient) medications

rxcui:
  include:
    - '284635'
  exclude:

Single ingredient inhalant product fluticasone medications only

rxcui:
  include:
    - '41126'
  exclude:
ingredient_tty_filter: IN
dose_form_filter:
  - Inhalant Product
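
The quoting requirement for RXCUIs matters because a YAML parser reads an unquoted 8640 as an integer rather than a string. The sketch below shows one way to catch that after loading the settings. The helper name invalid_rxcuis is hypothetical, and the digits-only rule is an assumption based on the RXCUIs shown in the examples above.

```python
def invalid_rxcuis(values):
    """Return entries that are not digit-only strings.

    Unquoted YAML values such as 8640 load as int and are reported here.
    """
    return [
        value
        for value in values or []  # an empty YAML key loads as None
        if not (isinstance(value, str) and value.isascii() and value.isdigit())
    ]
```

For example, invalid_rxcuis(['435', 237159]) reports the unquoted 237159, while an empty or fully quoted list yields no entries.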

MEPS settings

Setting | Type | Description
age_ranges | list of strings | Age ranges to break up distributions by. Defaults to MDT system defaults.
demographic_distribution_flags | object | Whether to break up distributions by age, gender, and state. All three default to true.

Examples:

Custom age ranges for pediatric patients only

meps:
  age_ranges:
    - 0-5
    - 6-12
    - 13-17

Split population under and over 65 years old

meps:
  age_ranges:
    - 0-64
    - 65-103
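
Custom age ranges are easy to get wrong (overlaps, reversed bounds), so a small check can be useful before building. This sketch is not MDT's parser: the helper names parse_age_ranges and age_bucket are hypothetical, and it assumes that both bounds of a range are inclusive.

```python
def parse_age_ranges(ranges):
    """Turn ['0-64', '65-103'] into sorted (low, high) tuples, rejecting overlaps."""
    parsed = []
    for text in ranges:
        low, high = (int(part) for part in text.split("-"))
        if low > high:
            raise ValueError(f"reversed age range: {text}")
        parsed.append((low, high))
    parsed.sort()
    # After sorting, each range must start above the end of the previous one.
    for (_, prev_high), (next_low, _) in zip(parsed, parsed[1:]):
        if next_low <= prev_high:
            raise ValueError("age ranges overlap")
    return parsed


def age_bucket(age, parsed):
    """Return the 'low-high' label containing age, or None if no range covers it."""
    for low, high in parsed:
        if low <= age <= high:
            return f"{low}-{high}"
    return None
```

With the pediatric ranges above, an 18-year-old falls outside every range, so age_bucket returns None for that age.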

How to replace a MedicationOrder with an MDT submodule

To replace a MedicationOrder with one of our MDT submodules, replace the MedicationOrder state with a CallSubmodule state.

"Medication_Submodule": {
  "type": "CallSubmodule",
  "submodule": "medications/<<name_of_your_mdt_submodule_here_without_json_file_extension>>"
}

Put the submodule JSON file in the synthea/src/main/resources/modules/medications folder.

Put your transition table CSV files in the synthea/src/main/resources/modules/lookup_tables folder.

Example for asthma module:

Using the existing asthma module as an example...

Change this...

...
    "Prescribe_Maintenance_Inhaler": {
      "type": "MedicationOrder",
      "reason": "asthma_condition",
      "codes": [
        {
          "system": "RxNorm",
          "code": "895994",
          "display": "120 ACTUAT Fluticasone propionate 0.044 MG/ACTUAT Metered Dose Inhaler"
        }
      ],
      "prescription": {
        "as_needed": true
      },
      "direct_transition": "Prescribe_Emergency_Inhaler",
      "chronic": true
    },
...

To this...

...
    "Prescribe_Maintenance_Inhaler": {
      "type": "CallSubmodule",
      "submodule": "medications/maintenance_inhaler",
      "direct_transition": "Prescribe_Emergency_Inhaler"
    },
...

And make sure your submodule JSON and transition table CSVs are in the folder locations specified above.
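
If several MedicationOrder states need the same change, the edit can be scripted. The sketch below is not part of MDT or Synthea: the function names are hypothetical, it assumes the module JSON has been loaded into a dict with a top-level "states" object, and it assumes every transition key on the original state ends in "_transition" (as direct_transition does in the example above).

```python
def call_submodule_state(order_state, submodule):
    """Build a CallSubmodule state that keeps the original state's transition."""
    new_state = {
        "type": "CallSubmodule",
        "submodule": f"medications/{submodule}",
    }
    # Carry over whichever transition key the original state used.
    for key, value in order_state.items():
        if key.endswith("_transition"):
            new_state[key] = value
    return new_state


def replace_medication_order(module, state_name, submodule):
    """Swap one MedicationOrder state in a loaded module dict, in place."""
    states = module["states"]
    if states[state_name].get("type") != "MedicationOrder":
        raise ValueError(f"{state_name} is not a MedicationOrder state")
    states[state_name] = call_submodule_state(states[state_name], submodule)
    return module
```

Applied to the Prescribe_Maintenance_Inhaler state above with the submodule name maintenance_inhaler, this yields the CallSubmodule state shown in the "To this..." example.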

See below for example file structure:

synthea/
├─ src/
│  ├─ main/
│  │  ├─ resources/
│  │  │  ├─ modules/
│  │  │  │  ├─ medications/
│  │  │  │  │  ├─ maintenance_inhaler.json
│  │  │  │  │  ├─ ...
│  │  │  │  ├─ lookup_tables/
│  │  │  │  │  ├─ maintenance_inhaler_ingredient_distribution.csv
│  │  │  │  │  ├─ maintenance_inhaler_fluticasone_product_distribution.csv
│  │  │  │  │  ├─ maintenance_inhaler_budesonide_product_distribution.csv
│  │  │  │  │  ├─ maintenance_inhaler_beclomethasone_product_distribution.csv
│  │  │  │  │  ├─ maintenance_inhaler_mometasone_product_distribution.csv
│  │  │  │  │  ├─ ...
│  │  │  │  ├─ asthma.json
│  │  │  │  ├─ ...
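
Copying the build output into this layout can be scripted. The following sketch assumes the MDT module directory contains the module JSON at its top level and the CSVs under lookup_tables/, as described in the quickstart, and that the target folders are the ones named above. The function name install_module is hypothetical.

```python
import shutil
from pathlib import Path


def install_module(module_dir, synthea_dir):
    """Copy the module JSON and its transition table CSVs into a Synthea checkout."""
    module_dir = Path(module_dir)
    modules = Path(synthea_dir) / "src" / "main" / "resources" / "modules"
    medications_dir = modules / "medications"
    lookup_dir = modules / "lookup_tables"
    medications_dir.mkdir(parents=True, exist_ok=True)
    lookup_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    # Submodule JSON goes into modules/medications/.
    for src in sorted(module_dir.glob("*.json")):
        copied.append(Path(shutil.copy2(src, medications_dir / src.name)))
    # Transition table CSVs go into modules/lookup_tables/.
    for src in sorted((module_dir / "lookup_tables").glob("*.csv")):
        copied.append(Path(shutil.copy2(src, lookup_dir / src.name)))
    return copied
```

The returned list of destination paths gives a quick record of what was installed.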

Lastly, if the calling module (in this case, asthma.json) ends medications by a specific State_Name of a previous MedicationOrder state, you will need to change that MedicationEnd state to instead end a medication by attribute. The reason for this is that our MDT JSON module generates different MedicationOrder state names for each potential prescribed product, but they all have the same attribute.

Change this...

...
    "Maintenance_Medication_End": {
      "type": "MedicationEnd",
      "medication_order": "Prescribe_Maintenance_Inhaler",
      "direct_transition": "Emergency_Medication_End"
    },
...

To this...

...
    "Maintenance_Medication_End": {
      "type": "MedicationEnd",
      "referenced_by_attribute": "maintenance_inhaler",
      "direct_transition": "Emergency_Medication_End"
    },
...
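
The same change can be expressed as a small transformation on a loaded state. This is a sketch only, with a hypothetical function name; it assumes the state is a dict parsed from the module JSON and touches nothing except the medication_order and referenced_by_attribute keys.

```python
def end_by_attribute(state, attribute):
    """Return a copy of a MedicationEnd state that ends the medication by attribute."""
    if state.get("type") != "MedicationEnd":
        raise ValueError("expected a MedicationEnd state")
    # Drop the reference to the old MedicationOrder state name.
    updated = {k: v for k, v in state.items() if k != "medication_order"}
    updated["referenced_by_attribute"] = attribute
    return updated
```

Passing the original Maintenance_Medication_End state and the attribute maintenance_inhaler produces the state shown in the "To this..." example.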

Tips on testing MDT with Synthea

Validation

Please see docs/validation for a Python notebook that can be used to validate Synthea + MDT patient populations against MEPS patient populations.