dotimplement / HealthChain

Simplify testing and validating AI and NLP applications in a healthcare context 💫 🏥
https://dotimplement.github.io/HealthChain/
Apache License 2.0

Enhance Medication Value Sets for More Realistic Data Generation #69

Open jenniferjiangkells opened 1 month ago

jenniferjiangkells commented 1 month ago

Description

Improve the MedicationRequestMedication value set so that it generates more realistic and comprehensive medication data. The value set currently holds SNOMED CT codes in Virtual Therapeutic Moiety (medicinal product) form.

The current implementation provides a basic list of common medications generated by ChatGPT. The list has not been verified and may lack the depth and variety needed to simulate real-world data.

https://github.com/dotimplement/HealthChain/blob/a4beb13020a66fe3b2d1b5555433dfbf0d5480f3/healthchain/data_generators/value_sets/medicationcodes.py#L11-L43

Context

Realistic medication data is crucial for:

  1. Testing clinical decision support systems
  2. Simulating diverse patient populations
  3. Ensuring our generated data covers a wide range of medical scenarios
  4. Improving the overall quality and usefulness of our synthetic healthcare data
  5. Making the generated data more valuable for testing and development purposes

Possible Implementation

  1. Expand the current list of medications to include a broader range of drugs across various therapeutic categories.
  2. Add additional attributes to each medication entry, such as:
    • Dosage forms (e.g., tablet, capsule, injection)
    • Typical dosage strengths
    • Route of administration
  3. Include less common medications to represent more specialised treatments.
  4. Implement a weighting system to reflect the relative frequency of prescription for each medication.
  5. Add other code systems or extension systems. Currently all codes come from the SNOMED CT UK edition and were verified with the SNOMED CT browser at https://termbrowser.nhs.uk/

Example of an enhanced medication entry:

{
    "code": "774656009",
    "display": "Aspirin",
    "dosage_forms": ["tablet", "capsule"],
    "strengths": ["81 mg", "325 mg"],
    "route": "oral",
    "frequency_weight": 0.8
}
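
To show how the frequency_weight field could drive generation, here is a minimal sketch in Python. It is not taken from the HealthChain codebase: MedicationEntry, sample_medication and sample_prescription are hypothetical names, and the second entry uses a placeholder code rather than a real SNOMED CT identifier. It assumes weights are relative (they do not need to sum to 1) and uses the standard library's random.choices for the weighted draw.

```python
import random
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class MedicationEntry:
    """Hypothetical container mirroring the enhanced entry shown above."""
    code: str
    display: str
    dosage_forms: Tuple[str, ...]
    strengths: Tuple[str, ...]
    route: str
    frequency_weight: float


# The first entry is the example from this issue. The second is a
# placeholder for a less common drug; its code is NOT a real SNOMED CT code.
ENTRIES = [
    MedicationEntry("774656009", "Aspirin", ("tablet", "capsule"),
                    ("81 mg", "325 mg"), "oral", 0.8),
    MedicationEntry("PLACEHOLDER-0001", "Rare drug (placeholder)", ("tablet",),
                    ("10 mg",), "oral", 0.2),
]


def sample_medication(entries: Sequence[MedicationEntry],
                      rng: random.Random) -> MedicationEntry:
    """Pick one entry with probability proportional to its frequency_weight."""
    weights = [entry.frequency_weight for entry in entries]
    return rng.choices(entries, weights=weights, k=1)[0]


def sample_prescription(entries: Sequence[MedicationEntry],
                        rng: random.Random) -> Dict[str, str]:
    """Draw a medication, then pick a dosage form and strength uniformly."""
    med = sample_medication(entries, rng)
    return {
        "code": med.code,
        "display": med.display,
        "dosage_form": rng.choice(med.dosage_forms),
        "strength": rng.choice(med.strengths),
        "route": med.route,
    }


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so generated data is reproducible
    for _ in range(3):
        print(sample_prescription(ENTRIES, rng))
```

Passing a seeded random.Random instance keeps generated test data reproducible. Note that picking form and strength independently can produce combinations that do not exist as real products, so a curated list of valid form/strength pairs would probably be needed for realism.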

Aryanil-codes commented 1 month ago

Can I get assigned this issue? Do I need to be a certified doctor to work on this, or is it all right if I can google well enough and have a basic understanding? Thanks

deevyanshoo commented 1 month ago

Is this already assigned? I can work on this; I have prior knowledge of the healthcare domain.

jenniferjiangkells commented 1 month ago

Hi @deevyanshoo @Aryanil-codes - first of all, thanks to you both for your interest in contributing! ⭐ This issue is more about the curation of clinical knowledge and is better suited to people with domain knowledge. However, we could really do with some help on improving the structure and configuration of the data classes as well, so I will open these as separate issues and tag you both in them if you're still interested in working on this! Just comment on the issues to let us know that you've started working on them. 😄

jenniferjiangkells commented 1 month ago

@deevyanshoo @Aryanil-codes #75