KitwareMedical / lungair-web-application

Web application based on VolView for AI-based BPD risk analysis

link imaging and medical record data #16

Closed ebrahimebrahim closed 11 months ago

ebrahimebrahim commented 1 year ago

Make the app understand the link between our imaging and our EHR data. Our data tables have a patient ID column that links the entries to our images. In the app, when we are using data and images from our research group, we should be able to select a patient and have the app load both the EHR data and the images for that patient.

This relies on #9 being completed so that the app has access to EHR data that contains references to image data.

ebrahimebrahim commented 1 year ago

Thoughts on how to link patient identification between imaging and EHR

How it's done in our data currently

EHR

On the EHR side, our data tables have a column called ID. This goes into the local FHIR server (via this and this) as a FHIR business identifier for the Patient resource.
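For illustration, a Patient resource that carries the table's ID as a business identifier might look roughly like the sketch below. The `system` URI and the helper name `make_patient_resource` are hypothetical placeholders, not necessarily what our upload scripts use.

```python
import json

# Hypothetical identifier system URI (an assumption for this sketch);
# the real value is whatever the upload scripts assign.
ID_SYSTEM = "urn:example:lungair:patient-id"


def make_patient_resource(table_id, system=ID_SYSTEM):
    """Build a minimal FHIR Patient dict with the table ID as a business identifier."""
    return {
        "resourceType": "Patient",
        "identifier": [{"system": system, "value": str(table_id)}],
    }


# Show the resource for the patient whose ID column value is 1.
print(json.dumps(make_patient_resource(1), indent=2))
```

The business identifier is separate from the logical `id` that the FHIR server assigns, so the table ID should survive regardless of how the server numbers its resources.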

Imaging

On the imaging side, the anonymized CXR data are in a folder hierarchy like this:

filestore_annon/
├── 1
│   └── anon
│       ├── 102_0.dcm
│       ├── 12_0.dcm
│       ├── 14_0.dcm
│       ├── 15_0.dcm
│       ├── 17_0.dcm
│       ├── 18_0.dcm
...

There are many folders like the one named 1 shown here, one for each patient. The top-level folder name (1 in this example) is what corresponds to the ID column from the data table.
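To make the folder convention concrete, here is a minimal sketch that indexes the hierarchy by patient. The function name `index_patient_folders` is made up for this example, and it assumes the layout shown above: one folder per patient, one subfolder level, then the dcm files.

```python
from pathlib import Path


def index_patient_folders(root):
    """Map each top-level folder name (the table ID) to its sorted .dcm files."""
    index = {}
    for patient_dir in sorted(Path(root).iterdir()):
        # Only directories represent patients; skip any stray files.
        if patient_dir.is_dir():
            index[patient_dir.name] = sorted(patient_dir.glob("*/*.dcm"))
    return index
```

Calling `index_patient_folders("/path/to/filestore_annon/")` would return a dict whose keys are the folder names, which are the same values that appear in the ID column.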

The folder name is not a good way to communicate the patient ID to the LungAIR web app, because LungAIR really ought to get everything via the DICOMweb protocol. The DICOM headers in the individual dcm files do have a Patient ID element (tag (0010, 0020)), but its value is just a garbled (anonymized) version of the original hospital medical record number and no longer corresponds to anything.

Proposed approach

I think it makes sense for the link to ultimately be between the Patient ID (0010, 0020) from DICOM and a FHIR Patient identifier from the EHR, with an identifier system that probably ought to be configurable by the user of the LungAIR web app.
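As a rough sketch of what that link could mean in practice, the same patient ID would drive both a FHIR token search on `identifier` and a QIDO-RS study search on `PatientID`. The base URLs, the identifier system, and the function names below are all hypothetical placeholders for a local setup. This is not how the app does it today.

```python
from urllib.parse import urlencode

# All three values are hypothetical placeholders (assumptions for this sketch).
FHIR_BASE = "http://localhost:8080/fhir"
DICOMWEB_BASE = "http://localhost:8042/dicom-web"
ID_SYSTEM = "urn:example:lungair:patient-id"  # would be user-configurable


def fhir_patient_search_url(patient_id, system=ID_SYSTEM, base=FHIR_BASE):
    """FHIR token search of the form Patient?identifier=<system>|<value>."""
    return f"{base}/Patient?" + urlencode({"identifier": f"{system}|{patient_id}"})


def dicomweb_studies_url(patient_id, base=DICOMWEB_BASE):
    """QIDO-RS study search filtered on Patient ID, tag (0010, 0020)."""
    return f"{base}/studies?" + urlencode({"PatientID": patient_id})


# Both queries are keyed on the same ID value.
print(fhir_patient_search_url("1"))
print(dicomweb_studies_url("1"))
```

Keeping the identifier system as a parameter is what would let other users of the app point it at their own ID scheme.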

Making our imaging data compatible with this approach

So in order to use our own imaging data here, I think that before uploading it to the local Orthanc server we should go through each dcm file and replace the value of the Patient ID tag (0010, 0020) with the top-level folder name for that patient. Here is a Python snippet that does that:

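from pathlib import Path

import pydicom

# Root of the anonymized CXR hierarchy: <root>/<patient ID>/anon/*.dcm
cxr_dir = Path("/path/to/filestore_annon/")

for patient_dir in sorted(cxr_dir.iterdir()):
    if not patient_dir.is_dir():
        continue  # ignore stray files at the top level
    # The top-level folder name is the ID used in the EHR data tables.
    patient_id = patient_dir.name
    for dcm_file in patient_dir.glob("*/*.dcm"):
        ds = pydicom.dcmread(dcm_file)
        ds.PatientID = patient_id  # tag (0010, 0020)
        ds.save_as(dcm_file)  # overwrites the file in place

ebrahimebrahim commented 1 year ago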

To close this issue, here is the plan: