drivendataorg / concept-to-clinic

ALCF Concept to Clinic Challenge
https://concepttoclinic.drivendata.org/
MIT License

Add function that loads DICOM images #12

Closed pjbull closed 7 years ago

pjbull commented 7 years ago

Overview

All of our models need to take a path to a DICOM image (which is actually a directory of images and XML files) and then load that image into memory.

Expected Behavior

The function should take a path to a DICOM directory and load the data from that directory into a format that will be useful to the models. It will then provide that data to its callers. For example, DICOM-numpy may be useful here.

This issue is for a first pass implementation. As the models evolve, we may need to update and change the format that this method provides to its callers.
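As a rough illustration of the behavior described above, a first pass could look something like the sketch below. The names `list_slice_paths`, `load_dicom`, and the `read_slice` parameter are hypothetical and not part of the project. The sketch also assumes that every non-XML file in the directory is a DICOM slice and that each slice carries a `SliceLocation` attribute, which may not hold for every dataset.

```python
import os
from typing import Any, Callable, List, Optional


def list_slice_paths(dicom_dir: str) -> List[str]:
    """Return the paths of all files in dicom_dir except the XML files."""
    paths = []
    for name in sorted(os.listdir(dicom_dir)):
        path = os.path.join(dicom_dir, name)
        # Assumption: anything that is a regular file and not XML is a slice.
        if os.path.isfile(path) and not name.lower().endswith(".xml"):
            paths.append(path)
    return paths


def load_dicom(dicom_dir: str,
               read_slice: Optional[Callable[[str], Any]] = None) -> List[Any]:
    """Load one DICOM directory and return its pixel data, one entry per slice.

    read_slice is a hypothetical hook for swapping in another reader;
    by default pydicom's dcmread is used.
    """
    if read_slice is None:
        import pydicom  # imported lazily so the helper above works without it
        read_slice = pydicom.dcmread

    slices = [read_slice(path) for path in list_slice_paths(dicom_dir)]
    # Order the slices along the scan axis; assumes SliceLocation is present.
    slices.sort(key=lambda s: float(s.SliceLocation))
    return [s.pixel_array for s in slices]
```

The returned list of 2D arrays could then be stacked into a single 3D array (for example with `numpy.array`) if the models prefer one volume per image.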

Technical details

Acceptance criteria

NOTE: All PRs must follow the standard PR checklist.

isms commented 7 years ago

Pasting in a random snippet using pydicom that may be helpful :sparkles::sparkles::sparkles:

import os
from glob import glob

import numpy as np
import pydicom as dc  # pydicom < 1.0 was imported as `dicom`

base_dir = "[INSERT PATH TO DIRECTORY HERE :-D]"
series_instance_uid = '1.2.840.113654.2.55.135088253786049275791463451273034430925'
series_dir = os.path.join(base_dir, series_instance_uid)
pattern = os.path.join(series_dir, '*')
# Read every file in the series directory and order the slices by position
files = sorted([dc.dcmread(fn) for fn in glob(pattern)], key=lambda x: float(x.SliceLocation))
# Stack the per-slice pixel data into one 3D array
arr = np.array([dd.pixel_array for dd in files], dtype=np.int16)

tdraebing commented 7 years ago

I gave this issue some more thought. For future tasks, I think it might be helpful to create an object instead of just handing over an array with the pixel data. The object would also contain some useful metadata and serve as a place where future prediction algorithms can store the positions of the nodules. What do you think? I would be happy to refactor the script.
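To make the idea more concrete, here is a minimal sketch of what such an object might look like. The class names `DicomSeries` and `Nodule`, the field names, and the choice of voxel coordinates are all assumptions for illustration; nothing here reflects an agreed design.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Nodule:
    """Hypothetical record for one nodule position, in voxel coordinates."""
    z: int
    y: int
    x: int


@dataclass
class DicomSeries:
    """Hypothetical container for pixel data, metadata, and nodule positions."""
    voxels: Any  # e.g. a 3D array ordered (slice, row, column)
    metadata: Dict[str, Any] = field(default_factory=dict)
    nodules: List[Nodule] = field(default_factory=list)

    def add_nodule(self, z: int, y: int, x: int) -> Nodule:
        """Record a nodule position found by a prediction algorithm."""
        nodule = Nodule(z, y, x)
        self.nodules.append(nodule)
        return nodule
```

A loader could return a `DicomSeries` instead of a bare array, and a prediction step could later call `add_nodule` on it, so the pixel data and the results travel together.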