The segmentations used in this project represent a pair of the type tumor - ablation for liver cancer tumors. The ablation can be replaced with another type of segmentation, such as a predicted tumor segmentation (using a separate algorithm).
The evaluation metrics include:
The following non-standard libraries are required to use the full functionality of the project for DICOM image reading and processing.
A_read_files_info.py
(optional)
DicomReader.py
B_ResampleSegmentations.py
C_mainDistanceVolumeMetrics.py
VolumeMetrics.py
DistanceMetrics.py
scripts/ellipsoid_inner_outer.py
scripts/plot_ablation_margin_hist.py
D_compile_population_radiomics.py
(optional)
E_radiomics_stats.py
(optional)
The scripts are called (internally) in alphabetical order using the folllowing logic:
Read Images --> Resample --> Extract Distance Metrics --> Extract Volume Metrics --> Plot Distance Metrics --> Output Metrics to Xlsx file in Tabular format
PyRadiomics
automatically checks if the source CT image and the derived mask have the same dimensions. If not, resampling is performed in the background.
This function only operates with the path to the patient folder that can have all source CT image, segmentation masks, other files, all in one folder. It does that by creating a dictionary of paths based on the metadata information that was encoded in the ReferencedImageSequenceTag and SourceImageSequence. I have previously encoded the mapping information (which is also an anonymization pipeline) in the DICOM-Anonymization-Segmentation-Mapping
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--rootdir", required=False, help="path to the patient folder to be processed")
ap.add_argument("-o", "--plots_dir", required=True, help="path to the output images")
ap.add_argument("-b", "--input_batch_proc", required=False, help="input csv file for batch processing")
args = vars(ap.parse_args())
The main function where to run the program from is A_read_files_info.py
.
The function can either work with a single patient image folder by calling the function like:
python A_read_files_info.py --i "C:\Users\MyUser\MyPatientFolderwithDicomAndSegmentationImages --o "C:\OutputFilesandImages"
For Batch Processing option the input is an Excel (.xlsx) file with the following headers: | Patient_ID | Ablation_IR_Date | Nr_Lesions | Patient_Dir_Paths |
---|---|---|---|---|
C001 | 20160103 | 2 | ['D:\Users\User1\Pats\Pat_C001'] | |
C002 | 20181108 | 1 | ['D:\Users\User1\Pats\Pat_C002'] |
The algorithm starts by iterating through all the patient folders provided in the column "Patient_Dir_Paths". Of course it's not absolutely necessarry to use the data structured in the way I did. Moreover, the A_read_files_info.py
can be skipped altogether, especially when you know the mapping between your Source (original) CT image -> Segmentation1 -> Segmentation2. In this specific case my Segmentation1 is called "tumor_segmentation" and Segmentation2 is called "ablation_segmentation", which are 2 separate structures/tissue within my organ of interest (that's the liver). If you already know the file mapping between your images (i.e. which image is related to which image, aka more explicit, from which CT source image comes each segmentation) you can move the next steps which are B_ResampleSegmentations.py
and C_mainDistanceVolumeMetrics.py
.
In my case resampling was necessary because not only my 2 segmentation masks had different sizes [x, y, z] , but they also had different spacings in-between the slices. If your images differ only in size then you can use the SimpleITK Paste Image
function. However, if the spacing differs, resampling is necesarry otherwise we cannot compare the 2 images using the Distance and Volume metrics.
The file that does the resampling and resizing job is B_ResampleSegmentations.py
. First the images need to be read and loaded using the SimpleITK
Python library. My segmentations and CT images were saved in multiple DICOM slices (images) in a DICOM folder. The script DicomReader.py
reads all the DICOM slices of an image/segmentation and returns a single variable which is an object in SimpleITK format. The segmentation masks are resampled in the same dimensions and spacing (however still anisotropic) before calling the DistanceMetrics.py
and VolumeMetrics.py
.
This resampling script can be called like:
import SimpleITK as sitk
import DicomReader as Reader
tumor_segmentation_sitk, tumor_sitk_reader = Reader.read_dcm_series(tumor_path, True)
ablation_segmentation_sitk, ablation_sitk_reader = Reader.read_dcm_series(ablation_path, True)
resizer = ResizeSegmentation(ablation_segmentation_sitk, tumor_segmentation_sitk) # object instantiation
tumor_segmentation_resampled = resizer.resample_segmentation() # SimpleITK image object
The actual resampling operation is performed using the SimpleITK ResampleImageFilter
and NearestNeighbour Interpolation
such that no new segmentation mask labels are generated.
The segmentation Evaluation Metrics are called from the script C_mainDistanceVolumeMetrics.py
which calls:
DistanceMetrics.py
VolumeMetrics.py
RadiomicsMetrics
- features extracted using the Python library PyRadiomics
. You can find a list of all the featuers that can be computed using this PyRadiomics here. I used only the volumes, sphericity, intensity and axis values.
The same as for Resampling, both these scripts take as input arguments SimpleITK image objects. They can be called for example like:
surface_distance_metrics = DistanceMetrics(ablation_segmentation, tumor_segmentation_resampled)
ablation_radiomics_metrics = RadiomicsMetrics(source_ct_ablation, ablation_segmentation)
evaloverlap = VolumeMetrics()
evaloverlap.set_image_object(ablation_segmentation, tumor_segmentation_resampled)
The inner (green) and outer (orange) ellipsoidal approximations of a segmented object (blue) are calculated using convex optimization according to "S. P. Boyd and L. Vandenberghe, Convex optimization. Cambridge, UK ; New York: Cambridge University Press, 2004." Their implementation in CVXPY was employed to compute the ellipsoids. The outer volume was computed using SVD and the inner volume was computed as the sqrt(det(B)) * Ball(0,1)
.
The output is a Excel (.xlsx) file in tabular format that returns radiomics (feature values) per patient and per lesion.
Aditionally a histogram that describes the Euclidean distances between the tumor and ablation (my 2 segmentations files) are generated. The script for plotting the histogram uses the Surface Euclidean Distances extracted using SimpleITK using the Maurer et. al algorithm is in scripts/plot_ablation_margin_hist.py
.
To compute the histogram the following steps are followed for the Maurer et. al algorithm:
Inner and Outer Ellipsoids Fitted around a segmentation (orange points).
The patient data consists of files and folder has the following folder structure and organization:
`Patient_C001
|Series_1
|CAS-Recordings (* .xml files)
| 2016-09-13_07-15-32 ...
|Segmentations
|Segmentation_No3
|0001.dcm
|0002.dcm
|003.dcm
|CT.1.2.392...dcm
|CT.1.2.393...dcm
|CT.1.2.3..dcm
|Series_2
|Series_3`