Study of impact of relapse on survival

In this issue, I detail the investigation into the impact of relapse on survival rates. The analysis was run with the script model_training/primitive_vs_metastasis_model.py, using the following command:

python -W ignore model_training/primitive_vs_metastasis_model.py --input ../data/merged_data.csv --output ../output2

The code investigates the following:

the performance of a model predicting survival for patients with primary cancer
the performance of a model predicting survival for patients with a metastasis
the performance of a model predicting survival for all patients, with an additional feature indicating whether the patient has a primary cancer or a metastasis
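
The three setups above can be sketched roughly as follows. This is only an illustration, not the code in primitive_vs_metastasis_model.py: the column name `is_metastasis` and the helper `build_cohorts` are hypothetical, and the toy values are made up. The idea is that the two single-cohort models drop the primary/metastasis flag (it is constant within each cohort), while the combined model keeps it as one extra feature, which is consistent with the 145 vs 146 feature columns reported in the output.

```python
import pandas as pd


def build_cohorts(df: pd.DataFrame, flag_col: str = "is_metastasis") -> dict:
    """Split a patient table into the three modelling cohorts.

    `flag_col` is a hypothetical column: 1 for metastasis, 0 for primary cancer.
    """
    is_meta = df[flag_col].astype(bool)
    return {
        # Within a single cohort the flag is constant, so it is dropped.
        "primary": df.loc[~is_meta].drop(columns=[flag_col]),
        "metastasis": df.loc[is_meta].drop(columns=[flag_col]),
        # The combined cohort keeps the flag as one extra feature.
        "all": df.copy(),
    }


# Toy data with made-up values, only to show the resulting shapes.
toy = pd.DataFrame(
    {
        "age": [61, 70, 58, 66],
        "BMI": [24.1, 27.3, 22.8, 30.0],
        "is_metastasis": [0, 1, 0, 1],
    }
)
cohorts = build_cohorts(toy)
for name, frame in cohorts.items():
    print(name, frame.shape)
```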

Here is the output of the script:

Total number of patients 163
Number of primitive patients 85
Number of metastasis patients 78

 ------------- Model for prediction of survival for primitive patients -------------
Feature data shape: (85, 145)
Target data shape: (85, 2)
Number of subjects which died: 16

Number of subject for training: 69
Number of subject for testing: 16

Number of subjects that died within 3 year: 12

Number of subjects that died within 3 year (train): 10
Number of subjects that died within 3 year (test): 2

Model performance on the deadline of 3 year with 145 features
ROC AUC Score: 0.8571428571428571
Brier score: 0.11149794478131589
Average precision: 0.0
Average Recall: 0.0
Accuracy Score: 0.875
AUC-PR score: 0.5625

 --- Feature selection ---
Initial number of features: 28
Number of subject for training: 69
Number of subject for testing: 16

Model performance with 3 year deadline with 28 features
ROC AUC Score:  0.9333333333333333
Brier score: 0.043331586264660604
Average precision: 0.9375
Average Recall: 1.0
Accuracy Score:  0.9375
AUC-PR score: 0.96875

Number of features after variance thresholding: 22
Number of features removed by variance thresholding: 6

Model performance after variance thresholding with 3 year deadline and 22 features
ROC AUC Score:  0.9333333333333333
Brier score: 0.043331586264660604
Average precision: 0.9375
Average Recall: 1.0
Accuracy Score:  0.9375
AUC-PR score: 0.96875

Number of features after correlation thresholding: 18
Number of features removed by correlation thresholding: 4

Model performance after feature selection based on correlation (3-year deadline) with 18 features
ROC AUC Score:  1.0
Brier score: 0.0306842485044323
Average precision: 0.9375
Average Recall: 1.0
Accuracy Score:  0.9375
AUC-PR score: 0.96875

Number of features after correlation with target thresholding: 15
Number of features removed by correlation with target thresholding: 3

Final features of the model:
['age', 'BMI', 'tabac_PA', 'histo', 'dose_tot', 'etalement', 'couv_PTV', 'BED_10', 'INTENSITY-BASED_IntensitySkewness', 'INTENSITY-BASED_IntensityKurtosis', 'INTENSITY-BASED_AreaUnderCurveCIVH', 'INTENSITY-HISTOGRAM_IntensityHistogramMean', 'INTENSITY-HISTOGRAM_IntensityHistogramVariance', 'NGTDM_Complexity', 'NGTDM_Strength']

Model performance after feature selection based on correlation with target (3-year deadline) with 15 features
ROC AUC Score:  0.9333333333333333
Brier score: 0.0563129837679206
Average precision: 0.9375
Average Recall: 1.0
Accuracy Score:  0.9375
AUC-PR score: 0.96875
Confusion matrix:
TN: 0
FP: 1
FN: 0
TP: 15

 ------------- Model for prediction of survival for metastasis patients -------------
Feature data shape: (78, 145)
Target data shape: (78, 2)
Number of subjects which died: 31

Number of subject for training: 67
Number of subject for testing: 11

Number of subjects that died within 3 year: 21

Number of subjects that died within 3 year (train): 17
Number of subjects that died within 3 year (test): 4

Model performance on the deadline of 3 year with 145 features
ROC AUC Score: 0.8571428571428572
Brier score: 0.12062535732896305
Average precision: 1.0
Average Recall: 0.5
Accuracy Score: 0.8181818181818182
AUC-PR score: 0.8409090909090909

 --- Feature selection ---
Initial number of features: 28
Number of subject for training: 67
Number of subject for testing: 11

Model performance with 3 year deadline with 28 features
ROC AUC Score:  0.5
Brier score: 0.15394804821923735
Average precision: 0.9
Average Recall: 0.9
Accuracy Score:  0.8181818181818182
AUC-PR score: 0.9454545454545454

Number of features after variance thresholding: 24
Number of features removed by variance thresholding: 4

Model performance after variance thresholding with 3 year deadline and 24 features
ROC AUC Score:  0.5
Brier score: 0.15394804821923735
Average precision: 0.9
Average Recall: 0.9
Accuracy Score:  0.8181818181818182
AUC-PR score: 0.9454545454545454

Number of features after correlation thresholding: 19
Number of features removed by correlation thresholding: 5

Model performance after feature selection based on correlation (3-year deadline) with 19 features
ROC AUC Score:  0.5
Brier score: 0.09713635330326395
Average precision: 0.9090909090909091
Average Recall: 1.0
Accuracy Score:  0.9090909090909091
AUC-PR score: 0.9545454545454546

Number of features after correlation with target thresholding: 16
Number of features removed by correlation with target thresholding: 3

Final features of the model:
['BMI', 'score_charlson', 'OMS', 'histo', 'T', 'dose_tot', 'etalement', 'vol_GTV', 'couv_PTV', 'INTENSITY-BASED_MeanIntensity', 'INTENSITY-BASED_IntensitySkewness', 'INTENSITY-BASED_IntensityKurtosis', 'INTENSITY-BASED_AreaUnderCurveCIVH', 'INTENSITY-HISTOGRAM_IntensityHistogramMean', 'INTENSITY-HISTOGRAM_IntensityHistogramVariance', 'NGTDM_Strength']

Model performance after feature selection based on correlation with target (3-year deadline) with 16 features
ROC AUC Score:  0.4
Brier score: 0.11719389164990067
Average precision: 0.9
Average Recall: 0.9
Accuracy Score:  0.8181818181818182
AUC-PR score: 0.9454545454545454
Confusion matrix:
TN: 0
FP: 1
FN: 1
TP: 9

 ------------- Model for prediction of survival for all patients -------------
Feature data shape: (163, 146)
Target data shape: (163, 2)
Number of subjects which died: 47

Number of subject for training: 136
Number of subject for testing: 27

Number of subjects that died within 3 year: 33

Number of subjects that died within 3 year (train): 27
Number of subjects that died within 3 year (test): 6

Model performance on the deadline of 3 year with 146 features
ROC AUC Score: 0.873015873015873
Brier score: 0.135983463216857
Average precision: 0.6666666666666666
Average Recall: 0.3333333333333333
Accuracy Score: 0.8148148148148148
AUC-PR score: 0.5740740740740741

 --- Feature selection ---
Initial number of features: 29
Number of subject for training: 136
Number of subject for testing: 27

Model performance with 3 year deadline with 29 features
ROC AUC Score:  0.54
Brier score: 0.07281945477152565
Average precision: 0.9259259259259259
Average Recall: 1.0
Accuracy Score:  0.9259259259259259
AUC-PR score: 0.962962962962963

Number of features after variance thresholding: 22
Number of features removed by variance thresholding: 7

Model performance after variance thresholding with 3 year deadline and 22 features
ROC AUC Score:  0.66
Brier score: 0.0722955884260542
Average precision: 0.9259259259259259
Average Recall: 1.0
Accuracy Score:  0.9259259259259259
AUC-PR score: 0.962962962962963

Number of features after correlation thresholding: 17
Number of features removed by correlation thresholding: 5

Model performance after feature selection based on correlation (3-year deadline) with 17 features
ROC AUC Score:  0.72
Brier score: 0.06265992975503183
Average precision: 0.9259259259259259
Average Recall: 1.0
Accuracy Score:  0.9259259259259259
AUC-PR score: 0.962962962962963

Number of features after correlation with target thresholding: 15
Number of features removed by correlation with target thresholding: 2

Final features of the model:
['BMI', 'score_charlson', 'tabac_PA', 'dose_tot', 'etalement', 'vol_GTV', 'couv_PTV', 'BED_10', 'INTENSITY-BASED_MeanIntensity', 'INTENSITY-BASED_IntensitySkewness', 'INTENSITY-BASED_IntensityKurtosis', 'INTENSITY-BASED_AreaUnderCurveCIVH', 'INTENSITY-HISTOGRAM_IntensityHistogramMean', 'INTENSITY-HISTOGRAM_IntensityHistogramVariance', 'NGTDM_Strength']

Model performance after feature selection based on correlation with target (3-year deadline) with 15 features
ROC AUC Score:  0.72
Brier score: 0.0677152051168577
Average precision: 0.9259259259259259
Average Recall: 1.0
Accuracy Score:  0.9259259259259259
AUC-PR score: 0.962962962962963
Confusion matrix:
TN: 0
FP: 2
FN: 0
TP: 25
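
As a sanity check, the accuracy, precision and recall of the three final models can be recomputed from the confusion-matrix counts printed above. The sketch below is not taken from the script; `metrics_from_counts` is a hypothetical helper, and specificity is an extra quantity that the script does not report. The values labelled "Average precision" and "Average Recall" in the output appear to match the plain precision and recall obtained this way.

```python
def metrics_from_counts(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Compute basic classification metrics from confusion-matrix counts."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity is not reported by the script; added for illustration.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Counts copied from the final confusion matrices in the output above.
final_matrices = {
    "primary": dict(tn=0, fp=1, fn=0, tp=15),
    "metastasis": dict(tn=0, fp=1, fn=1, tp=9),
    "all": dict(tn=0, fp=2, fn=0, tp=25),
}

results = {name: metrics_from_counts(**counts) for name, counts in final_matrices.items()}
for name, metrics in results.items():
    print(name, {key: round(value, 4) for key, value in metrics.items()})
```

Since TN is 0 in all three matrices, the specificity computed here is 0 for each model, so the high accuracies come entirely from the majority class.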

Here is my opinion on the question:

For the first two models, even though the first one performs relatively well, I think the test sets (16 and 11 patients) are too small to draw firm conclusions about model performance.
Surprisingly, for the last model, the feature describing primary vs metastasis was not retained after feature selection. Therefore, I feel that this strategy is not the way to go.
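
One way to put a number on the test-set size concern is a confidence interval on the reported accuracy. The sketch below was not part of the original analysis: it assumes a 95% Wilson score interval is an acceptable choice, and `wilson_interval` is a hypothetical helper. The counts of correct predictions (TP + TN) and test-set sizes are taken from the final confusion matrices in the output.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# (correct predictions, test-set size) for the three final models.
test_sets = {"primary": (15, 16), "metastasis": (9, 11), "all": (25, 27)}

intervals = {name: wilson_interval(k, n) for name, (k, n) in test_sets.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: accuracy 95% CI = [{low:.2f}, {high:.2f}]")
```

For the primary cohort, 15 correct predictions out of 16 give an interval of roughly 0.72 to 0.99, which illustrates how little a single test split of this size can pin down.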
