Keane Gonzalez
Code and scripts for ultrasound cancer tumor detection using Faster R-CNN with a pre-trained ResNet50 backbone
-- Pre-processing and results scripts are also in this area.
Data:
Environment: Jupyter Notebook in Google Colab (Pro for the extended training time and memory)
Code Flow:
TBD
Output Example Images with Predictions and Truth Regions:
Currently being retrained/retested without ImageNet normalization
Steps for usage
The main set of options is in cell 1. There are two main modes: training and testing. Setting run_testing_only to 1 loads the necessary functions and stops right before training starts, so that the user can run the code segment that loads a particular saved model for testing. Setting run_testing_only to 0 runs the full ResNet training sequence.
################################################################################
# MODEL SPECIFIC OPTIONS
#
# Choose training sets to use
training_set = 0 #0=UCLA US, 1 = UCLA + BUSI, 2 = BUSI
# Choose the normalization path to use
# 0 is the 0 to 1 normalization based upon the ultrasound data
# 1 is the -2 to 2 normalization based upon the ImageNet data
#
model_branch = 1 #0 = 0to1, 1 = -2to2
transfer_train = 0 #we will load the model states from a checkpoint epoch instead
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
# TURN TRAINING ON OR OFF
# Choose to do training or to load models for use in testing only
# run_testing_only = 0 is for training, =1 for a stop after loading models. 1 is
# used for running a model on the Test dataset
run_testing_only = 1
# Set use_last_epoch to 1 to start from a pre-trained checkpoint. Setting to 0
# starts training from the beginning
use_last_epoch = 1
################################################################################
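As a rough illustration of what the two model_branch settings mean, the sketch below contrasts a 0 to 1 rescale with ImageNet-style standardization. It is only a sketch under stated assumptions: the notebook's actual normalization code is not shown here, the function names are hypothetical, branch 0 is assumed to be a min-max rescale with bounds taken from the ultrasound data, and branch 1 is assumed to use the commonly quoted ImageNet channel means and standard deviations.

```python
# Commonly quoted ImageNet per-channel statistics (RGB) for inputs scaled to [0, 1].
# ASSUMPTION: the notebook uses these values; they are not shown in this README.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_0to1(pixels, lo=0, hi=255):
    """Branch 0 (assumed): min-max rescale into [0, 1] using data-derived bounds."""
    span = float(hi - lo)
    return [(p - lo) / span for p in pixels]


def normalize_imagenet(rgb_pixel):
    """Branch 1 (assumed): scale an 8-bit RGB pixel to [0, 1], then standardize per channel."""
    return tuple(
        (channel / 255.0 - mean) / std
        for channel, mean, std in zip(rgb_pixel, IMAGENET_MEAN, IMAGENET_STD)
    )


# Hypothetical pixel values, only to show the resulting value ranges.
scaled = normalize_0to1([0, 128, 255])
black = normalize_imagenet((0, 0, 0))
white = normalize_imagenet((255, 255, 255))
print(scaled)
print(black, white)
```

Under these assumptions a black pixel maps to roughly -2 and a white pixel to a little above 2 on every channel, which would match the "-2to2" label on model_branch = 1.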
TBD
TESTING
Once training has completed, or if starting from a pre-generated model, set run_testing_only to 1. This exits the main training cell before training begins and lets you load the model separately from all the previous data. The function tab below is where the user can choose which model to load:
These are the most important sections for loading the model. The choice of test data matters: it must either have been regenerated with this model or saved previously.
The checkpoint mechanism is reused to load the model, so the saved checkpoint must contain the model state dict. The epoch is not required, but it was included automatically by the training saves. For the UCLA runs, the same training/validation/test data was reused throughout all training sequences to reduce any chance of data being mixed between the sets. For normal runs, the data info was saved along with any other info needed for testing.
checkpoint = torch.load(model_dict)
modelr.load_state_dict(checkpoint['model_state_dict'])
#optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch']
#
# Load relevant saved train/validation/test file sets
#
print('Loading previous set of train/val/test files')
#subdir = '0TO1_UCLA_BEST'
last_data_list = os.path.join(model_dir, subdir,'last_data_set.pickle') #0TO1
#'/content/gdrive/My Drive/BreastUS/MODEL_SAVE/UCLA_FINAL/last_data_set.pickle'
with open(last_data_list, "rb") as f:  # close the file once the pickle is read
    archived_data = pickle.load(f)
training_data = archived_data[0][0]
validation_data = archived_data[0][1]
test_data = archived_data[0][2]
bounding_box = archived_data[0][3]
first50 = archived_data[0][4]
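The loading code above reads five items from archived_data[0]. As a minimal sketch of how a matching file could be written and read back, the example below assumes the training side pickles a list whose first element holds those five items in the same order. The helper names and the demo file contents are hypothetical and are not taken from the notebook.

```python
import os
import pickle
import tempfile


def save_data_split(path, training_data, validation_data, test_data,
                    bounding_box, first50):
    """Write the split in the layout the loader indexes: archived_data[0][0..4]."""
    archived = [[training_data, validation_data, test_data, bounding_box, first50]]
    with open(path, "wb") as f:
        pickle.dump(archived, f)


def load_data_split(path):
    """Read the split back and unpack it in the same order it was saved."""
    with open(path, "rb") as f:
        archived = pickle.load(f)
    return tuple(archived[0][:5])


# Hypothetical file names and box coordinates, for illustration only.
demo_path = os.path.join(tempfile.mkdtemp(), "last_data_set.pickle")
save_data_split(
    demo_path,
    training_data=["img_001.png", "img_002.png"],
    validation_data=["img_003.png"],
    test_data=["img_004.png"],
    bounding_box={"img_004.png": [10, 20, 50, 60]},
    first50=["img_001.png"],
)
train_files, val_files, test_files, boxes, first = load_data_split(demo_path)
print(len(train_files), len(val_files), len(test_files))
```

Saving the split next to the model checkpoint in this way is one means of making sure the test cell evaluates on exactly the files that were held out during training.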
To test out the model, jump to the section below:
This function takes a confidence-score cutoff and a dataset for the model. The cell below, Run UCLA METRICS, calls this function for confidence-score cutoffs from 0% to 90% in steps of 10%.
for loop in range(0,10,1):
    score_cutoff = float(loop/10) #to get floating point step sizes
    test_metrics = test_ucla_classification(modelr,input_dataset=test_data,
                                            batch_size=batch_size,
                                            score_cutoff=score_cutoff,
                                            iou_cutoff = iou_cutoff)
    saved_metrics[loop] = test_metrics
The output is saved for plotting in the next cell. All of the useful metrics (TP/FP/TN/FN, etc.) are saved for each confidence cutoff.
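The internals of test_ucla_classification are not shown here. The sketch below illustrates one way a score cutoff and an IoU cutoff could turn a single image's predictions into TP/FP/FN counts. It assumes boxes in [x1, y1, x2, y2] form and one truth box per image; the function names, the matching rule, and the example boxes are all hypothetical.

```python
def box_iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_detections(pred_boxes, pred_scores, truth_box, score_cutoff, iou_cutoff):
    """Count TP/FP/FN for one image that has a single truth box (assumed rule)."""
    tp = fp = 0
    matched = False
    # Visit predictions from most to least confident.
    ranked = sorted(zip(pred_boxes, pred_scores), key=lambda p: p[1], reverse=True)
    for box, score in ranked:
        if score < score_cutoff:
            continue  # prediction discarded by the confidence cutoff
        if not matched and box_iou(box, truth_box) >= iou_cutoff:
            tp += 1
            matched = True  # the truth box can be claimed only once
        else:
            fp += 1
    return {"TP": tp, "FP": fp, "FN": 0 if matched else 1}


# Hypothetical predictions for one image, swept over the same ten cutoffs.
truth = [10, 10, 50, 50]
boxes = [[12, 12, 48, 48], [60, 60, 90, 90]]
scores = [0.9, 0.4]
sweep = {loop: count_detections(boxes, scores, truth, loop / 10, 0.5)
         for loop in range(0, 10, 1)}
print(sweep[0], sweep[5])
```

Raising the score cutoff removes the low-confidence false positive in this toy case, which is the kind of trade-off the per-cutoff metrics are meant to expose.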
A generic set of plots produced by this cell, taken from the confidence score calculations saved as pickle files:
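The quantities behind plots of this kind can be derived from the saved counts. The sketch below assumes saved_metrics maps each loop index to a dictionary with "TP", "FP" and "FN" entries; that layout and the function name are hypothetical, and the numbers in the example are invented purely for illustration. They are not results from the model.

```python
def summarize(saved_metrics):
    """Turn per-cutoff TP/FP/FN counts into precision, recall and F1 rows."""
    rows = []
    for loop in sorted(saved_metrics):
        m = saved_metrics[loop]
        tp, fp, fn = m["TP"], m["FP"], m["FN"]
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append({
            "score_cutoff": loop / 10,  # same mapping as the sweep loop
            "precision": precision,
            "recall": recall,
            "f1": f1,
        })
    return rows


# Invented counts, NOT measured results.
example_metrics = {
    0: {"TP": 8, "FP": 12, "FN": 2},
    5: {"TP": 7, "FP": 3, "FN": 3},
    9: {"TP": 2, "FP": 0, "FN": 8},
}
rows = summarize(example_metrics)
for row in rows:
    print(row)
```

Each row can then be passed to whatever plotting library the notebook uses, with score_cutoff on the x-axis.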
NOTES:
References: