Rak/fit thresholds per class

rak5216 commented 4 years ago

issue #59

rak5216 commented 4 years ago

Input/output for fit_thresholds involves metadata.yaml files.

Sample workflow:

Train with Model Metadata output

python3 train_segmentation_model.py --gcp-bucket gs://necstlab-sandbox --config-file configs/config_sandbox/train-small-3class.yaml

created_datetime: 20200317T025722Z
current_global_threshold_for_reference: 0.5
dataset_config:
  class_annotation_mapping:
    class_0_annotation_GVs: [100]
    class_1_annotation_GVs: [175]
    class_2_annotation_GVs: [250]
  dataset_split:
    test: [8bit_AS4_S2_P1_L6_2560_1750_2160]
    train: [THIN_REF_S2_P1_L3_2496_1563_2159]
    validation: [THIN_CNT_S2_P1_L4_2334_1578_2159]
  image_cropping: {num_per_image: 1, type: random}
  stack_downsampling: {num_skip_beg_slices: 0, num_skip_end_slices: 0, number_of_images: 100,
    type: linear}
  target_size: &id001 [512, 512]
elapsed_minutes: 4.7
gcp_bucket: gs://necstlab-sandbox
git_hash: cf6723bb22ac87c5aba080f92ac63aa12c0b7ddf
num_classes: 3
original_config_filename: configs/config_sandbox/train-small-3class.yaml
prediction_thresholds_optimized:
  class_0: {fun: 0.999891996383667, message: Solution found., nfev: 9, status: 0,
    success: true, x: 0.6246117974981072}
  class_1: {fun: 1.0, message: Solution found., nfev: 11, status: 0, success: true,
    x: 0.9952027292893504}
  class_2: {fun: 1.0, message: Solution found., nfev: 11, status: 0, success: true,
    x: 0.9952027292893504}
target_size: *id001
threshold_optimization_configuration:
  opt_bounds: [0, 1]
  opt_class_metric: iou_score_1H
  opt_dataset_downsample_factor: 1.0
  opt_dataset_generator: tmp/datasets/dataset-small-3class/validation
  opt_method: bounded
  opt_options: {disp: true, maxiter: 1000}
  opt_tol: 0.01
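
The metadata above records a bounded one-dimensional search per class (opt_method: bounded, opt_bounds: [0, 1], opt_tol: 0.01, maxiter: 1000). The thread does not show the optimizer call itself, so the following is only a rough sketch of the idea. It uses a plain golden-section search as a stand-in, not the PR's implementation. The function name, the toy score function, and the choice to maximize the score are all assumptions made for illustration.

```python
import math


def golden_section_maximize(score, lower=0.0, upper=1.0, tol=0.01, max_iter=1000):
    """Return (threshold, score) that maximizes a unimodal score on [lower, upper].

    Stand-in for a bounded scalar optimizer; NOT the code used in this PR.
    """
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    a, b = lower, upper
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = score(c), score(d)
    for _ in range(max_iter):
        if b - a <= tol:
            break
        if fc > fd:
            # The maximum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = score(c)
        else:
            # The maximum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = score(d)
    best = (a + b) / 2
    return best, score(best)


def toy_score(threshold):
    # Hypothetical smooth metric that peaks at threshold = 0.3.
    return 1.0 - (threshold - 0.3) ** 2


best_threshold, best_score = golden_section_maximize(toy_score, 0.0, 1.0, tol=0.01)
print(round(best_threshold, 2), round(best_score, 4))
```

In the real workflow the score would be the per-class metric (for example iou_score_1H) evaluated on the chosen dataset split. A metric like that is not guaranteed to be unimodal in the threshold, so a search of this kind can settle on a local optimum.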

Test using the optimized thresholds from train/val, plus Test Metadata output (now timestamped)

python3 test_segmentation_model.py --gcp-bucket gs://necstlab-sandbox --dataset-id dataset-small-3class --model-id segmentation-model-small-3class_20200317T025254Z --batch-size 16

batch_size: 16
created_datetime: 20200317T030012Z
current_global_threshold_for_reference: 0.5
dataset_config:
  class_annotation_mapping:
    class_0_annotation_GVs: [100]
    class_1_annotation_GVs: [175]
    class_2_annotation_GVs: [250]
  dataset_split:
    test: [8bit_AS4_S2_P1_L6_2560_1750_2160]
    train: [THIN_REF_S2_P1_L3_2496_1563_2159]
    validation: [THIN_CNT_S2_P1_L4_2334_1578_2159]
  image_cropping: {num_per_image: 1, type: random}
  stack_downsampling: {num_skip_beg_slices: 0, num_skip_end_slices: 0, number_of_images: 100,
    type: linear}
  target_size: [512, 512]
dataset_id: dataset-small-3class
elapsed_minutes: 0.6
gcp_bucket: gs://necstlab-sandbox
git_hash: cf6723bb22ac87c5aba080f92ac63aa12c0b7ddf
model_id: segmentation-model-small-3class_20200317T025254Z
optimized_class_thresholds_used: {class_0: 0.6246117974981072, class_1: 0.9952027292893504,
  class_2: 0.9952027292893504}
threshold_metadata_root_path: null
train_config:
  batch_size: 16
  data_augmentation: {random_90-degree_rotations: true}
  dataset_id: dataset-small-3class
  epochs: 10
  loss: cross_entropy
  model_id_prefix: segmentation-model-small-3class
  optimizer: adam
  segmentation_model:
    model_name: Unet
    model_parameters: {backbone_name: vgg16, encoder_weights: null}
  training_data_shuffle_seed: 1234

Infer using the optimized thresholds from train/val, plus Infer Metadata output

python3 infer_segmentation.py --gcp-bucket gs://necstlab-sandbox --stack-id 8bit_AS4_S2_P1_L6_2560_1750_2160 --model-id segmentation-model-small-3class_20200317T025254Z --image-ids 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif --labels-output False --pad-output False

background_class_index: null
created_datetime: 20200317T035158Z
elapsed_minutes: 2.0
gcp_bucket: gs://necstlab-sandbox
git_hash: cf6723bb22ac87c5aba080f92ac63aa12c0b7ddf
image_ids: 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif
labels_output: false
loaded_optimized_class_thresholds: {class_0: 0.6246117974981072, class_1: 0.9952027292893504,
  class_2: 0.9952027292893504}
model_id: segmentation-model-small-3class_20200317T025254Z
pad_output: false
prediction_thresholds_used: '[0.6246118  0.99520273 0.99520273]'
stack_id: 8bit_AS4_S2_P1_L6_2560_1750_2160
threshold_metadata_root_path: null
user_specified_prediction_thresholds: null
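
To make concrete what prediction_thresholds_used means at inference time, here is a minimal sketch of applying one threshold per class to per-pixel class probabilities. The function name, the nested-list data layout, and the use of >= rather than > are assumptions. The thread does not show how the inference script binarizes its output.

```python
def apply_class_thresholds(pixel_probabilities, thresholds):
    """Binarize per-pixel class probabilities with one threshold per class.

    pixel_probabilities: list of pixels, each a list of per-class probabilities.
    thresholds: list with one threshold per class, in class order.
    Returns a list of pixels, each a list of 0/1 flags per class.
    """
    return [
        [int(prob >= thresh) for prob, thresh in zip(probs, thresholds)]
        for probs in pixel_probabilities
    ]


# Thresholds as reported in the infer metadata above.
thresholds = [0.6246118, 0.99520273, 0.99520273]
pixels = [
    [0.70, 0.20, 0.999],   # class_0 and class_2 pass their thresholds
    [0.50, 0.996, 0.10],   # only class_1 passes
]
masks = apply_class_thresholds(pixels, thresholds)
print(masks)
```

With the global default of 0.5 the second pixel would also have counted as class_0, which is the kind of difference per-class thresholds are meant to capture.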

Fit generator to try different optimization setups and metadata output

python3 fit_segmentation_model_prediction_thresholds.py --gcp-bucket gs://necstlab-sandbox --dataset-directory dataset-small_2cropClass_3class_VarMinPx_final/validation --model-id segmentation-model-small-3class_20200317T025254Z --batch-size 16 --optimizing-class-metric iou_score_1H --dataset-downsample-factor 0.1

batch_size: 16
created_datetime: 20200317T042345Z
dataset_config:
  class_annotation_mapping:
    class_0_annotation_GVs: [100]
    class_1_annotation_GVs: [175]
    class_2_annotation_GVs: [250]
  dataset_split:
    test: [8bit_AS4_S2_P1_L6_2560_1750_2160]
    train: [THIN_REF_S2_P1_L3_2496_1563_2159]
    validation: [THIN_CNT_S2_P1_L4_2334_1578_2159]
  image_cropping: {num_per_image: 1, type: random}
  stack_downsampling: {num_skip_beg_slices: 0, num_skip_end_slices: 0, number_of_images: 100,
    type: linear}
  target_size: &id001 [512, 512]
dataset_directory: dataset-small-3class/test
elapsed_minutes: 2.5
gcp_bucket: gs://necstlab-sandbox
git_hash: cf6723bb22ac87c5aba080f92ac63aa12c0b7ddf
model_id: segmentation-model-small-3class_20200317T025254Z
num_classes: 3
prediction_thresholds_optimized:
  class_0: {fun: 0.9994456424028613, message: Solution found., nfev: 11, status: 0,
    success: true, x: 0.006806242506865075}
  class_1: {fun: 0.9998764211632079, message: Solution found., nfev: 12, status: 0,
    success: true, x: 0.3262379212492639}
  class_2: {fun: 1.0, message: Solution found., nfev: 11, status: 0, success: true,
    x: 0.9952027292893504}
target_size: *id001
threshold_optimization_configuration:
  opt_bounds: [0, 1]
  opt_class_metric: iou_score_1H
  opt_dataset_downsample_factor: 0.15
  opt_dataset_generator: tmp/datasets/dataset-small-3class/test
  opt_method: bounded
  opt_options: {disp: true, maxiter: 1000}
  opt_tol: 0.01
train_config:
  batch_size: 16
  data_augmentation: {random_90-degree_rotations: true}
  dataset_id: dataset-small-3class
  epochs: 10
  loss: cross_entropy
  model_id_prefix: segmentation-model-small-3class
  optimizer: adam
  segmentation_model:
    model_name: Unet
    model_parameters: {backbone_name: vgg16, encoder_weights: null}
  training_data_shuffle_seed: 1234

Test (or infer) using a specified fit-thresholds metadata file (i.e., not the default optimized thresholds from train/val)

python3 test_segmentation_model.py --gcp-bucket gs://necstlab-sandbox --dataset-id dataset-small-3class --model-id segmentation-model-small-3class_20200317T025254Z --batch-size 16 --fit-metadata-root-path fit_thresholds_segmentation-model-small-3class_20200317T025254Z_dataset-small_2cropClass_3class_VarMinPx_final_iou_score_1H/metadata_fit_thresholds_output_20200317T033258Z.yaml

python3 infer_segmentation.py --gcp-bucket gs://necstlab-sandbox --stack-id 8bit_AS4_S2_P1_L6_2560_1750_2160 --model-id segmentation-model-small-3class_20200317T025254Z --image-ids 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif --labels-output False --pad-output False --fit-metadata-root-path fit_thresholds_segmentation-model-small-3class_20200317T025254Z_dataset-small_2cropClass_3class_VarMinPx_final_iou_score_1H/metadata_fit_thresholds_output_20200317T033258Z.yaml

background_class_index: null
created_datetime: 20200317T034910Z
elapsed_minutes: 2.0
gcp_bucket: gs://necstlab-sandbox
git_hash: cf6723bb22ac87c5aba080f92ac63aa12c0b7ddf
image_ids: 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif
labels_output: false
loaded_optimized_class_thresholds: {class_0: 0.039239138539631235, class_1: 0.9952027292893504,
  class_2: 0.4078724984571376}
model_id: segmentation-model-small-3class_20200317T025254Z
pad_output: false
prediction_thresholds_used: '[0.03923914 0.99520273 0.4078725 ]'
stack_id: 8bit_AS4_S2_P1_L6_2560_1750_2160
threshold_metadata_root_path: fit_thresholds_segmentation-model-small-3class_20200317T025254Z_dataset-small_2cropClass_3class_VarMinPx_final_iou_score_1H/metadata_fit_thresholds_output_20200317T033258Z.yaml
user_specified_prediction_thresholds: null

Josh-Joseph commented 4 years ago

suggest changing to one file called model-thresholds_<DATETIME>Z.yaml

thresholds:
  class_0: 0.999891996383667
  class_1: 0.9952027292893504
  class_2: 0.9952027292893504
metadata:
  created_datetime: 20200317T025722Z
  current_global_threshold_for_reference: 0.5
  dataset_config:
    class_annotation_mapping:
      class_0_annotation_GVs: [100]
      class_1_annotation_GVs: [175]
      class_2_annotation_GVs: [250]
    dataset_split:
      test: [8bit_AS4_S2_P1_L6_2560_1750_2160]
      train: [THIN_REF_S2_P1_L3_2496_1563_2159]
      validation: [THIN_CNT_S2_P1_L4_2334_1578_2159]
    image_cropping: {num_per_image: 1, type: random}
    stack_downsa
    ...
  optimizer_output:
    <all output goes here>
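
As a rough illustration of the single-file layout suggested here, the sketch below assembles a filename of the form model-thresholds_<DATETIME>Z.yaml and a dictionary with top-level thresholds and metadata keys. The function name is hypothetical, and taking each class threshold from the optimizer's x field is an assumption. Writing the dictionary out as YAML is left to whatever YAML library the project already uses.

```python
from datetime import datetime, timezone


def build_thresholds_document(optimizer_output, metadata, now=None):
    """Return (filename, document) for a single thresholds file.

    optimizer_output: {class_name: {"x": threshold, "success": bool, ...}}
    metadata: any extra run metadata to store alongside the thresholds.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    document = {
        # Assumption: the trained threshold is the optimizer's "x" value.
        "thresholds": {name: result["x"] for name, result in optimizer_output.items()},
        "metadata": {
            **metadata,
            "created_datetime": stamp,
            "optimizer_output": optimizer_output,
        },
    }
    return f"model-thresholds_{stamp}.yaml", document


example_output = {
    "class_0": {"fun": 0.999891996383667, "success": True, "x": 0.6246117974981072},
    "class_1": {"fun": 1.0, "success": True, "x": 0.9952027292893504},
}
filename, document = build_thresholds_document(
    example_output,
    {"gcp_bucket": "gs://necstlab-sandbox"},
    now=datetime(2020, 3, 17, 2, 57, 22, tzinfo=timezone.utc),
)
print(filename)
```

Keeping thresholds at the top level means test and infer only need to read one small key, while the full optimizer output stays available under metadata for later inspection.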

Josh-Joseph commented 4 years ago

In train thresholds, if the optimization fails for any class, error out.
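
One possible shape for that check, assuming each class's optimizer result is a mapping with the success, message and x fields seen in the metadata above. The function name and the choice of RuntimeError are illustrative only.

```python
def require_all_converged(optimizer_output):
    """Return {class_name: threshold}, or raise if any class failed to optimize."""
    failed = {
        name: result.get("message", "no message")
        for name, result in optimizer_output.items()
        if not result.get("success", False)
    }
    if failed:
        details = "; ".join(f"{name}: {msg}" for name, msg in sorted(failed.items()))
        raise RuntimeError(
            f"Threshold training failed for {len(failed)} class(es): {details}"
        )
    return {name: result["x"] for name, result in optimizer_output.items()}


results = {
    "class0": {"success": True, "message": "Solution found.", "x": 0.055728090000841224},
    "class1": {"success": True, "message": "Solution found.", "x": 0.657649662042785},
}
thresholds = require_all_converged(results)
print(thresholds)
```

Raising before anything is written avoids leaving a thresholds file on disk that mixes converged and non-converged classes.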

Josh-Joseph commented 4 years ago

Remove auto-thresholding from train segmentation model.

In the train workflow:

Add an extra step, train thresholds (and add it to the README).

rak5216 commented 4 years ago

UPDATED SAMPLE RESULTS AFTER CHANGES DONE

Train_thresholds (auto train thresh removed from train model) output:

python3 train_segmentation_model_prediction_thresholds.py --gcp-bucket gs://necstlab-sandbox --dataset-directory dataset-small-3class/test --model-id segmentation-model-small-3class_20200321T180512Z --batch-size 16 --optimizing-class-metric binary_accuracy_tfkeras_1H --dataset-downsample-factor 0.1

metadata:
  batch_size: 16
  created_datetime: 20200321T181145Z
  dataset_config:
    class_annotation_mapping:
      class_0_annotation_GVs: [100]
      class_1_annotation_GVs: [175]
      class_2_annotation_GVs: [250]
    dataset_split:
      test: [8bit_AS4_S2_P1_L6_2560_1750_2160]
      train: [THIN_REF_S2_P1_L3_2496_1563_2159]
      validation: [THIN_CNT_S2_P1_L4_2334_1578_2159]
    image_cropping: {num_per_image: 1, type: random}
    stack_downsampling: {num_skip_beg_slices: 0, num_skip_end_slices: 0, number_of_images: 100,
      type: linear}
    target_size: &id001 [512, 512]
  dataset_directory: dataset-small-3class/test
  elapsed_minutes: 1.5
  gcp_bucket: gs://necstlab-sandbox
  git_hash: d3ed7f5d2047a7cd811ed9e0020b83de30316f93
  model_id: segmentation-model-small-3class_20200321T180512Z
  num_classes: 3
  target_size: *id001
  thresholds_training_configuration:
    opt_bounds: [0, 1]
    opt_class_metric: binary_accuracy_tfkeras_1H
    opt_dataset_downsample_factor: 0.1
    opt_dataset_generator: tmp/datasets/dataset-small-3class/test
    opt_method: bounded
    opt_options: {disp: 3, maxiter: 1000}
    opt_tol: 0.1
  thresholds_training_output:
    class0: {fun: 0.9855868816375732, message: Solution found., nfev: 6, status: 0,
      success: true, x: 0.055728090000841224}
    class1: {fun: 0.0009055137634277344, message: Solution found., nfev: 5, status: 0,
      success: true, x: 0.657649662042785}
    class2: {fun: 1.3828277587890625e-05, message: Solution found., nfev: 5, status: 0,
      success: true, x: 0.657649662042785}
  train_config:
    batch_size: 16
    data_augmentation: {random_90-degree_rotations: true}
    dataset_id: dataset-small-3class
    epochs: 10
    loss: cross_entropy
    model_id_prefix: segmentation-model-small-3class
    optimizer: adam
    segmentation_model:
      model_name: Unet
      model_parameters: {backbone_name: vgg16, encoder_weights: null}
    training_data_shuffle_seed: 1234
trained_prediction_thresholds: {class0: 0.055728090000841224, class1: 0.657649662042785,
  class2: 0.657649662042785}

model-dir

Test (without trained thresh id --> use global default) output:

python3 test_segmentation_model.py --gcp-bucket gs://necstlab-sandbox --dataset-id dataset-small-3class --model-id segmentation-model-small-3class_20200321T180512Z --batch-size 16

batch_size: 16
created_datetime: 20200321T180938Z
dataset_config:
  class_annotation_mapping:
    class_0_annotation_GVs: [100]
    class_1_annotation_GVs: [175]
    class_2_annotation_GVs: [250]
  dataset_split:
    test: [8bit_AS4_S2_P1_L6_2560_1750_2160]
    train: [THIN_REF_S2_P1_L3_2496_1563_2159]
    validation: [THIN_CNT_S2_P1_L4_2334_1578_2159]
  image_cropping: {num_per_image: 1, type: random}
  stack_downsampling: {num_skip_beg_slices: 0, num_skip_end_slices: 0, number_of_images: 100,
    type: linear}
  target_size: [512, 512]
dataset_id: dataset-small-3class
default_global_threshold_for_reference: 0.5
elapsed_minutes: 0.6
gcp_bucket: gs://necstlab-sandbox
git_hash: d3ed7f5d2047a7cd811ed9e0020b83de30316f93
model_id: segmentation-model-small-3class_20200321T180512Z
train_config:
  batch_size: 16
  data_augmentation: {random_90-degree_rotations: true}
  dataset_id: dataset-small-3class
  epochs: 10
  loss: cross_entropy
  model_id_prefix: segmentation-model-small-3class
  optimizer: adam
  segmentation_model:
    model_name: Unet
    model_parameters: {backbone_name: vgg16, encoder_weights: null}
  training_data_shuffle_seed: 1234
trained_class_thresholds_loaded: null
trained_thresholds_id: null

Test (WITH trained thresh id specified) output:

python3 test_segmentation_model.py --gcp-bucket gs://necstlab-sandbox --dataset-id dataset-small-3class --model-id segmentation-model-small-3class_20200321T180512Z --batch-size 16 --trained-thresholds-id model_thresholds_20200321T181016Z.yaml

batch_size: 16
created_datetime: 20200321T181332Z
dataset_config:
  class_annotation_mapping:
    class_0_annotation_GVs: [100]
    class_1_annotation_GVs: [175]
    class_2_annotation_GVs: [250]
  dataset_split:
    test: [8bit_AS4_S2_P1_L6_2560_1750_2160]
    train: [THIN_REF_S2_P1_L3_2496_1563_2159]
    validation: [THIN_CNT_S2_P1_L4_2334_1578_2159]
  image_cropping: {num_per_image: 1, type: random}
  stack_downsampling: {num_skip_beg_slices: 0, num_skip_end_slices: 0, number_of_images: 100,
    type: linear}
  target_size: [512, 512]
dataset_id: dataset-small-3class
default_global_threshold_for_reference: 0.5
elapsed_minutes: 0.6
gcp_bucket: gs://necstlab-sandbox
git_hash: d3ed7f5d2047a7cd811ed9e0020b83de30316f93
model_id: segmentation-model-small-3class_20200321T180512Z
train_config:
  batch_size: 16
  data_augmentation: {random_90-degree_rotations: true}
  dataset_id: dataset-small-3class
  epochs: 10
  loss: cross_entropy
  model_id_prefix: segmentation-model-small-3class
  optimizer: adam
  segmentation_model:
    model_name: Unet
    model_parameters: {backbone_name: vgg16, encoder_weights: null}
  training_data_shuffle_seed: 1234
trained_class_thresholds_loaded: {class0: 0.055728090000841224, class1: 0.657649662042785,
  class2: 0.657649662042785}
trained_thresholds_id: model_thresholds_20200321T181016Z.yaml

Infer (without trained thresh id --> use global default) output:

python3 infer_segmentation.py --gcp-bucket gs://necstlab-sandbox --stack-id 8bit_AS4_S2_P1_L6_2560_1750_2160 --model-id segmentation-model-small-3class_20200321T180512Z --image-ids 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif --labels-output False --pad-output False

{background_class_index: null, created_datetime: 20200321T181723Z, default_global_threshold_for_reference: 0.5,
elapsed_minutes: 2.0, gcp_bucket: 'gs://necstlab-sandbox', git_hash: d3ed7f5d2047a7cd811ed9e0020b83de30316f93,
image_ids: 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif, labels_output: false, model_id: segmentation-model-small-3class_20200321T180512Z,
pad_output: false, prediction_thresholds_used: '[0.5 0.5 0.5]', stack_id: 8bit_AS4_S2_P1_L6_2560_1750_2160,
trained_class_thresholds_loaded: null, trained_thresholds_id: null, user_specified_prediction_thresholds: null}
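
The metadata fields suggest three possible sources for the thresholds actually used: user-specified values, thresholds loaded from a trained-thresholds file, and the global default of 0.5. A minimal sketch of choosing between them follows. The function name and the precedence order (user-specified first, then trained, then default) are assumptions, since the thread only shows the default and trained cases.

```python
DEFAULT_GLOBAL_THRESHOLD = 0.5


def resolve_prediction_thresholds(num_classes, user_specified=None, trained=None,
                                  default=DEFAULT_GLOBAL_THRESHOLD):
    """Return one threshold per class, in class order.

    user_specified: optional list of thresholds given directly by the user.
    trained: optional mapping like {"class0": 0.05, "class1": 0.65, ...}.
    Assumed precedence: user_specified, then trained, then the global default.
    """
    if user_specified is not None:
        thresholds = list(user_specified)
    elif trained is not None:
        thresholds = [trained[f"class{i}"] for i in range(num_classes)]
    else:
        thresholds = [default] * num_classes
    if len(thresholds) != num_classes:
        raise ValueError(
            f"expected {num_classes} thresholds, got {len(thresholds)}"
        )
    return thresholds


print(resolve_prediction_thresholds(3))
print(resolve_prediction_thresholds(
    3, trained={"class0": 0.055728090000841224,
                "class1": 0.657649662042785,
                "class2": 0.657649662042785}))
```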

Infer (WITH trained thresh id specified) output:

python3 infer_segmentation.py --gcp-bucket gs://necstlab-sandbox --stack-id 8bit_AS4_S2_P1_L6_2560_1750_2160 --model-id segmentation-model-small-3class_20200321T180512Z --image-ids 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif --labels-output False --pad-output False --trained-thresholds-id model_thresholds_20200321T181016Z.yaml

background_class_index: null
created_datetime: 20200321T182052Z
default_global_threshold_for_reference: 0.5
elapsed_minutes: 2.0
gcp_bucket: gs://necstlab-sandbox
git_hash: d3ed7f5d2047a7cd811ed9e0020b83de30316f93
image_ids: 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif
labels_output: false
model_id: segmentation-model-small-3class_20200321T180512Z
pad_output: false
prediction_thresholds_used: '[0.05572809 0.65764966 0.65764966]'
stack_id: 8bit_AS4_S2_P1_L6_2560_1750_2160
trained_class_thresholds_loaded: {class0: 0.055728090000841224, class1: 0.657649662042785,
  class2: 0.657649662042785}
trained_thresholds_id: model_thresholds_20200321T181016Z.yaml
user_specified_prediction_thresholds: null

rak5216 commented 4 years ago

@Josh-Joseph ready to review again

Josh-Joseph commented 4 years ago

'[0.05572809 0.65764966 0.65764966]' -> list(prediction_threshold)

(instead of str(prediction_threshold))
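
A small illustration of why the list form is easier to work with: the string form written to the metadata has to be parsed by hand before the values can be reused, whereas a plain list of floats would load from YAML directly. The helper name below is hypothetical.

```python
def parse_legacy_thresholds(text):
    """Parse a string like '[0.05572809 0.65764966 0.65764966]' into floats."""
    return [float(token) for token in text.strip().strip("[]").split()]


legacy = "[0.05572809 0.65764966 0.65764966]"
thresholds = parse_legacy_thresholds(legacy)
print(thresholds)
```

The string form also keeps only the printed precision (0.05572809 rather than 0.055728090000841224), so storing a list avoids a silent loss of digits as well.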

mit-quest / necstlab-damage-segmentation

Rak/fit thresholds per class #61
