mit-quest / necstlab-damage-segmentation

MIT License

Rak/fit thresholds per class #61

Closed rak5216 closed 4 years ago

rak5216 commented 4 years ago

issue #59

rak5216 commented 4 years ago

Input/output for fit_thresholds involves metadata.yaml files.

Sample workflow:

  1. Train with Model Metadata output

    python3 train_segmentation_model.py --gcp-bucket gs://necstlab-sandbox --config-file configs/config_sandbox/train-small-3class.yaml

    created_datetime: 20200317T025722Z
    current_global_threshold_for_reference: 0.5
    dataset_config:
      class_annotation_mapping:
        class_0_annotation_GVs: [100]
        class_1_annotation_GVs: [175]
        class_2_annotation_GVs: [250]
      dataset_split:
        test: [8bit_AS4_S2_P1_L6_2560_1750_2160]
        train: [THIN_REF_S2_P1_L3_2496_1563_2159]
        validation: [THIN_CNT_S2_P1_L4_2334_1578_2159]
      image_cropping: {num_per_image: 1, type: random}
      stack_downsampling: {num_skip_beg_slices: 0, num_skip_end_slices: 0, number_of_images: 100,
        type: linear}
      target_size: &id001 [512, 512]
    elapsed_minutes: 4.7
    gcp_bucket: gs://necstlab-sandbox
    git_hash: cf6723bb22ac87c5aba080f92ac63aa12c0b7ddf
    num_classes: 3
    original_config_filename: configs/config_sandbox/train-small-3class.yaml
    prediction_thresholds_optimized:
      class_0: {fun: 0.999891996383667, message: Solution found., nfev: 9, status: 0,
        success: true, x: 0.6246117974981072}
      class_1: {fun: 1.0, message: Solution found., nfev: 11, status: 0, success: true,
        x: 0.9952027292893504}
      class_2: {fun: 1.0, message: Solution found., nfev: 11, status: 0, success: true,
        x: 0.9952027292893504}
    target_size: *id001
    threshold_optimization_configuration:
      opt_bounds: [0, 1]
      opt_class_metric: iou_score_1H
      opt_dataset_downsample_factor: 1.0
      opt_dataset_generator: tmp/datasets/dataset-small-3class/validation
      opt_method: bounded
      opt_options: {disp: true, maxiter: 1000}
      opt_tol: 0.01
  2. Test using optim thresholds from train/val and Test Metadata output (timestamped now)

    python3 test_segmentation_model.py --gcp-bucket gs://necstlab-sandbox --dataset-id dataset-small-3class --model-id segmentation-model-small-3class_20200317T025254Z --batch-size 16 

    batch_size: 16
    created_datetime: 20200317T030012Z
    current_global_threshold_for_reference: 0.5
    dataset_config:
      class_annotation_mapping:
        class_0_annotation_GVs: [100]
        class_1_annotation_GVs: [175]
        class_2_annotation_GVs: [250]
      dataset_split:
        test: [8bit_AS4_S2_P1_L6_2560_1750_2160]
        train: [THIN_REF_S2_P1_L3_2496_1563_2159]
        validation: [THIN_CNT_S2_P1_L4_2334_1578_2159]
      image_cropping: {num_per_image: 1, type: random}
      stack_downsampling: {num_skip_beg_slices: 0, num_skip_end_slices: 0, number_of_images: 100,
        type: linear}
      target_size: [512, 512]
    dataset_id: dataset-small-3class
    elapsed_minutes: 0.6
    gcp_bucket: gs://necstlab-sandbox
    git_hash: cf6723bb22ac87c5aba080f92ac63aa12c0b7ddf
    model_id: segmentation-model-small-3class_20200317T025254Z
    optimized_class_thresholds_used: {class_0: 0.6246117974981072, class_1: 0.9952027292893504,
      class_2: 0.9952027292893504}
    threshold_metadata_root_path: null
    train_config:
      batch_size: 16
      data_augmentation: {random_90-degree_rotations: true}
      dataset_id: dataset-small-3class
      epochs: 10
      loss: cross_entropy
      model_id_prefix: segmentation-model-small-3class
      optimizer: adam
      segmentation_model:
        model_name: Unet
        model_parameters: {backbone_name: vgg16, encoder_weights: null}
      training_data_shuffle_seed: 1234
  3. Infer using optim thresholds from train/val and Infer Metadata output

    python3 infer_segmentation.py --gcp-bucket gs://necstlab-sandbox --stack-id 8bit_AS4_S2_P1_L6_2560_1750_2160 --model-id segmentation-model-small-3class_20200317T025254Z --image-ids 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif --labels-output False --pad-output False

    background_class_index: null
    created_datetime: 20200317T035158Z
    elapsed_minutes: 2.0
    gcp_bucket: gs://necstlab-sandbox
    git_hash: cf6723bb22ac87c5aba080f92ac63aa12c0b7ddf
    image_ids: 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif
    labels_output: false
    loaded_optimized_class_thresholds: {class_0: 0.6246117974981072, class_1: 0.9952027292893504,
      class_2: 0.9952027292893504}
    model_id: segmentation-model-small-3class_20200317T025254Z
    pad_output: false
    prediction_thresholds_used: '[0.6246118  0.99520273 0.99520273]'
    stack_id: 8bit_AS4_S2_P1_L6_2560_1750_2160
    threshold_metadata_root_path: null
    user_specified_prediction_thresholds: null
  4. Fit generator to try different optimization setups and metadata output

    python3 fit_segmentation_model_prediction_thresholds.py --gcp-bucket gs://necstlab-sandbox --dataset-directory dataset-small_2cropClass_3class_VarMinPx_final/validation --model-id segmentation-model-small-3class_20200317T025254Z --batch-size 16 --optimizing-class-metric iou_score_1H --dataset-downsample-factor 0.1

    batch_size: 16
    created_datetime: 20200317T042345Z
    dataset_config:
      class_annotation_mapping:
        class_0_annotation_GVs: [100]
        class_1_annotation_GVs: [175]
        class_2_annotation_GVs: [250]
      dataset_split:
        test: [8bit_AS4_S2_P1_L6_2560_1750_2160]
        train: [THIN_REF_S2_P1_L3_2496_1563_2159]
        validation: [THIN_CNT_S2_P1_L4_2334_1578_2159]
      image_cropping: {num_per_image: 1, type: random}
      stack_downsampling: {num_skip_beg_slices: 0, num_skip_end_slices: 0, number_of_images: 100,
        type: linear}
      target_size: &id001 [512, 512]
    dataset_directory: dataset-small-3class/test
    elapsed_minutes: 2.5
    gcp_bucket: gs://necstlab-sandbox
    git_hash: cf6723bb22ac87c5aba080f92ac63aa12c0b7ddf
    model_id: segmentation-model-small-3class_20200317T025254Z
    num_classes: 3
    prediction_thresholds_optimized:
      class_0: {fun: 0.9994456424028613, message: Solution found., nfev: 11, status: 0,
        success: true, x: 0.006806242506865075}
      class_1: {fun: 0.9998764211632079, message: Solution found., nfev: 12, status: 0,
        success: true, x: 0.3262379212492639}
      class_2: {fun: 1.0, message: Solution found., nfev: 11, status: 0, success: true,
        x: 0.9952027292893504}
    target_size: *id001
    threshold_optimization_configuration:
      opt_bounds: [0, 1]
      opt_class_metric: iou_score_1H
      opt_dataset_downsample_factor: 0.15
      opt_dataset_generator: tmp/datasets/dataset-small-3class/test
      opt_method: bounded
      opt_options: {disp: true, maxiter: 1000}
      opt_tol: 0.01
    train_config:
      batch_size: 16
      data_augmentation: {random_90-degree_rotations: true}
      dataset_id: dataset-small-3class
      epochs: 10
      loss: cross_entropy
      model_id_prefix: segmentation-model-small-3class
      optimizer: adam
      segmentation_model:
        model_name: Unet
        model_parameters: {backbone_name: vgg16, encoder_weights: null}
      training_data_shuffle_seed: 1234
  5. Test (or infer) using specified fit thresholds metadata (i.e., not the default optim thresholds from train/val)

    python3 test_segmentation_model.py --gcp-bucket gs://necstlab-sandbox --dataset-id dataset-small-3class --model-id segmentation-model-small-3class_20200317T025254Z --batch-size 16 --fit-metadata-root-path fit_thresholds_segmentation-model-small-3class_20200317T025254Z_dataset-small_2cropClass_3class_VarMinPx_final_iou_score_1H/metadata_fit_thresholds_output_20200317T033258Z.yaml
    python3 infer_segmentation.py --gcp-bucket gs://necstlab-sandbox --stack-id 8bit_AS4_S2_P1_L6_2560_1750_2160 --model-id segmentation-model-small-3class_20200317T025254Z --image-ids 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif --labels-output False --pad-output False --fit-metadata-root-path fit_thresholds_segmentation-model-small-3class_20200317T025254Z_dataset-small_2cropClass_3class_VarMinPx_final_iou_score_1H/metadata_fit_thresholds_output_20200317T033258Z.yaml
    background_class_index: null
    created_datetime: 20200317T034910Z
    elapsed_minutes: 2.0
    gcp_bucket: gs://necstlab-sandbox
    git_hash: cf6723bb22ac87c5aba080f92ac63aa12c0b7ddf
    image_ids: 8bit_AS4_S2_P1_L6_2560_1750_2160-2089.tif
    labels_output: false
    loaded_optimized_class_thresholds: {class_0: 0.039239138539631235, class_1: 0.9952027292893504,
      class_2: 0.4078724984571376}
    model_id: segmentation-model-small-3class_20200317T025254Z
    pad_output: false
    prediction_thresholds_used: '[0.03923914 0.99520273 0.4078725 ]'
    stack_id: 8bit_AS4_S2_P1_L6_2560_1750_2160
    threshold_metadata_root_path: fit_thresholds_segmentation-model-small-3class_20200317T025254Z_dataset-small_2cropClass_3class_VarMinPx_final_iou_score_1H/metadata_fit_thresholds_output_20200317T033258Z.yaml
    user_specified_prediction_thresholds: null
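
For readers unfamiliar with what the per-class fit in the workflow above is doing, here is a minimal, dependency-free sketch of a bounded one-dimensional search for the threshold that maximizes a class's IoU. It is an illustration only and not the code in this PR: the names `iou_at_threshold` and `fit_threshold` are hypothetical, the search is a hand-written golden-section search rather than whatever optimizer the repository calls, and the returned `fun` here is simply the IoU reached (the sign convention in the real metadata may differ). The defaults mirror `opt_bounds: [0, 1]`, `opt_tol: 0.01`, and `maxiter: 1000` from the metadata.

```python
import math


def iou_at_threshold(probs, labels, threshold):
    """IoU for one class: mask is (prob >= threshold), labels are 0/1."""
    predicted = [p >= threshold for p in probs]
    intersection = sum(1 for hit, y in zip(predicted, labels) if hit and y)
    union = sum(1 for hit, y in zip(predicted, labels) if hit or y)
    # Convention chosen for this sketch: an empty union counts as a perfect score.
    return 1.0 if union == 0 else intersection / union


def fit_threshold(probs, labels, bounds=(0.0, 1.0), tol=0.01, maxiter=1000):
    """Golden-section search over the threshold, keeping the best point evaluated.

    IoU is piecewise constant in the threshold, so the search is only a
    heuristic; returning the best evaluated point avoids landing just
    outside a plateau.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    lo, hi = bounds
    evaluations = []  # (threshold, iou) pairs, in evaluation order

    def evaluate(threshold):
        score = iou_at_threshold(probs, labels, threshold)
        evaluations.append((threshold, score))
        return score

    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = evaluate(c), evaluate(d)
    while hi - lo > tol and len(evaluations) < maxiter:
        if fc >= fd:  # maximum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = evaluate(c)
        else:  # maximum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = evaluate(d)
    best_x, best_iou = max(evaluations, key=lambda pair: pair[1])
    return {"x": best_x, "fun": best_iou, "nfev": len(evaluations),
            "success": hi - lo <= tol}


# Toy data: any threshold in (0.3, 0.7] separates the two groups perfectly.
probs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]
result = fit_threshold(probs, labels)
```

Running one such fit per class gives one `x` per class, which is the shape of the `prediction_thresholds_optimized` entries shown in the metadata.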
Josh-Joseph commented 4 years ago

Suggest changing to a single file called model-thresholds_<DATETIME>Z.yaml:

thresholds:
  class_0: 0.6246117974981072
  class_1: 0.9952027292893504
  class_2: 0.9952027292893504
metadata:
  created_datetime: 20200317T025722Z
  current_global_threshold_for_reference: 0.5
  dataset_config:
    class_annotation_mapping:
      class_0_annotation_GVs: [100]
      class_1_annotation_GVs: [175]
      class_2_annotation_GVs: [250]
    dataset_split:
      test: [8bit_AS4_S2_P1_L6_2560_1750_2160]
      train: [THIN_REF_S2_P1_L3_2496_1563_2159]
      validation: [THIN_CNT_S2_P1_L4_2334_1578_2159]
    image_cropping: {num_per_image: 1, type: random}
    stack_downsa
  ...
  optimizer_output:
    <all output goes here>
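
A rough sketch of how the suggested single-file layout could be assembled from the per-class optimizer results. This is an assumption-laden illustration, not the PR's code: the helper names `build_thresholds_file` and `thresholds_filename` are hypothetical, each class's `x` is taken as its threshold, and `optimizer_output` is assumed to sit under `metadata`. The sample values are copied from the train metadata earlier in this thread.

```python
def build_thresholds_file(optimizer_output, run_metadata):
    """Return a dict with top-level `thresholds` and `metadata` sections."""
    # Assumption: the fitted threshold for each class is the optimizer's `x`.
    thresholds = {name: result["x"] for name, result in optimizer_output.items()}
    metadata = dict(run_metadata)  # copy so the caller's dict is not mutated
    metadata["optimizer_output"] = optimizer_output
    return {"thresholds": thresholds, "metadata": metadata}


def thresholds_filename(created_datetime):
    """created_datetime already ends in 'Z', e.g. '20200317T025722Z'."""
    return f"model-thresholds_{created_datetime}.yaml"


optimizer_output = {
    "class_0": {"fun": 0.999891996383667, "message": "Solution found.",
                "nfev": 9, "status": 0, "success": True, "x": 0.6246117974981072},
    "class_1": {"fun": 1.0, "message": "Solution found.",
                "nfev": 11, "status": 0, "success": True, "x": 0.9952027292893504},
    "class_2": {"fun": 1.0, "message": "Solution found.",
                "nfev": 11, "status": 0, "success": True, "x": 0.9952027292893504},
}
run_metadata = {
    "created_datetime": "20200317T025722Z",
    "current_global_threshold_for_reference": 0.5,
}
document = build_thresholds_file(optimizer_output, run_metadata)
filename = thresholds_filename(run_metadata["created_datetime"])
# `document` holds only plain Python types, so a YAML library could write it out directly.
```

Keeping `thresholds` as a flat class-to-float mapping means test and infer only need to read one key, while everything needed to reproduce the fit stays under `metadata`.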
Josh-Joseph commented 4 years ago

In train thresholds, error out if the optimization fails for any class.
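
One possible shape for that check, as a sketch only. The function name `require_all_converged` is hypothetical, and it assumes each class's result is a dict carrying the `success` and `message` fields seen in the metadata above.

```python
def require_all_converged(optimizer_output):
    """Raise if any class's threshold optimization did not report success."""
    failed = {
        name: result.get("message", "no message")
        for name, result in optimizer_output.items()
        if not result.get("success", False)
    }
    if failed:
        details = "; ".join(f"{name}: {msg}" for name, msg in sorted(failed.items()))
        raise RuntimeError(f"Threshold optimization failed for {details}")


# Example: class_1 is a made-up failure used only to exercise the check.
example_output = {
    "class_0": {"success": True, "message": "Solution found.", "x": 0.62},
    "class_1": {"success": False, "message": "Maximum iterations reached.", "x": 0.5},
}
try:
    require_all_converged(example_output)
    error_text = None
except RuntimeError as exc:
    error_text = str(exc)
```

Failing before any thresholds file is written would keep a partially fitted set of thresholds from being picked up later by test or infer.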

Josh-Joseph commented 4 years ago

Remove the automatic threshold fitting from train_segmentation_model.py.

In train workflow:

rak5216 commented 4 years ago

UPDATED SAMPLE RESULTS AFTER CHANGES DONE

rak5216 commented 4 years ago

@Josh-Joseph ready to review again

Josh-Joseph commented 4 years ago

Store prediction_thresholds_used as list(prediction_threshold) instead of str(prediction_threshold), so the metadata holds a real YAML list rather than the string '[0.05572809 0.65764966 0.65764966]'.
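
A small sketch of the difference, kept dependency-free. Both helper names are hypothetical. The first converts any sequence of numbers to plain floats; this goes one step past `list(...)`, which on a NumPy array would leave NumPy scalar types in the result. The second is a possible fallback for reading metadata files that already contain the old string form.

```python
def thresholds_as_list(prediction_thresholds):
    """Convert any iterable of numbers into a list of built-in floats."""
    return [float(t) for t in prediction_thresholds]


def parse_legacy_threshold_string(text):
    """Recover values from the old space-separated string form, e.g.
    '[0.05572809 0.65764966 0.65764966]'."""
    return [float(token) for token in text.strip().strip("[]").split()]


# A tuple stands in for the array of per-class thresholds here.
as_list = thresholds_as_list((0.05572809, 0.65764966, 0.65764966))
from_legacy = parse_legacy_threshold_string("[0.05572809 0.65764966 0.65764966]")
```

With a list in the metadata, downstream code can index thresholds by class directly instead of parsing a string.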