Doodleverse / doodleverse_utils

A set of common Doodleverse tools and utilities
MIT License

new Dice and IoU #6 & #7 #8

Closed ebgoldstein closed 2 years ago

ebgoldstein commented 2 years ago

Here are the new Dice and IoU functions.

In Gym, the model.compile() calls need to be adjusted accordingly:

for IoU as metric: iou_multi(NCLASSES)

for Dice as metric: dice_multi(NCLASSES)

for Dice as loss: dice_coef_loss(NCLASSES)

All smoothing (epsilon) terms are set to 10e-6.

This does not yet include loss weighting in the Dice loss. We can add it in this PR or in the next one.
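To make the three roles above concrete, here is a minimal NumPy sketch of how per-class Dice and IoU can be averaged over NCLASSES channels and wrapped in factory functions that take the class count. It is not the doodleverse_utils implementation: the names ending in `_sketch` and the helper `basic_dice_sketch` are hypothetical, the inputs are assumed to be one-hot encoded already, and the loss is assumed to be one minus the mean Dice.

```python
import numpy as np

SMOOTH = 10e-6  # smoothing value quoted in the comment above (numerically 1e-5)


def basic_dice_sketch(y_true, y_pred, smooth=SMOOTH):
    """Dice coefficient for a single class channel (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)


def basic_iou_sketch(y_true, y_pred, smooth=SMOOTH):
    """Intersection over union for a single class channel (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - intersection
    return (intersection + smooth) / (union + smooth)


def dice_multi_sketch(nclasses):
    """Return a metric that averages Dice over `nclasses` one-hot channels."""
    def mean_dice(y_true, y_pred):
        total = 0.0
        for index in range(nclasses):
            total += basic_dice_sketch(y_true[..., index], y_pred[..., index])
        return total / nclasses
    return mean_dice


def iou_multi_sketch(nclasses):
    """Return a metric that averages IoU over `nclasses` one-hot channels."""
    def mean_iou(y_true, y_pred):
        total = 0.0
        for index in range(nclasses):
            total += basic_iou_sketch(y_true[..., index], y_pred[..., index])
        return total / nclasses
    return mean_iou


def dice_coef_loss_sketch(nclasses):
    """Return a loss, assumed here to be 1 - mean Dice."""
    metric = dice_multi_sketch(nclasses)

    def loss(y_true, y_pred):
        return 1.0 - metric(y_true, y_pred)
    return loss


# Assumed shape of the adjusted Keras call, using the real function names
# listed above (not executed in this sketch):
#   model.compile(loss=dice_coef_loss(NCLASSES),
#                 metrics=[iou_multi(NCLASSES), dice_multi(NCLASSES)])
```

The factory pattern is what lets the class count be fixed once at compile time, while Keras still receives a plain two-argument callable taking `(y_true, y_pred)`.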

dbuscombe-usgs commented 2 years ago

Note that this fix will break previous Gym code because the functions have been renamed.

CameronBodine commented 2 years ago

I am attempting to test the new loss functions with 1-band imagery. I fetched the latest updates from main and updated to doodleverse_utils==0.0.5. After running train_model.py, I receive the following error:

(gym) cbodine@filfy-Thelio-Massive:~/PythonRepos/segmentation_gym$ python train_model.py 
2022-10-04 08:27:48.844598: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
/mnt/md0/SynologyDrive/Modeling/Substrate_BOU-LEA_5_remapHard/03_forModelTraining/dataset
/mnt/md0/SynologyDrive/Modeling/Substrate_BOU-LEA_5_remapHard/03_forModelTraining/config/substrate_20221004_v7.json
Using GPU
Using single GPU device
Version:  2.10.0
Eager mode:  True
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
Making new directory for example model outputs: /mnt/md0/SynologyDrive/Modeling/Substrate_BOU-LEA_5_remapHard/03_forModelTraining/modelOut
MODE "all": using all augmented and non-augmented files
2022-10-04 08:28:46.128074: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-04 08:28:46.862641: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14597 MB memory:  -> device: 0, name: Quadro RTX 5000, pci bus id: 0000:65:00.0, compute capability: 7.5
369
221
.....................................
Creating and compiling model ...
.....................................
Training model ...

Epoch 1: LearningRateScheduler setting learning rate to 1e-07.
Epoch 1/100
Traceback (most recent call last):
  File "/home/cbodine/PythonRepos/segmentation_gym/train_model.py", line 721, in <module>
    history = model.fit(train_ds, steps_per_epoch=steps_per_epoch, epochs=MAX_EPOCHS,
  File "/home/cbodine/anaconda3/envs/gym/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_filekmiiz2fb.py", line 15, in tf__train_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
  File "/tmp/__autograph_generated_file0km3dfxo.py", line 27, in tf__mean_iou
    ag__.for_stmt(ag__.converted_call(ag__.ld(range), (ag__.ld(nclasses),), None, fscope), None, loop_body, get_state, set_state, ('iousum',), {'iterate_names': 'index'})
  File "/tmp/__autograph_generated_file0km3dfxo.py", line 25, in loop_body
    iousum += ag__.converted_call(basic_iou, (y_true[:, :, :, index], y_pred[:, :, :, index]), None, fscope)
ValueError: in user code:

    File "/home/cbodine/anaconda3/envs/gym/lib/python3.10/site-packages/keras/engine/training.py", line 1160, in train_function  *
        return step_function(self, iterator)
    File "/home/cbodine/anaconda3/envs/gym/lib/python3.10/site-packages/doodleverse_utils/model_imports.py", line 989, in mean_iou  *
        iousum += basic_iou(y_true[:,:,:,index], y_pred[:,:,:,index])

    ValueError: slice index 4 of dimension 3 out of bounds. for '{{node strided_slice_11}} = StridedSlice[Index=DT_INT32, T=DT_FLOAT, begin_mask=7, ellipsis_mask=0, end_mask=7, new_axis_mask=0, shrink_axis_mask=8](one_hot, strided_slice_11/stack, strided_slice_11/stack_1, strided_slice_11/stack_2)' with input shapes: [?,512,512,4], [4], [4], [4] and with computed input tensors: input[1] = <0 0 0 4>, input[2] = <0 0 0 5>, input[3] = <1 1 1 1>.
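
The final error reports a request for channel index 4 on a tensor of shape [?,512,512,4]. This suggests that the class count passed to the metric was at least 5 while the one-hot tensor had only 4 channels. The thread does not establish where that mismatch comes from. The NumPy sketch below reproduces the same kind of failure on a small array and shows a hypothetical guard (`check_nclasses`, not part of doodleverse_utils) that fails earlier with a more readable message.

```python
import numpy as np


def mean_over_channels(y_true, y_pred, nclasses, smooth=1e-5):
    """Average a per-channel IoU over range(nclasses), mirroring the loop in the traceback."""
    total = 0.0
    for index in range(nclasses):
        t = y_true[..., index]  # fails if index >= number of channels
        p = y_pred[..., index]
        intersection = np.sum(t * p)
        union = np.sum(t) + np.sum(p) - intersection
        total += (intersection + smooth) / (union + smooth)
    return total / nclasses


def check_nclasses(y, nclasses):
    """Hypothetical guard: compare the channel count with the configured class count."""
    channels = y.shape[-1]
    if channels != nclasses:
        raise ValueError(
            f"metric built for {nclasses} classes, but tensor has {channels} channels"
        )


# A small stand-in for the [?,512,512,4] tensor in the log: 4 one-hot channels.
labels = np.zeros((1, 8, 8), dtype=int)
y = np.eye(4)[labels]  # shape (1, 8, 8, 4)

mean_over_channels(y, y, nclasses=4)  # channel count matches, so this works

try:
    mean_over_channels(y, y, nclasses=5)  # asks for channel index 4
except IndexError as err:
    print("mismatch reproduced:", err)
```

Under this reading, the things to compare are the NCLASSES value in the config file and the number of channels the model and the one-hot encoding actually produce.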