NVIDIA / DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

[UNet2D_Industrial/Tensorflow] Which metric does "Accuracy" refer to? #1047

Open notabigfish opened 2 years ago

notabigfish commented 2 years ago

Related to UNet2D_Industrial/Tensorflow

Description

Hi, after evaluating the model, the following metrics are printed to the terminal:

"eval.IoU_THS_0.05"
"eval.IoU_THS_0.125"
"eval.IoU_THS_0.25"
"eval.IoU_THS_0.5"
"eval.IoU_THS_0.75"
"eval.IoU_THS_0.85"
"eval.IoU_THS_0.95"
"eval.IoU_THS_0.99"
"eval.true_positives"
"eval.true_negatives"
"eval.false_positives"
"eval.false_negatives"
"eval.true_positive_rate"
"eval.true_negative_rate"

Which one of these metrics does "Accuracy-FP32" in https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Segmentation/UNet_Industrial#training-accuracy-nvidia-dgx-1-8x-v100-16gb refer to? Thank you.

mmarcinkiewicz commented 2 years ago

Hello @notabigfish! We use eval.IoU_THS_0.99 as the accuracy metric for this model. Do you see any problems with its convergence?
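
For readers unfamiliar with thresholded IoU, here is a minimal sketch of what a metric such as eval.IoU_THS_0.99 conventionally measures: the predicted probabilities are binarized at the threshold, and the intersection over union with the ground-truth mask is computed. This assumes the standard IoU definition. The function name `iou_at_threshold`, the `eps` smoothing term, and the toy data are hypothetical, and the repository's own implementation may differ in detail.

```python
def iou_at_threshold(probs, mask, threshold, eps=1e-5):
    """IoU between a thresholded probability map and a binary mask.

    `probs` and `mask` are flat sequences of equal length (hypothetical
    layout; a real pipeline would operate on image tensors).
    """
    # Binarize predictions: a pixel counts as positive only above the threshold.
    pred = [1 if p > threshold else 0 for p in probs]
    truth = [1 if m > 0.5 else 0 for m in mask]

    intersection = sum(p & t for p, t in zip(pred, truth))
    union = sum(p | t for p, t in zip(pred, truth))

    # eps avoids division by zero when both masks are empty (an assumption).
    return (intersection + eps) / (union + eps)


# Toy example: three defect pixels, one of them predicted with low confidence.
probs = [0.999, 0.995, 0.60, 0.02]
mask = [1, 1, 1, 0]

iou_loose = iou_at_threshold(probs, mask, 0.5)    # all three pixels pass
iou_strict = iou_at_threshold(probs, mask, 0.99)  # the 0.60 pixel is dropped
print(f"IoU@0.5  = {iou_loose:.3f}")
print(f"IoU@0.99 = {iou_strict:.3f}")
```

A strict threshold such as 0.99 only rewards pixels the model is nearly certain about, so under this definition it is more sensitive to small differences between training runs than the lower thresholds are.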

zhangqirui commented 2 years ago

Hello @mmarcinkiewicz, we trained UNet2D six times using Horovod (hvd) on 4 GPUs with bs=4, but the accuracy fluctuates greatly between runs. See the data below.

bash UNet_FP32_4GPU_XLA.sh

eval.IoU_THS_0.99 : 0.971
eval.IoU_THS_0.99 : 0.958
eval.IoU_THS_0.99 : 0.954
eval.IoU_THS_0.99 : 0.976
eval.IoU_THS_0.99 : 0.973
eval.IoU_THS_0.99 : 0.933
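
To put a number on the fluctuation, the six reported values can be summarized with a short script. This is only a sketch using Python's standard `statistics` module on the figures quoted above. The variable names are hypothetical, and six runs is a small sample, so the standard deviation is a rough estimate.

```python
import statistics

# eval.IoU_THS_0.99 from the six runs reported above.
runs = [0.971, 0.958, 0.954, 0.976, 0.973, 0.933]

mean_iou = statistics.mean(runs)
std_iou = statistics.stdev(runs)   # sample standard deviation (n - 1)
spread = max(runs) - min(runs)     # best run minus worst run

print(f"mean  = {mean_iou:.4f}")
print(f"stdev = {std_iou:.4f}")
print(f"range = {spread:.3f} (min {min(runs)}, max {max(runs)})")
```

Reporting the mean together with the spread makes it easier to compare these runs against the single accuracy figure published in the README.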