alexjaw opened this issue 2 years ago
Thanks for reaching out.
But your config does not include these params. Isn't that rather non-traditional for a fine-tuning pipeline.config?
Absolutely, it is. But if you look at the original config file for the MobileDet (CPU) variant I used in the Colab, you'd notice it too does not expose any such parameters. I should have checked this with the library authors at the time, but I didn't. However, @khanhlvg did review the notebook briefly and everything seemed okay at that point.
Regarding your training not converging, I think it has to do with the dataset and the complexity of the model. As far as I know, the MobileDet variant I used in the notebook is simpler than the SSD_MobileNet variant you are using. MobileDet uses Tucker convs, which are likely very suitable for the kind of objects we're dealing with here.
However, if using SSD_MobileNet is a necessity for your project, I would start tweaking the `anchor_generator` parameters in the config file to find a sweet spot.
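For reference, this is roughly what the `ssd_anchor_generator` block in a TF Object Detection API pipeline.config looks like. The values below are the common defaults, not tuned for this dataset; treat them as a starting point for experimentation (e.g., narrowing `min_scale`/`max_scale` or pruning aspect ratios to match the objects' shapes):

```
anchor_generator {
  ssd_anchor_generator {
    num_layers: 6
    min_scale: 0.2
    max_scale: 0.95
    aspect_ratios: 1.0
    aspect_ratios: 2.0
    aspect_ratios: 0.5
    aspect_ratios: 3.0
    aspect_ratios: 0.3333
  }
}
```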
Sorry I could not be more helpful.
Tested a training round (2000 steps) with your config but added
From what I can see, the mAP is comparable with the results obtained with the original config:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.466
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.776
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.457
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.209
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.480
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.372
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.576
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.630
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.625
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.637
...
Loss for final step: 0.9625086
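As context for the table above: each row reports AP/AR at the stated IoU (intersection-over-union) thresholds, and the -1.000 entries simply mean the dataset contains no ground-truth objects in that size bucket. A minimal, self-contained IoU computation for two boxes in `[ymin, xmin, ymax, xmax]` form (an illustrative sketch, not the COCO evaluator's actual code):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as [ymin, xmin, ymax, xmax]."""
    # Coordinates of the intersection rectangle.
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A detection counts as a true positive at the 0.50 threshold, say, only if `iou(pred, gt) >= 0.5` for some unmatched ground-truth box of the same class.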
Results for the apple image: it is detected as a banana. However, the model does detect apples in other images in the test dataset.
{'bounding_box': array([0.16773474, 0.21534976, 0.9148209 , 0.7250782 ], dtype=float32), 'class_id': 1.0, 'score': 0.55632144}
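For anyone reading the dict above: the `bounding_box` values are normalized `[ymin, xmin, ymax, xmax]` coordinates, the usual convention for TFLite object-detection outputs. A quick sketch for converting them to pixel coordinates (the helper name is my own; only the box layout is taken from the output above):

```python
def to_pixels(box, img_width, img_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel
    coordinates (xmin, ymin, xmax, ymax) for drawing."""
    ymin, xmin, ymax, xmax = box
    return (int(xmin * img_width), int(ymin * img_height),
            int(xmax * img_width), int(ymax * img_height))
```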
A question: where does the number 117 come from for the `num_examples` parameter in the config?
mAP results for the original config in the tutorial:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.489
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.785
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.521
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.355
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.502
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.358
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.611
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.627
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.600
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.633
A question: where does the number 117 come from for the `num_examples` parameter in the config?
It's likely the number of evaluation examples, as you can see here: https://github.com/sayakpaul/E2E-Object-Detection-in-TFLite/blob/master/colab_training/Fruits_Detection_Data_Prep.ipynb
The number of .jpg files in the test folder is 60, and 240 in the train folder... (for the dataset from https://www.kaggle.com/mbkinaci/fruit-images-for-object-detection)
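If you want to double-check the split sizes yourself, a quick sketch (the folder paths are assumptions; point them at wherever you extracted the Kaggle dataset):

```python
from pathlib import Path

def count_jpgs(folder):
    """Count .jpg files directly inside a folder (non-recursive)."""
    return sum(1 for _ in Path(folder).glob("*.jpg"))

# Example (paths assumed):
# count_jpgs("fruit-images-for-object-detection/test_zip/test")
# count_jpgs("fruit-images-for-object-detection/train_zip/train")
```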
Thanks for a great tutorial! I have tried to do a similar thing (with the fruits dataset) but with the following models (TF1 OD model zoo): ssdlite_mobilenet_v2_coco_2018_05_09 and ssd_inception_v2_coco_2018_01_28 (also training with TF 1.15.2). However, I have problems getting any decent learning and detection from the training.

When running your Colab, it's remarkable how fast the training converges: already after 1000 steps the IoU has increased substantially and the loss is around 1. I have run 11000 steps with ssdlite_mobilenet_v2_coco_2018_05_09 without coming near such figures.

Another thing that I do not understand is the settings in your pipeline.config file. Normally, for fine-tuning, the config file includes "fine_tune_checkpoint" as well as "fine_tune_checkpoint_type". But your config does not include these params. Isn't that rather non-traditional for a fine-tuning pipeline.config?
my colab