Open johnjunjun7 opened 4 years ago

The result of Tiny YOLOv3 Lite-Mobilenet, which you provide in the README.md: I used the weights you provide, then tested the model on VOC2007 with the command below:

python eval.py --model_path=tiny_yolo3_mobilnet_lite_416_voc.h5 --anchors_path=configs/tiny_yolo3_anchors.txt --classes_path=configs/voc_classes.txt --model_image_size=416x416 --eval_type=VOC --iou_threshold=0.5 --conf_threshold=0.1 --annotation_file=tools/2007_test.txt --save_result

Then I get the result below. This result is much lower than the value you gave (72.60%):

Pascal VOC AP evaluation
aeroplane: AP 0.7189, precision 0.7566, recall 0.7395
bicycle: AP 0.6749, precision 0.7128, recall 0.7018
bird: AP 0.5710, precision 0.7094, recall 0.6146
boat: AP 0.4352, precision 0.5123, recall 0.5293
bottle: AP 0.3755, precision 0.4835, recall 0.4460
bus: AP 0.6701, precision 0.7243, recall 0.6929
car: AP 0.6665, precision 0.6896, recall 0.6963
cat: AP 0.7911, precision 0.7103, recall 0.8216
chair: AP 0.3475, precision 0.5051, recall 0.4367
cow: AP 0.5558, precision 0.6398, recall 0.6261
diningtable: AP 0.5163, precision 0.5737, recall 0.5987
dog: AP 0.7250, precision 0.6420, recall 0.7849
horse: AP 0.7407, precision 0.7348, recall 0.7646
motorbike: AP 0.6854, precision 0.7082, recall 0.7236
person: AP 0.6756, precision 0.7062, recall 0.7128
pottedplant: AP 0.3647, precision 0.4815, recall 0.4848
sheep: AP 0.5517, precision 0.6169, recall 0.6109
sofa: AP 0.5112, precision 0.6150, recall 0.5808
train: AP 0.7596, precision 0.6649, recall 0.8146
tvmonitor: AP 0.6470, precision 0.6005, recall 0.6953
mAP@IoU=0.50 result: 59.918650
mPrec@IoU=0.50 result: 63.936128
mRec@IoU=0.50 result: 65.378780

Is there a problem with the weights you provided, is my configuration wrong, or do I need to continue training from the weights you provided? Looking forward to your answer :)

You get the point :D Actually the mAP value in the list is calculated with the (currently default) conf_threshold=0.001, which gives a higher recall rate but a lower precision rate. This is a common trick for object detectors to get a higher mAP result. But if you want to use the model in practice, a reasonable conf_threshold like 0.1 is necessary.
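The precision/recall trade-off described above can be sketched with a toy example. `precision_recall_at` is a hypothetical helper (not part of eval.py), and the detections are made-up (confidence, is_true_positive) pairs:

```python
def precision_recall_at(detections, num_gt, conf_threshold):
    """Compute precision/recall after discarding detections whose
    confidence is below conf_threshold.
    detections: list of (confidence, is_true_positive) pairs.
    num_gt: number of ground-truth boxes."""
    kept = [(c, is_tp) for c, is_tp in detections if c >= conf_threshold]
    tp = sum(1 for _, is_tp in kept if is_tp)
    fp = len(kept) - tp
    precision = tp / (tp + fp) if kept else 0.0
    recall = tp / num_gt if num_gt else 0.0
    return precision, recall

# Toy detections: high-confidence ones are mostly correct,
# the low-confidence tail is mostly false positives.
dets = [(0.95, True), (0.90, True), (0.60, True), (0.30, False),
        (0.05, True), (0.02, False), (0.01, False)]

# A near-zero threshold keeps everything: recall rises, precision drops.
print(precision_recall_at(dets, num_gt=5, conf_threshold=0.001))
# A practical threshold drops the noisy tail: precision rises, recall drops.
print(precision_recall_at(dets, num_gt=5, conf_threshold=0.1))
```

Because AP integrates precision over the whole recall range, the near-zero threshold can raise mAP even though most of the extra detections are false positives.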
Thank you for your answer, but there are still a few questions:
- I tested with the 0.001 threshold, but the mAP I get still does not reach the value in your list: just 66%.
- What conf_threshold do current papers generally use? For example, YOLO Nano does not give a specific value. So what threshold should I use for comparison?
After the modification, it really works. But I trained with the training set generated by the previous voc_annotation.py. Will there be problems? Maybe I need to retrain?
For the training dataset, either including or not including difficult objects is fine. Maybe you can try both and check which gets the better result.
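As a sketch of what including or excluding difficult objects means when parsing standard Pascal VOC XML annotations (the helper below is hypothetical, not the repo's voc_annotation.py):

```python
import xml.etree.ElementTree as ET

def parse_voc_objects(xml_path, skip_difficult=True):
    """Yield (class_name, xmin, ymin, xmax, ymax) tuples from a Pascal
    VOC annotation file, optionally dropping objects whose <difficult>
    flag is set to 1."""
    root = ET.parse(xml_path).getroot()
    for obj in root.iter('object'):
        difficult = obj.find('difficult')
        if skip_difficult and difficult is not None and difficult.text == '1':
            continue
        box = obj.find('bndbox')
        yield (obj.find('name').text,
               int(float(box.find('xmin').text)),
               int(float(box.find('ymin').text)),
               int(float(box.find('xmax').text)),
               int(float(box.find('ymax').text)))
```

Regenerating the annotation list with `skip_difficult=False` keeps the hard examples in training, which is the trade-off discussed above.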
A quick update: I've tried to train the YOLO Nano model with an ImageNet-pretrained backbone, and have currently got 64.95 mAP. Training is still going on.
I got results similar to yours, mAP=67.8% with conf_threshold=0.001. I am trying to retrain using data without difficult objects :)
Awesome! Can you provide the trained model & ImageNet-pretrained backbone? I can publish them in the next release.
Can you give me your email? I'll send them to you.
david8862@gmail.com. Thanks a lot :)
I am very sorry, but the data in my laboratory cannot be copied out now. :( But our current training results are similar, and the top-5 accuracy of the ImageNet pretraining weights I use is only 74%. If you use weights as accurate as mine, you should get a similar result.
There is still some gap between the accuracy we get now and that in the paper, but considering that the model in the paper is quantized, we should be able to get higher accuracy without quantization. (I see that the eval module in your code uses the VOC2012 method of calculating mAP, which is different from VOC2007. Will that also have some influence?)
It doesn't matter :) Actually I finally got a checkpoint with mAP=69.40 in my training, but haven't found a good solution to convert it to a tflite UINT8 quantized model due to some OP support issues. So the quantized mAP is not verified yet. I'm still working on it.
The VOC12 metric uses the continuous recall values to check the precision on the precision-recall curve, which is different from the 11-point interpolation in VOC07. You can refer to this blog for details. I think nowadays all PascalVOC mAP metrics should follow the new standard.
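The difference between the two schemes can be illustrated on a toy precision-recall curve. This is a sketch of the two AP computations under the usual definitions, not the code from this repo:

```python
import numpy as np

def voc07_ap(recall, precision):
    """VOC2007-style 11-point interpolated AP: sample the max precision
    at recall levels 0.0, 0.1, ..., 1.0 and average the 11 samples."""
    ap = 0.0
    for i in range(11):
        t = i / 10.0
        mask = recall >= t
        p = precision[mask].max() if mask.any() else 0.0
        ap += p / 11.0
    return ap

def voc12_ap(recall, precision):
    """VOC2012-style AP: area under the monotonically decreasing
    envelope of the precision-recall curve, using every recall value."""
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    # Make the precision envelope non-increasing, right to left.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum precision * recall-step over points where recall changes.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))

# Toy curve, recall sorted ascending.
rec = np.array([0.1, 0.2, 0.4, 0.6, 0.8])
pre = np.array([1.0, 0.9, 0.7, 0.6, 0.5])
print(voc07_ap(rec, pre), voc12_ap(rec, pre))  # the two metrics differ
```

On this toy curve the 11-point scheme reports a slightly higher AP than the continuous one, so switching metric alone can shift mAP by a point or two.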