david8862 / keras-YOLOv3-model-set

end-to-end YOLOv4/v3/v2 object detection pipeline, implemented on tf.keras with different technologies
MIT License

Average inference time does not meet expectations, MNN inference results are inconsistent #96

[Open] grapefruitL opened this issue 4 years ago

grapefruitL commented 4 years ago

Hello! Thank you very much for providing such a complete and powerful YOLO project!

My original goal was to compare inference time between keras/tensorflow and MNN on Linux. If that goes well, I will go on to test keras/tensorflow with CUDA and MNN with Vulkan.

If anyone already knows the qualitative conclusion of such a comparison, or quantitative results for a specific hardware environment, please let me know!

The following is my debugging process and output:

Download the yolov3-tiny weights and convert them to an .h5 file:

    wget -O weights/yolov3-tiny.weights https://pjreddie.com/media/files/yolov3-tiny.weights
    python tools/model_converter/convert.py cfg/yolov3-tiny.cfg weights/yolov3-tiny.weights weights/yolov3-tiny.h5
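A quick way to sanity-check the converted file (a minimal sketch, assuming the converter produced a plain tf.keras model with no custom objects required at load time):

```python
# Sanity-check sketch: load the converted model and inspect its heads.
import tensorflow as tf

# compile=False skips the training config, which the converted model doesn't carry
model = tf.keras.models.load_model('weights/yolov3-tiny.h5', compile=False)
model.summary()
# tiny YOLOv3 should expose two prediction heads: 13x13 and 26x26 grids
print([t.shape for t in model.outputs])
```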

Dump the model:

    python yolo.py --dump_model --output_model_file=weights/dump_model.h5

Create the .pb file:

    python tools/model_converter/keras_to_tensorflow.py --input_model='weights/dump_model.h5' --output_model='weights/dump_model.pb'
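To inspect the frozen graph's ops directly (a sketch using TF's compat APIs; this is also one way to compare the .h5 and .pb graphs later, for question 2):

```python
# Sketch: list the ops inside the frozen .pb, e.g. to diff against the .h5 graph.
import tensorflow as tf

graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile('weights/dump_model.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.compat.v1.import_graph_def(graph_def, name='')

for op in graph.get_operations():
    print(op.name, op.type)
```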

Convert the .pb file to an .mnn file:

    ./MNNConvert -f TF --modelFile dump_model.pb --MNNModel dump_model.pb.mnn --bizCode MNN
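For reference, running the .mnn file directly through MNN's Python session API looks roughly like this (a sketch; the exact input shape and layout depend on how MNNConvert handled the TF graph, so treat those as assumptions):

```python
# Sketch: run the converted model with MNN's Python API and inspect output shapes.
import MNN
import numpy as np

interpreter = MNN.Interpreter('dump_model.pb.mnn')
session = interpreter.createSession()
input_tensor = interpreter.getSessionInput(session)

# Dummy 416x416 input; the NCHW/Caffe layout here is an assumption.
data = np.random.rand(1, 3, 416, 416).astype(np.float32)
tmp = MNN.Tensor((1, 3, 416, 416), MNN.Halide_Type_Float,
                 data, MNN.Tensor_DimensionType_Caffe)
input_tensor.copyFrom(tmp)

interpreter.runSession(session)
for name, tensor in interpreter.getSessionOutputAll(session).items():
    print(name, tensor.getShape())
```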

Tests:

Test 1 (weights/yolov3-tiny.h5):

    python tools/evaluation/validate_yolo.py --model_path=weights/yolov3-tiny.h5 --anchors_path=configs/tiny_yolo3_anchors.txt --classes_path=configs/coco_classes.txt --image_file=example/dog.jpg --loop_count=5

    Average Inference time: 47.40839005ms
    PostProcess time: 11.89804077ms
    Found 4 boxes for example/dog.jpg
    Class: car, Score: 0.7165896404239014, Box: (467, 70),(677, 170)
    Class: car, Score: 0.6319490424682854, Box: (534, 96),(621, 157)
    Class: dog, Score: 0.5913762577150123, Box: (132, 189),(369, 514)
    Class: bicycle, Score: 0.5028871783535749, Box: (204, 152),(577, 449)

Test 2 (weights/dump_model.h5):

    python tools/evaluation/validate_yolo.py --model_path=weights/dump_model.h5 --anchors_path=configs/tiny_yolo3_anchors.txt --classes_path=configs/coco_classes.txt --image_file=example/dog.jpg --loop_count=5

    Average Inference time: 48.08216095ms
    PostProcess time: 12.18938828ms
    Found 4 boxes for example/dog.jpg
    Class: car, Score: 0.7165896404239014, Box: (467, 70),(677, 170)
    Class: car, Score: 0.6319490424682854, Box: (534, 96),(621, 157)
    Class: dog, Score: 0.5913762577150123, Box: (132, 189),(369, 514)
    Class: bicycle, Score: 0.5028871783535749, Box: (204, 152),(577, 449)

Test 3 (weights/dump_model.pb):

    python tools/evaluation/validate_yolo.py --model_path=weights/dump_model.pb --anchors_path=configs/tiny_yolo3_anchors.txt --classes_path=configs/coco_classes.txt --image_file=example/dog.jpg --loop_count=5

    Average Inference time: 344.10028458ms
    PostProcess time: 11.88158989ms
    Found 4 boxes for example/dog.jpg
    Class: car, Score: 0.7165896404239014, Box: (467, 70),(677, 170)
    Class: car, Score: 0.6319490424682854, Box: (534, 96),(621, 157)
    Class: dog, Score: 0.5913762577150123, Box: (132, 189),(369, 514)
    Class: bicycle, Score: 0.5028871783535749, Box: (204, 152),(577, 449)

Test 4 (weights/dump_model.pb.mnn):

    python tools/evaluation/validate_yolo.py --model_path=weights/dump_model.pb.mnn --anchors_path=configs/tiny_yolo3_anchors.txt --classes_path=configs/coco_classes.txt --image_file=example/dog.jpg --loop_count=5

    Average Inference time: 139.23211098ms
    output tensor name: prediction_13/BiasAdd, shape: (1, 255, 13, 13)
    output tensor name: prediction_26/BiasAdd, shape: (1, 255, 26, 26)
    PostProcess time: 13.67926598ms
    Found 0 boxes for example/dog.jpg

My questions are:

  1. Why didn't the model dump operation increase inference speed? (compare test 1 and test 2)
  2. Why did inference slow down after converting to the .pb model? (compare test 2 and test 3)
  3. Comparing tests 1, 2 and 4, can we conclude that MNN's processing speed is not as good as keras?
  4. Why does the MNN model fail to detect anything? (the MNN model did not find any objects in test 4)
david8862 commented 4 years ago

@grapefruitL

  1. Model dump is for stripping out the loss layer used during training, so it wouldn't change inference speed.
  2. No idea about the root cause. You may need to check the .h5 and .pb graphs.
  3. Yes, based on these results, but it may also be impacted by your HW environment or your TF/MNN versions.
  4. It should be an MNN bug in some OP's inference. (One layout-related thing to rule out first is sketched after this list.)
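One detail worth ruling out before blaming an OP bug (an observation from the test 4 log, not a confirmed root cause): the MNN outputs are reported as (1, 255, 13, 13), i.e. channels-first, while the keras heads are channels-last (1, 13, 13, 255). If the postprocess code assumes channels-last, the MNN output would need a transpose before decoding:

```python
# Assumption: the decoder expects NHWC, but MNN returned NCHW per the test 4 log.
import numpy as np

def to_nhwc(pred):
    """(batch, channels, height, width) -> (batch, height, width, channels)"""
    return np.transpose(pred, (0, 2, 3, 1))

# nchw_out shape (1, 255, 13, 13)  ->  to_nhwc(nchw_out) shape (1, 13, 13, 255)
```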
grapefruitL commented 4 years ago

Thanks for the reply. I checked the .h5 and .pb graphs with TensorBoard and they are the same. I'm trying to find the reason at the model definition level.
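Since the graphs look identical, one way to push further is to diff the raw head outputs numerically on the same preprocessed input (a sketch; the MNN side is elided here and would reuse the session code above):

```python
# Sketch: numeric comparison of raw head outputs between backends.
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model('weights/dump_model.h5', compile=False)
x = np.random.rand(1, 416, 416, 3).astype(np.float32)  # feed the SAME array to both backends

keras_outs = model.predict(x)  # list of arrays, one per head
# mnn_outs = ...  # same heads from MNN, transposed to NHWC if needed
# for k, m in zip(keras_outs, mnn_outs):
#     print(np.abs(k - m).max())  # a large max-diff flags the first diverging head
```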