hhk7734 / tensorflow-yolov4

YOLOv4 Implemented in Tensorflow 2.

How to improve tiny performance #23

Closed: ankandrew closed this issue 3 years ago

ankandrew commented 4 years ago

Hello, first thanks for your work.

I am running YOLOv4-tiny but the performance is not very good. I am timing the predict function, and the results are worse than I expected. I created this Colab to replicate the results.

My best result is 42.7 ms per inference (about 23 FPS) with batch=1. How can I get better results?

Thanks

hhk7734 commented 4 years ago

I modified part of your Colab script as shown below, then ran it.

import numpy as np

input_data = np.random.random(size=(1, 608, 608, 3))
%timeit -n 100 yolo.model.predict(input_data)
100 loops, best of 3: 38.6 ms per loop

And I removed the line below: https://github.com/hhk7734/tensorflow-yolov4/blob/0e73c11ebdfc22f6776b49f6d4dd53415592b585/py_src/yolov4/model/yolov4.py#L128

Then I ran it again.

%timeit -n 100 yolo.model.predict(input_data)
100 loops, best of 3: 37.7 ms per loop

So, of the 42.7 ms, 37.7 ms is the execution time of the model itself.

hhk7734 commented 4 years ago

After checking the performance of hunglc007's yolov4, PyTorch-yolov4, darknet-yolov4, etc. under the same conditions, I should figure out whether this library is slow or the cost is language-dependent, and then find optimization points.

But I have a lot of work these days, so the test keeps getting delayed. :(

ankandrew commented 4 years ago

> I modified part of your Colab script as shown below, then ran it.

Yes, now I see that I was timing YOLOv4.predict (which includes preprocessing) instead of timing the model.predict function directly.

> After checking the performance of hunglc007's yolov4, PyTorch-yolov4, darknet-yolov4, etc. under the same conditions, I should figure out whether this library is slow or the cost is language-dependent, and then find optimization points.

This is true; I did notice about 4x the FPS with hunglc007's implementation (100 FPS compared to ~25 FPS). I will try to compare them and comment back if I have time :)
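
For reference, a minimal sketch of the two timings (assuming `yolo` is a loaded YOLOv4 instance whose predict takes a raw frame; shapes and values are illustrative):

import numpy as np

# Full pipeline: preprocessing (resize/normalize) plus inference.
frame = np.random.randint(0, 255, size=(480, 640, 3), dtype=np.uint8)
%timeit -n 100 yolo.predict(frame)

# Model only: pre-resized, normalized input with a batch dimension.
input_data = np.random.random(size=(1, 608, 608, 3)).astype(np.float32)
%timeit -n 100 yolo.model.predict(input_data)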

ankandrew commented 4 years ago

@hhk7734 Hi,

I noticed that doing inference with tf.keras predict is very slow; I don't know why. I tried exporting the Keras model with tf.saved_model.save and got better results (no preprocessing, just inference with batch=1).

Now I get 131 FPS (7.7 ms) instead of 25 FPS (39.9 ms).

Here is a Colab to demonstrate this. I copied hunglc007's style of doing inference (not using tf.keras predict).
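
For reference, a minimal sketch of the export-then-infer approach, assuming `yolo.model` is the built Keras model (the path is illustrative; "serving_default" is TensorFlow's default signature name):

import numpy as np
import tensorflow as tf

# Export the Keras model as a SavedModel.
tf.saved_model.save(yolo.model, "./yolov4_saved_model")

# Load it back and grab the default serving signature.
loaded = tf.saved_model.load("./yolov4_saved_model")
infer = loaded.signatures["serving_default"]

input_data = tf.constant(np.random.random(size=(1, 608, 608, 3)).astype(np.float32))

# Calling the concrete function directly skips the per-call overhead
# of Keras predict() (callback hooks, data-adapter setup, etc.).
outputs = infer(input_data)
%timeit -n 100 infer(input_data)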

BonVant commented 3 years ago

I think you don't necessarily need to use saved_model. Decorating with tf.function and using the __call__ method of your Keras model should give you similar results. My understanding is that it creates a frozen inference graph (Python-independent) for a given input signature on the first call, which speeds up the following ones. The difference with saved_model is that you actually create the graph when saving.

@tf.function
def test_predict(x):
    return yolo.model(x, training=False)

https://www.tensorflow.org/guide/function
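
A quick usage sketch of that decorator (the input shape is illustrative):

import numpy as np
import tensorflow as tf

input_data = tf.constant(np.random.random(size=(1, 608, 608, 3)).astype(np.float32))
test_predict(input_data)  # first call traces and builds the graph (slow)
test_predict(input_data)  # later calls with the same input signature reuse the cached graph (fast)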

ankandrew commented 3 years ago

@BonVant Thanks, that way is much cleaner than exporting the model. I noticed that using the tf.function decorator is slightly slower, though (a difference of about 5 FPS). I updated the Colab with the results.

hhk7734 commented 3 years ago

Commit 1149d7645a15c7600388fdc3af708b00505922af: this is really good :) Thanks @ankandrew @BonVant