Have you calibrated the graph? In case you haven't, see this link (near the end of the article).
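For reference, the calibration flow from that article looks roughly like the sketch below. It assumes the `tf.contrib.tensorrt` API from TF 1.x; the file names, node names, and parameter values are placeholders, not taken from your graph:

```python
import numpy as np
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# Load the frozen FP32 graph ("frozen_yolov3.pb" is a placeholder name).
with tf.gfile.GFile("frozen_yolov3.pb", "rb") as f:
    frozen_graph = tf.GraphDef()
    frozen_graph.ParseFromString(f.read())

# 1) Build a calibration graph; INT8 mode inserts calibration nodes.
calib_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=["output_boxes"],            # placeholder output node name
    max_batch_size=1,
    max_workspace_size_bytes=1 << 30,
    precision_mode="INT8")

# 2) Run the calibration graph on representative images so TensorRT can
#    collect activation ranges (a few hundred images is typical).
calib_images = np.load("calib_images.npy")  # placeholder: (N, 416, 416, 3) preprocessed frames
with tf.Graph().as_default():
    out = tf.import_graph_def(calib_graph, return_elements=["output_boxes:0"])
    inp = tf.get_default_graph().get_tensor_by_name("import/inputs:0")
    with tf.Session() as sess:
        for img in calib_images:
            sess.run(out, feed_dict={inp: img[np.newaxis]})

# 3) Convert the calibrated graph into the final INT8 inference graph
#    and serialize it for deployment.
int8_graph = trt.calib_graph_to_infer_graph(calib_graph)
with tf.gfile.GFile("yolov3_trt_int8.pb", "wb") as f:
    f.write(int8_graph.SerializeToString())
```

If steps 2 and 3 are skipped, the graph from step 1 is still just a calibration graph, which typically runs much slower than a plain FP16/FP32 graph; that may be why an "INT8" graph appears slower than FP16.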
Thank you so much. Will give it a shot and update here :)
Okay, I checked out the link. I will prepare a dataset for calibration. Meanwhile, you set the max batch size in the create_inference_graph() method. How do we use this batch size during inference?
Never mind, I figured it out. I froze the graph again with the input tensor shape set to [None, 416, 416, 3]. This allows for batch inference.
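In case anyone else hits this, here is a rough sketch of what the batch inference ends up looking like. The file and tensor names below are placeholders for the ones in your own frozen/optimized graph:

```python
import numpy as np
import tensorflow as tf

def load_frozen_graph(pb_path):
    """Import a frozen (or TF-TRT optimized) GraphDef into a new tf.Graph."""
    with tf.gfile.GFile(pb_path, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="")
    return graph

graph = load_frozen_graph("yolov3_trt.pb")                 # placeholder file name
input_t = graph.get_tensor_by_name("inputs:0")             # shape [None, 416, 416, 3]
output_t = graph.get_tensor_by_name("output_boxes:0")      # placeholder output name

with tf.Session(graph=graph) as sess:
    # Stack N preprocessed frames into a single (N, 416, 416, 3) batch;
    # N should stay <= the max_batch_size used when building the TRT graph.
    batch = np.stack([np.zeros((416, 416, 3), np.float32) for _ in range(4)])
    detections = sess.run(output_t, feed_dict={input_t: batch})
```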
So I tried using INT8 instead of FP16 for optimizing YOLOv3. Instead of getting a speedup, it was taking 1200+ ms per image.
My environment:
Ubuntu 18.10
Python 3.7.1
CUDA 10.0
cuDNN 7.5.0
TensorFlow-GPU 1.13.1
TensorRT 5.0.2.6
GTX 1070