Jupanlee opened this issue 4 years ago
I found that the TensorRT model is slower than the normal SavedModel when I use combined NMS.
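(For readers unfamiliar with the term: "combined NMS" here presumably refers to TensorFlow's fused `tf.image.combined_non_max_suppression` op, which runs non-max suppression over all classes in one call. A minimal sketch with made-up shapes and thresholds, not the exact values EfficientDet uses:

```python
import tensorflow as tf

# Toy inputs: batch of 1, 100 candidate boxes, class-agnostic boxes (q=1),
# and scores for 90 classes; shapes follow the combined_non_max_suppression API.
boxes = tf.random.uniform([1, 100, 1, 4])
scores = tf.random.uniform([1, 100, 90])

nmsed_boxes, nmsed_scores, nmsed_classes, valid_detections = (
    tf.image.combined_non_max_suppression(
        boxes,
        scores,
        max_output_size_per_class=100,  # assumed limits, for illustration only
        max_total_size=100,
        iou_threshold=0.5,
        score_threshold=0.01))
```
)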
Looks like TensorRT is slightly faster on my Titan V GPU for full models (both using FP32).
I added instructions here: https://github.com/google/automl/tree/master/efficientdet#3-export-savedmodel-frozen-graph-tensort-models-or-tflite
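For context, the conversion behind those instructions goes through TF-TRT. A minimal sketch, assuming TF 2.x built with TensorRT support; the directory names are placeholders, not the exact paths the linked instructions use:

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert an existing SavedModel into a TensorRT-optimized SavedModel (FP32).
params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP32)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='saved_model_dir',      # placeholder input path
    conversion_params=params)
converter.convert()
converter.save('tensorrt_saved_model_dir')        # placeholder output path
```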
Sorry, there are some bugs in the conversion.
It looks like TensorRT indeed speeds things up a little. Here are the runs on a V100:
```
$ python model_inspect.py --runmode=bm
Per batch inference time: 0.010065060993656515
FPS: 99.3535956344674

$ python model_inspect.py --runmode=bm --tensorrt=FP32
Per batch inference time: 0.007458997704088688
FPS: 134.0662699831433
```
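For anyone who wants to reproduce numbers like these outside `model_inspect.py`, a rough timing loop over a loaded SavedModel might look like the sketch below. The path, signature key, input shape, and warm-up/iteration counts are all assumptions, not what the repo's benchmark actually does:

```python
import time
import tensorflow as tf

model = tf.saved_model.load('tensorrt_saved_model_dir')  # placeholder path
infer = model.signatures['serving_default']              # assumed signature key
images = tf.random.uniform([1, 512, 512, 3])             # assumed input shape

# Warm up so one-time costs (TensorRT engine builds, tracing) don't skew timing.
for _ in range(10):
    infer(images)

n = 100
start = time.perf_counter()
for _ in range(n):
    infer(images)
per_batch = (time.perf_counter() - start) / n
print(f'Per batch inference time: {per_batch}')
print(f'FPS: {1.0 / per_batch}')
```

Note that the first call after conversion can be much slower than steady state, since TF-TRT may build engines lazily; that is why the warm-up loop matters when comparing against the non-TensorRT SavedModel.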
I might not be using TensorRT in the right way. @itsliupeng seems to get better results with TensorRT: https://github.com/google/automl/pull/299