Closed: ForestWang closed this issue 2 years ago

The paper reports an inference latency of 37 ms on a V100 with FP16. Was that tested with TensorRT, or just Python inference? And how about the speed with preprocessing and NMS postprocessing included? Thanks very much!
It's Python inference. We are learning to port the model to TensorRT ourselves. You can check this superb piece of work for some quick benchmarks in TensorRT. I believe he used a parser to parse our net, and since EfficientNet(Det) was fully parsed years ago, I doubt there's any performance drop compared to rebuilding HybridNets with the TensorRT API.
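For anyone who wants to try the parser route, here is a minimal sketch: export the PyTorch model to ONNX, then let TensorRT's ONNX parser build an FP16 engine. The torch.hub entry point, the file names, and the 1x3x384x640 input shape are assumptions on my part, not verified against the repo:

```python
# Minimal sketch of the parser route. Assumptions: the torch.hub entry point,
# file names, and the 1x3x384x640 input shape.
import torch
import tensorrt as trt

# 1) Export the PyTorch model to ONNX with a dummy input.
model = torch.hub.load('datvuthanh/hybridnets', 'hybridnets', pretrained=True)
model.eval()
dummy = torch.randn(1, 3, 384, 640)
torch.onnx.export(model, dummy, 'hybridnets.onnx', opset_version=11)

# 2) Parse the ONNX graph and build a serialized FP16 engine.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open('hybridnets.onnx', 'rb') as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError('ONNX parse failed')

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # match the paper's FP16 setting
with open('hybridnets_fp16.engine', 'wb') as f:
    f.write(builder.build_serialized_network(network, config))
```

From the command line, `trtexec --onnx=hybridnets.onnx --fp16` should do roughly the same thing and print latency numbers as a bonus.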
Unfortunately, we only benchmarked inference time. Nevertheless, just as with other one-stage object detectors, pre- and postprocessing should only take a fraction of a full pass.
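If you want to quantify that fraction yourself, here is a rough sketch that times the forward pass and an NMS call separately with CUDA events. Again, the torch.hub entry point and the 384x640 input are assumptions, and torchvision's generic NMS is only a stand-in for our actual postprocessing:

```python
# Rough per-stage timing with CUDA events. Assumptions: the torch.hub entry
# point, 384x640 input, and torchvision's generic NMS as a stand-in for the
# repo's actual postprocessing.
import torch
import torchvision

model = torch.hub.load('datvuthanh/hybridnets', 'hybridnets', pretrained=True)
model = model.half().cuda().eval()
img = torch.rand(1, 3, 384, 640, device='cuda', dtype=torch.half)

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

with torch.no_grad():
    for _ in range(10):      # warm-up so lazy initialization doesn't skew results
        model(img)
    torch.cuda.synchronize()

    start.record()
    model(img)               # forward pass only, mirroring the paper's number
    end.record()
    torch.cuda.synchronize()
    print(f'forward: {start.elapsed_time(end):.2f} ms')

    # Generic NMS on dummy detections, just to get an order of magnitude.
    boxes = torch.rand(1000, 4, device='cuda') * 640
    boxes[:, 2:] += boxes[:, :2]   # ensure x2 > x1 and y2 > y1
    scores = torch.rand(1000, device='cuda')
    start.record()
    torchvision.ops.nms(boxes, scores, iou_threshold=0.5)
    end.record()
    torch.cuda.synchronize()
    print(f'nms: {start.elapsed_time(end):.2f} ms')
```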