I'm using the Jupyter example to create TRT-optimized graphs for use in my own projects. I take the TensorRT-converted graph, write it out to a .pb file, and then load that file back in and run inference. However, the runtimes I measure this way are about 3x greater than the runtimes reported by the notebook. Either the notebook is reporting incorrect times, or reconstructing the graph from the file somehow produces a different graph that is slower than the original. Has anyone been able to reproduce this issue?
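For reference, here is a minimal sketch of the save/load round trip I'm doing, using the TF 1.x GraphDef API. Note the assumptions: `trt_graph` stands for the GraphDef returned by the TRT conversion step in the notebook, `batch` is a prepared input batch, and the tensor names `"input:0"` / `"output:0"` are placeholders for my model's actual node names.

```python
import time
import tensorflow as tf

# Assumed: trt_graph is the TensorRT-converted GraphDef from the notebook,
# and batch is a numpy array shaped for the model's input.

# --- Save: serialize the converted GraphDef to disk ---
with tf.gfile.GFile("trt_model.pb", "wb") as f:
    f.write(trt_graph.SerializeToString())

# --- Load: parse the .pb back into a fresh graph ---
graph_def = tf.GraphDef()
with tf.gfile.GFile("trt_model.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")
    # "input:0" / "output:0" are placeholder names; substitute your model's.
    inp = graph.get_tensor_by_name("input:0")
    out = graph.get_tensor_by_name("output:0")

with tf.Session(graph=graph) as sess:
    sess.run(out, feed_dict={inp: batch})  # warm-up run, excluded from timing
    start = time.time()
    sess.run(out, feed_dict={inp: batch})
    print("inference time: %.4f s" % (time.time() - start))
```

The first `sess.run` is a warm-up so that one-time setup cost isn't counted, yet even with that excluded I still see the ~3x gap versus the notebook's reported numbers.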