noahzn opened 3 days ago
Hi @noahzn, your old engine/profile might not be reused by TRT EP if the current inference parameters, cache name, environment variables, or hardware environment change.
Here's more info about engine reusability: https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#trt_engine_cache_enable
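For reference, a minimal sketch of enabling engine caching through TRT EP provider options (the model path and cache directory are placeholders):

```python
import onnxruntime as ort

# Placeholder paths; adjust for your model and cache location.
trt_options = {
    "trt_engine_cache_enable": True,        # serialize built engines/profiles to disk
    "trt_engine_cache_path": "./trt_cache", # directory the engine/profile files are written to
}

sess = ort.InferenceSession(
    "model.onnx",
    providers=[
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",
    ],
)
```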
I wonder: if you replace your old engine/profile with the newly generated ones, is that new engine reused, or does yet another engine need to be generated?
@yf711 Thanks for your reply!
My networks do keypoint detection and matching. I think the issue is that we cannot guarantee the same number of keypoints will be extracted from both images. I have warmed up the networks with about 10k pairs of images, but new engines are still generated for some pairs. The old engines are still being used, I think, because inference is indeed faster.
What can I do in this case? Will `trt_profile_min_shapes` and `trt_profile_max_shapes` help? I tried setting them for the input dimensions, but it's not enough; I get:

Following input(s) has no associated shape profiles provided: /Reshape_3_output_0,/norm/Div_output_0,/Resize_output_0,/Unsqueeze_18_output_0,/NonZero_output_0

Maybe some intermediate layers also need to be given dimension ranges?
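To make the question concrete, here is a sketch of how I understand the explicit profile ranges are passed as provider options; the input names and dimension ranges below are hypothetical and would need to match the model's actual dynamic inputs:

```python
import onnxruntime as ort

# Hypothetical input names ("image", "keypoints") and shape ranges; replace them
# with the dynamic-shaped inputs of the actual keypoint-matching model.
trt_options = {
    "trt_engine_cache_enable": True,
    "trt_engine_cache_path": "./trt_cache",
    # Format: "input_name:dim1xdim2x...,other_input:dims"
    "trt_profile_min_shapes": "image:1x1x480x640,keypoints:1x1x2",
    "trt_profile_opt_shapes": "image:1x1x480x640,keypoints:1x1024x2",
    "trt_profile_max_shapes": "image:1x1x480x640,keypoints:1x4096x2",
}

sess = ort.InferenceSession(
    "matcher.onnx",
    providers=[("TensorrtExecutionProvider", trt_options), "CUDAExecutionProvider"],
)
```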
I have already generated some TRT cache files when running inference on my ONNX model with the TensorRT Execution Provider. Then, for online testing of my model, I set `so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL`, but it seems that new caches are still generated. I only want to reuse the old cache without generating new ones. How can I do that? Thanks in advance!
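For context, a minimal sketch of the setup described above (model path and cache directory are placeholders, assuming the cache from the earlier run lives in `./trt_cache`):

```python
import onnxruntime as ort

# Assumed layout: the engines/profiles from the earlier run live in ./trt_cache.
so = ort.SessionOptions()
so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL

trt_options = {
    "trt_engine_cache_enable": True,
    "trt_engine_cache_path": "./trt_cache",  # same directory as the warm-up run
}

sess = ort.InferenceSession(
    "model.onnx",
    sess_options=so,
    providers=[("TensorrtExecutionProvider", trt_options), "CUDAExecutionProvider"],
)
```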