microsoft / onnxruntime


[TensorRT EP] Use TRT/CUDA/ORT version from runtime instead of build time to generate hash value #22921

Open · chilo-ms opened 8 hours ago

chilo-ms commented 8 hours ago

Use the TensorRT, CUDA, and ORT versions fetched at runtime, rather than at build time, to compute the hash value that determines the engine cache name.

The old approach captured these versions at compile/build time, which can cause problems in some cases. For example, TRT EP records the TensorRT version that we, or users, built against at compile time, but users can switch to a different TensorRT version at runtime. Because TRT EP keeps checking the "fixed" build-time version rather than the version it is actually running with, it can end up using an incompatible TRT engine cache.
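For illustration, here is a minimal sketch (not the actual EP code; `std::hash` stands in for whatever hashing scheme the EP really uses) of querying the loaded TensorRT, CUDA, and ORT versions at runtime and folding them into a cache-name hash:

```cpp
// Sketch only: fetch library versions at runtime rather than relying on
// compile-time macros, then fold them into the hash used to name the cache.
#include <cstdint>
#include <functional>
#include <iostream>
#include <string>

#include <NvInfer.h>             // getInferLibVersion()
#include <cuda_runtime_api.h>    // cudaRuntimeGetVersion()
#include <onnxruntime_c_api.h>   // OrtGetApiBase()->GetVersionString()

std::string RuntimeVersionTag() {
  // TensorRT version of the library actually loaded, e.g. 100300 for 10.3.0.
  const int32_t trt_version = getInferLibVersion();

  // CUDA runtime version actually loaded, e.g. 12040 for 12.4.
  int cuda_version = 0;
  cudaRuntimeGetVersion(&cuda_version);

  // Version string of the loaded ONNX Runtime, e.g. "1.20.0".
  const char* ort_version = OrtGetApiBase()->GetVersionString();

  return std::to_string(trt_version) + "_" +
         std::to_string(cuda_version) + "_" + ort_version;
}

int main() {
  // Hash the combined tag; in the EP this value becomes part of the engine
  // cache name, so switching library versions invalidates stale caches.
  const std::string tag = RuntimeVersionTag();
  const size_t cache_hash = std::hash<std::string>{}(tag);
  std::cout << "version tag: " << tag << ", cache hash: " << cache_hash << "\n";
  return 0;
}
```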

See the GitHub issue here: https://github.com/microsoft/onnxruntime/issues/22382#issuecomment-2404140754