TRT 8 doesn't support the INT64 and DOUBLE data types.
TRT 10 doesn't support the DOUBLE data type.
Therefore, the TRT EP internally needs to convert INT64 to INT32 and DOUBLE to FLOAT, which requires the cuda::Impl_Cast function.
The implementation is copied from the CUDA EP.
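
The conversions described above are element-wise casts. As a rough illustration only, the host-side C++ sketch below shows the per-element operation. The helper name CastElements is hypothetical and is not part of the TRT EP or CUDA EP; the actual cuda::Impl_Cast runs on the GPU, and its signature is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side helper for illustration (not the ORT API).
// Converts every element of `input` from type In to type Out with
// static_cast, which is the per-element operation a cast kernel performs.
template <typename In, typename Out>
std::vector<Out> CastElements(const std::vector<In>& input) {
  std::vector<Out> output;
  output.reserve(input.size());
  for (const In& value : input) {
    output.push_back(static_cast<Out>(value));
  }
  return output;
}
```

For example, CastElements<int64_t, int32_t> would narrow INT64 data to INT32, and CastElements<double, float> would narrow DOUBLE data to FLOAT. Both conversions are narrowing: INT64 values outside the INT32 range are not preserved, and DOUBLE values may lose precision.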