The Triton Inference Server provides an optimized cloud and edge inferencing solution.
floating point exception with Triton version 24.07 when loading tensorrt_llm backend models #7556
Closed
janpetrov closed 2 months ago
Please see https://github.com/triton-inference-server/tensorrtllm_backend/issues/579
The issue appears to be specific to the tensorrt_llm backend in combination with Triton Server version 24.07.
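For context on what "loading tensorrt_llm backend models" involves: each model in the Triton model repository carries a `config.pbtxt` that names its backend. The sketch below is a minimal, hypothetical example (model name, token, and shapes are placeholders, not taken from this issue); the backend string `tensorrtllm` is how the TensorRT-LLM backend is commonly declared, but check your backend's own documentation for the exact fields it requires.

```
# config.pbtxt -- minimal hypothetical sketch for a TensorRT-LLM model
name: "my_tensorrt_llm_model"     # placeholder model name
backend: "tensorrtllm"            # selects the TensorRT-LLM backend
max_batch_size: 8

input [
  {
    name: "input_ids"             # placeholder tensor name
    data_type: TYPE_INT32
    dims: [ -1 ]
  }
]
output [
  {
    name: "output_ids"            # placeholder tensor name
    data_type: TYPE_INT32
    dims: [ -1 ]
  }
]
```

If the server crashes with a floating point exception while loading such a model (as reported here for 24.07), the config itself is usually not at fault; the crash happens inside the backend's initialization, which is why the report points to the tensorrtllm_backend issue tracker.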