-
**Is your feature request related to a problem? Please describe.**
No command-line argument currently makes the `tritonserver` binary display version info and exit (without actually starting the s…
-
### Describe the issue
I'm using NVIDIA Triton to run inference on various detection models with onnxruntime. This has always worked fine, but once I upgraded from version 1.13.1 of the…
-
I followed the instructions at
https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/docs/baichuan.md
to run Baichuan2-7b-Chat.
But for exactly the same engine, the outputs are …
-
```
{
name: "OUT_CUM_LOG_PROBS"
data_type: TYPE_FP32
dims: [ -1 ]
},
{
name: "OUT_OUTPUT_LOG_PROBS"
data_type: TYPE_FP32
dims: [ -1, -1 ]
}
```
I get the o…
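
For reference, here is a minimal sketch of how the `dims` entries in the config excerpt above can be read: in a Triton model configuration, `-1` marks a variable-size dimension, so any length is accepted on that axis. The helper name `shape_matches` and the `OUTPUT_DIMS` table are hypothetical and exist only for illustration; the sketch also ignores any implicit batch dimension Triton may add when `max_batch_size` is greater than zero.

```python
# Hypothetical helper: checks a concrete tensor shape against config dims.
# Assumption: -1 means "any size on this axis", as in Triton model configs.
def shape_matches(dims, shape):
    """Return True if `shape` is allowed by `dims` (-1 is a wildcard)."""
    if len(dims) != len(shape):
        return False  # rank must match exactly
    return all(d == -1 or d == s for d, s in zip(dims, shape))


# Dims copied from the config excerpt; the dict itself is illustrative only.
OUTPUT_DIMS = {
    "OUT_CUM_LOG_PROBS": [-1],
    "OUT_OUTPUT_LOG_PROBS": [-1, -1],
}

if __name__ == "__main__":
    # A 1-D tensor of any length fits OUT_CUM_LOG_PROBS.
    print(shape_matches(OUTPUT_DIMS["OUT_CUM_LOG_PROBS"], (4,)))
    # A 1-D tensor does not fit the 2-D OUT_OUTPUT_LOG_PROBS.
    print(shape_matches(OUTPUT_DIMS["OUT_OUTPUT_LOG_PROBS"], (4,)))
```

A check like this only confirms that the rank and any fixed sizes agree with the declared outputs; it says nothing about whether the values themselves are correct.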
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is there an existing ans…
-
### System Info
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang v…
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue y…
-
**Is your feature request related to a problem? Please describe.**
When writing the `model.py` file for a Python backend model, it is very difficult to correctly use `triton_python_backend_utils` (ak…
-
**Is your feature request related to a problem? Please describe.**
(This is a high-level thought and a feature request; I will update this thread if I can gather more specific data.)
1. Currently, …
-
As mentioned in the [FastDeploy serving deployment guide](https://github.com/PaddlePaddle/FastDeploy/blob/develop/serving/README_CN.md), [PaddleDetection](https://github.com/PaddlePaddle/FastDeploy/blob/develop/examples/vision/detection/…