-
Hello,
I have trained a model in mmsegmentation (PointRend).
I can run inference with this model using JIT inference, but when I send an inference request to the Triton Inference Server, I get an error.
…
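For reference, a minimal sketch of what a KServe v2 HTTP inference request to Triton looks like, built with only the standard library. The model name `pointrend`, input name `INPUT__0`, datatype, and shape below are placeholders (Triton's PyTorch/TorchScript backend conventionally names tensors `INPUT__0`/`OUTPUT__0`, but they must match your model's `config.pbtxt`):

```python
import json

def build_infer_request(model_name, input_name, data, shape, datatype="FP32"):
    """Return (url_path, json_body) for a POST to Triton's v2 infer endpoint.

    The body follows the KServe v2 inference protocol: a list of input
    tensors, each with a name, shape, datatype, and flattened data.
    """
    body = {
        "inputs": [
            {
                "name": input_name,
                "shape": shape,
                "datatype": datatype,
                "data": data,  # row-major flattened values
            }
        ]
    }
    return f"/v2/models/{model_name}/infer", json.dumps(body)

# Hypothetical 1x3x2x2 float input for a model served as "pointrend".
path, payload = build_infer_request("pointrend", "INPUT__0", [0.0] * 12, [1, 3, 2, 2])
print(path)  # /v2/models/pointrend/infer
```

Comparing the shape/datatype in such a request against the model's `config.pbtxt` is often the quickest way to diagnose an inference error.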
-
Hello, I'm curious whether sglang can already be used as a backend for NVIDIA's Triton Inference Server.
Amazing work with the library btw, love it!
-
```
(app-py3.10) (base) apple@mac funasr_server % poetry add triton@2.2.0
Updating dependencies
Resolving dependencies... (3.6s)
Package operations: 1 install, 0 updates, 0 removals
- Ins…
-
**Description**
I have been trying to build Triton Core from source on Windows 10 using the commands mentioned in the README file for Triton Core at https://github.com/triton-inference-server/co…
-
**Description**
While building from source, the build fails when tensorrt_llm backend is chosen.
**Triton Information**
What version of Triton are you using? r24.04
Are you using the Triton co…
-
## Describe the bug
I cannot expose Triton metrics in my deployment: I put the ports description in the Pod.v1 spec and use the Triton implementation, but the metrics port is not recognized.
Triton serv…
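For context, a minimal sketch of how the container ports are usually declared, assuming Triton's default ports (8000 HTTP, 8001 gRPC, 8002 metrics) and a placeholder image tag:

```yaml
# Hypothetical Deployment fragment; image tag and names are placeholders.
containers:
  - name: tritonserver
    image: nvcr.io/nvidia/tritonserver:24.04-py3
    ports:
      - name: http
        containerPort: 8000
      - name: grpc
        containerPort: 8001
      - name: metrics
        containerPort: 8002
```

Note that declaring `containerPort` is informational in Kubernetes; for Prometheus to scrape the metrics, a Service (or scrape annotations/ServiceMonitor, depending on your setup) typically has to expose port 8002 as well.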
-
**Description**
I ran a benchmark of Meta-Llama-3-8B-Instruct on 8×RTX 4090,
![image](https://github.com/triton-inference-server/server/assets/68674291/1a0fd341-8d8f-4893-973c-ed1ed3b74aca)
when r…
-
I used the image nvcr.io/nvidia/tritonserver:23.09-py3-min to compile and install Triton. The com…
-
I hit the same problem on 21.11 and 21.12: it works with a single model or a couple of models, but Triton never releases them.
Ensemble model: Python backend(cpu) + onnx model(GPU)…
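For readers unfamiliar with the setup described above, a hypothetical `config.pbtxt` for such an ensemble might look like the following sketch; every model, tensor, and map name here is a placeholder, not taken from the report:

```
# Hypothetical ensemble chaining a Python pre-processing model (CPU)
# into an ONNX model (GPU). Names and dims are illustrative only.
name: "my_ensemble"
platform: "ensemble"
max_batch_size: 8
input [
  { name: "RAW_INPUT" data_type: TYPE_FP32 dims: [ -1 ] }
]
output [
  { name: "FINAL_OUTPUT" data_type: TYPE_FP32 dims: [ -1 ] }
]
ensemble_scheduling {
  step [
    {
      model_name: "preprocess_py"
      model_version: -1
      input_map { key: "INPUT0" value: "RAW_INPUT" }
      output_map { key: "OUTPUT0" value: "preprocessed" }
    },
    {
      model_name: "onnx_model"
      model_version: -1
      input_map { key: "input" value: "preprocessed" }
      output_map { key: "output" value: "FINAL_OUTPUT" }
    }
  ]
}
```

An ensemble only references its composing models; Triton keeps those models loaded as long as the ensemble itself is loaded, which is worth keeping in mind when reasoning about models that appear never to be released.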
-
The server seems to start up fine, judging by the following log:
```
I1212 03:29:51.067415 37860 server.cc:674]
+----------------+---------+--------+
| Model | Version | Status |
+----------------+---…