-
Trying under Windows here (adding to CogVideoX as per your demo script).
```
File "D:\CogVideoX\CogVideo\venv\lib\site-packages\triton\runtime\build.py", line 52, in _build
raise RuntimeErr…
```
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…
-
**Description**
After compiling the Triton server against libraries installed via vcpkg, those compiled libraries cause symbol conflicts that make the client build fail to link…
-
I installed tensorrtllm_backend in the following way:
1. `docker pull nvcr.io/nvidia/tritonserver:23.12-trtllm-python-py3`
2. `docker run -v /data2/share/:/data/ -v /mnt/sdb/benchmark/xiangrui:/root…
-
### System Info
I am working on the benchmarking suite in vLLM team, and now trying to run TensorRT-LLM for comparison. I am relying on this github repo (https://github.com/neuralmagic/tensorrt-demo)…
-
### System Info
- arch: x86-64
- GPU: RTX 3070
- Docker image: nvcr.io/nvidia/tritonserver:24.01-trtllm-python-py3
- TensorRT-LLM backend tag: 0.7.2
- TensorRT-LLM tag: 0.7.1 (80bc07510ac4ddf13c0d76ad2…
-
Scenario:
* I am hosting PaddleOCR in Triton server via the Python backend.
* I packed PaddleOCR and all of its dependencies into a tar.gz file following this instruction:
https://github.com/tri…
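
For context, a packed environment like this is normally wired up through the Python backend's `EXECUTION_ENV_PATH` parameter in the model's `config.pbtxt`. A minimal sketch (the model name and archive filename here are illustrative, not taken from the issue):
```
name: "paddleocr"
backend: "python"
parameters: {
  key: "EXECUTION_ENV_PATH"
  value: { string_value: "$$TRITON_MODEL_DIRECTORY/paddleocr_env.tar.gz" }
}
```
`$$TRITON_MODEL_DIRECTORY` resolves to the model's own directory inside the repository, so the archive can ship alongside the model files.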
-
@Tabrizian In [this](https://github.com/triton-inference-server/client/blob/main/src/python/examples/simple_grpc_infer_client.py) example (line 131) they pass a python dict in the `headers` arg of `t…
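
As background for the question: HTTP headers are naturally a dict, while gRPC metadata is a sequence of `(key, value)` pairs whose keys must be lowercase ASCII, so a client library that accepts a dict has to translate it. A minimal sketch of that translation (this is a hypothetical helper for illustration, not the tritonclient implementation):

```python
def dict_to_grpc_metadata(headers):
    """Convert an HTTP-style header dict into gRPC metadata.

    gRPC metadata is a sequence of (key, value) tuples, and keys
    must be lowercase ASCII. Hypothetical helper, for illustration only.
    """
    return tuple((key.lower(), str(value)) for key, value in headers.items())


# Example: an HTTP-style dict becomes lowercase-keyed metadata tuples.
metadata = dict_to_grpc_metadata({"Authorization": "Bearer abc123"})
```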
-
**Description**
I have a 5 steps ensemble pipeline for triton.
* 3 steps are torchscript artifacts
* 2 steps are tensorrt compiled models
In the pbtxt files I have:
```
instance_group [{ kind: KIN…
```
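
For reference, a complete `instance_group` stanza in a Triton `config.pbtxt` typically looks like the sketch below (the count and GPU index are illustrative values, not taken from the issue):
```
instance_group [
  {
    kind: KIND_GPU
    count: 1
    gpus: [ 0 ]
  }
]
```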
-
![image](https://github.com/user-attachments/assets/b2fbbab3-1cc8-4160-b446-b7e09b8089e7)
any suggestions?
CPU: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz (8 cores / 16 threads)
I…