triton-server Search Results

1000+ results for triton-server

triton-inference-server/server #5393

add --version info option to tritonserver

**Is your feature request related to a problem? Please describe.** No command line argument can currently cause `tritonserver` binary to display version info and exit (without actually starting the s…

mirekphd updated 1 year ago
2
microsoft/onnxruntime #18743

Memory allocation failures due to incorrect requested buffer…

### Describe the issue I'm using NVIDIA Triton to perform inference on various detection models using the onnxruntime and this has always worked fine, but once I upgraded from version 1.13.1 of the…

OvervCW updated 7 months ago
4
triton-inference-server/tensorrtllm_backend #256

How to make the server call tensorrt_llm/examples/run.py?

I've followed the instruction https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/docs/baichuan.md to run Baichuan2-7b-Chat. But for exactly the same engine, the outputs are …

shil3754 updated 5 months ago
4
triton-inference-server/tensorrtllm_backend #291

About CUM_LOG_PROBS and OUTPUT_LOG_PROBS

```
{
  name: "OUT_CUM_LOG_PROBS"
  data_type: TYPE_FP32
  dims: [ -1 ]
},
{
  name: "OUT_OUTPUT_LOG_PROBS"
  data_type: TYPE_FP32
  dims: [ -1, -1 ]
}
```
I get the o…

callmezhangchenchenokay updated 5 months ago
6
netease-youdao/QAnything #8

[BUG] qanything-container-local fails on startup

### Is there an existing issue / discussion for this? - [X] I have searched the existing issues / discussions ### Is there an existing ans…

xzkxzk12301230 updated 8 months ago
6
NVIDIA/TensorRT-LLM #2020

[Lookahead] UNAVAILABLE: Internal: unexpected error when cre…

### System Info PyTorch version: 2.3.1+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 OS: Ubuntu 22.04.4 LTS (x86_64) GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 Clang v…

deepindeed2022 updated 2 days ago
2
sgl-project/sglang #1030

[Bug] OOM for concurrent long requests

### Checklist - [X] 1. I have searched related issues but cannot get the expected help. - [X] 2. The bug has not been fixed in the latest version. - [X] 3. Please note that if the bug-related issue y…

hahmad2008 updated 1 month ago
7
triton-inference-server/server #5961

Allow introspection and static analysis of `pb_utils` (Pytho…

**Is your feature request related to a problem? Please describe.** When writing the `model.py` file for a Python backend model, it is very difficult to correctly use `triton_python_backend_utils` (ak…

ClaytonJY updated 1 month ago
4
triton-inference-server/server #5236

[RFC] Provide an option to start any backend out-of-proc to …

**Is your feature request related to a problem? Please describe.** (This is a high-level thought and a feature request, I will update this thread if I can gather more specific data) 1. Currently, …

nikhil-sk updated 1 year ago
3
PaddlePaddle/FastDeploy #2136

How to choose a Triton backend

As mentioned in [FastDeploy serving deployment](https://github.com/PaddlePaddle/FastDeploy/blob/develop/serving/README_CN.md), [PaddleDetection](https://github.com/PaddlePaddle/FastDeploy/blob/develop/examples/vision/detection/…

firedent updated 7 months ago
1
