triton-inference-server / server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License
8.37k stars · 1.49k forks
Issues
#7825  How to free multiple gpu memory (1120475708, opened 5 hours ago, 1 comment)
#7824  Suggestion on optimizing inference when model output size is large. (zmy1116, opened 13 hours ago, 0 comments)
#7823  Unknown TensorRT-LLM model endpoint when using --model-namespacing=true (MatteoPagliani, opened 20 hours ago, 0 comments)
#7822  Lock grpcio version (mc-nv, closed 17 hours ago, 0 comments)
#7821  ci: modifying stat count for `L0_server_status` (#7820) (mc-nv, closed 21 hours ago, 0 comments)
#7820  ci: modifying stat count for `L0_server_status` (KrishnanPrash, closed 1 day ago, 0 comments)
#7819  fix: Default max tokens to None for OpenAI frontend. (thealmightygrant, opened 2 days ago, 0 comments)
#7818  Triton Server Utilizes Only One GPU Despite Two GPUs Available on Node (jmarchel7bulls, opened 2 days ago, 0 comments)
#7817  Can triton server support trace_id generator config? (stknight43, opened 2 days ago, 0 comments)
#7816  test: Increase measurement-interval (krishung5, closed 2 days ago, 0 comments)
#7814  fix: Fix L0_input_validation (#7800) (krishung5, closed 2 days ago, 0 comments)
#7813  Model Analyzer Fails to Connect to Triton Server ([StatusCode.UNAVAILABLE] failed to connect to all addresses) (goudemaoningsir, opened 3 days ago, 0 comments)
#7812  build: Support RHEL ORT TensorRT Execution Provider (fpetrini15, closed 2 days ago, 0 comments)
#7811  build: Update OpenVINO model generation version (yinggeh, opened 3 days ago, 0 comments)
#7809  failed to allocate pinned system memory: no pinned memory pool, falling back to non-pinned system memory (IceHowe, closed 3 days ago, 1 comment)
#7808  Removing maven installation as it causes side package installation (mc-nv, closed 3 days ago, 0 comments)
#7807  docs: Re-structure User Guides for Discoverability (statiraju, opened 5 days ago, 0 comments)
#7806  Change torch versions (mc-nv, closed 3 days ago, 0 comments)
#7805  test: Fix L0_model_update test (krishung5, closed 6 days ago, 1 comment)
#7804  InferenceResponse error code is lost in Python BLS (ShuaiShao93, opened 6 days ago, 1 comment)
#7803  test: Follow up PR for L0_dyna_implicit_state. Fix error message for L0_response_cache test (krishung5, closed 6 days ago, 1 comment)
#7802  fix: Fix L0_onnx_execution_provider (yinggeh, closed 6 days ago, 0 comments)
#7801  test: Fix L0_dyna_implicit_state--base (krishung5, closed 6 days ago, 1 comment)
#7800  fix: Fix L0_input_validation (pskiran1, closed 3 days ago, 0 comments)
#7798  Error about driver version compatibility (GLW1215, opened 1 week ago, 2 comments)
#7797  Update model generation scenario (#7793) (mc-nv, closed 1 week ago, 0 comments)
#7796  Problems with the response of the OpenAI-Compatible Frontend for Triton Inference Server (DimadonDL, opened 1 week ago, 4 comments)
#7795  Triton server receives Signal (11) when tracing is enabled with no sampling (or a small sampling rate) (nicomeg-pr, opened 1 week ago, 5 comments)
#7794  ensemble multi-GPU (xiazi-yu, opened 1 week ago, 2 comments)
#7793  Update model generation scenario (mc-nv, closed 1 week ago, 0 comments)
#7792  Has anyone seen accuracy degradation when converting the yolov8n.pt model to TorchScript and ONNX and running inference on Triton Server or Deepytorch Inference? (JackonLiu, opened 1 week ago, 0 comments)
#7791  test: Fix tests for ubuntu 24.04. upgrade (krishung5, closed 1 week ago, 0 comments)
#7789  test: Fix L0_backend_python for Ubuntu 24.04 base (kthui, closed 1 week ago, 0 comments)
#7788  test: RHEL Filesystem Tests (fpetrini15, closed 1 week ago, 0 comments)
#7787  fix: Resolve integer overflow in Load API file decoding (pskiran1, closed 9 hours ago, 0 comments)
#7786  Triton x vLLM backend GPU selection issue (Tedyang2003, opened 1 week ago, 2 comments)
#7785  Update ONNX version for generated models (mc-nv, closed 1 week ago, 0 comments)
#7784  tritonserver is 40x slower than `TensorRT-LLM/examples/run.py` (ShuaiShao93, closed 1 week ago, 1 comment)
#7783  Enable support for Ubuntu 24.04 (mc-nv, closed 1 week ago, 0 comments)
#7781  Update README banner (mc-nv, closed 1 week ago, 0 comments)
#7780  Update README and versions for 2.52.0 / 24.11 (mc-nv, closed 1 week ago, 0 comments)
#7779  test: OpenAI frontend invalid chat tokenizer network issue WAR (kthui, closed 1 week ago, 1 comment)
#7778  Constrained Decoding with Python backend and BLS (MatteoPagliani, closed 21 hours ago, 4 comments)
#7777  Example of using Ragged Batching with FasterTransformer / TRT-LLM for zero-padding BERT inference ("continuous batching") (vadimkantorov, opened 2 weeks ago, 0 comments)
#7776  Unpredictability in Sequence batching (arun-oai, opened 2 weeks ago, 0 comments)
#7775  feat: Adding RestrictedFeatures Support to the Python Frontend Bindings (KrishnanPrash, opened 2 weeks ago, 0 comments)
#7774  Dynamic batching from bls not working. (gerasim13, closed 2 weeks ago, 1 comment)
#7771  Update 'main' to track development of 2.53.0 / 24.12 (mc-nv, closed 2 weeks ago, 0 comments)
#7770  fix: Skip copyrights check for "expected" files in L0_model_config (yinggeh, closed 2 weeks ago, 0 comments)
#7769  fix: Adding copyright support for `.pyi` files (KrishnanPrash, closed 2 weeks ago, 0 comments)