triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License
8.39k stars, 1.49k forks
issues
Newest
Build: Updating to allow passing DOCKER_GPU_ARGS at model generation
#7566
pvijayakrish
closed
3 months ago
0
Release: Update NGC versions post-24.08 release
#7565
pvijayakrish
closed
3 months ago
0
Release: Update README for r24.08
#7564
pvijayakrish
closed
3 months ago
0
docs: Add tensorrtllm_backend into doc generation
#7563
krishung5
closed
3 months ago
0
[ERROR] No available memory for the cache blocks.
#7562
TheNha
opened
3 months ago
0
feat: OpenAI Compatible Frontend
#7561
rmccorm4
closed
1 month ago
7
test: Load new model version should not reload loaded existing model …
#7560
mc-nv
closed
3 months ago
0
ci: Raise Documentation Generation Errors
#7559
fpetrini15
closed
3 months ago
1
How is the order determined for loading a model onto a specific device?
#7558
mhbassel
closed
2 months ago
5
floating point exception with Triton version 24.07 when loading tensorrt_llm backend models
#7556
janpetrov
closed
2 months ago
1
feat: Add GRPC error codes to GRPC streaming if enabled by user. (#7499)
#7555
mc-nv
closed
3 months ago
0
Intermittent `L0_decoupled_grpc_error` crash fixed. (#7552)
#7554
mc-nv
closed
3 months ago
0
test: Load new model version should not reload loaded existing model …
#7553
kthui
closed
3 months ago
1
Intermittent `L0_decoupled_grpc_error` crash fixed.
#7552
indrajit96
closed
3 months ago
0
Build Triton and Backends On Windows
#7551
mhbassel
closed
3 months ago
4
Can't load custom backend shared library from s3 (24.07)
#7550
gerasim13
opened
3 months ago
2
tritonserver preload trt plugin got warning message and many core files : Failed to compile generated PTX with ptxas. Falling back to compilation by driver.
#7549
LinGeLin
opened
3 months ago
0
low performance at large concurrent requests
#7548
seyunchoi
opened
3 months ago
5
Encounter `Stub process is not healthy` only with kserve pod
#7547
thechaos16
closed
3 months ago
1
feat: Add vLLM counter metrics access through Triton (#7493)
#7546
mc-nv
closed
3 months ago
0
test: Add python backend tests for the new histogram metric (#7540)
#7545
mc-nv
closed
3 months ago
0
Build: Update Vllm version for 24.08
#7544
pvijayakrish
closed
3 months ago
0
[feature request] C# / .NET bindings for in-proc C-API and in-proc wrapper's C++-API
#7543
vadimkantorov
opened
3 months ago
3
Indrajit r24.08 cp
#7542
indrajit96
closed
3 months ago
1
Inconsistent prediction results using onnx backend with tensorrt enabled
#7541
fangpings
opened
3 months ago
0
test: Add python backend tests for the new histogram metric
#7540
yinggeh
closed
3 months ago
2
Build: Upgrading vLLM version for 24.08 release
#7539
pvijayakrish
closed
3 months ago
0
Build: Upgrade vLLM version for 24.08 release
#7538
pvijayakrish
closed
3 months ago
0
docs: Load new model version should not reload loaded existing model version(s)
#7537
kthui
closed
3 months ago
1
build: RHEL8 EA2 Backends
#7535
fpetrini15
closed
3 months ago
2
Discrepancy in Inference Timing between trtexec and Triton Server(TensorRT backend) with gRPC Communication for YOLOV8
#7533
twotwoiscute
closed
3 months ago
1
Support request cancellation on timeout for sync grpc client
#7532
ShuaiShao93
opened
3 months ago
0
Failed to stat file model.onxx while using conda-pack in configs
#7531
Spectra456
opened
3 months ago
1
Support passing variables in config.pbtxt
#7530
riZZZhik
opened
3 months ago
0
docs: Triton TRT-LLM user guide
#7529
krishung5
closed
3 months ago
0
vllm backend - UNAVAILABLE: Internal: ModuleNotFoundError: No module named 'numpy'
#7528
dhanushSB96
closed
3 months ago
1
test: Load new model version should not reload loaded existing model version(s)
#7527
kthui
closed
3 months ago
0
How to send the byte or string data in array in perf analyzer
#7526
Kanupriyagoyal
opened
3 months ago
3
test: Test histogram metric
#7525
yinggeh
closed
3 months ago
0
build: RHEL8 PyTorch Backend
#7524
fpetrini15
closed
3 months ago
0
ValidateBytesInputs() check failed in Big Endian Machines
#7523
Hemaprasannakc
opened
3 months ago
2
CI/Build: Pre-Release Changes for 24.08
#7522
pvijayakrish
closed
3 months ago
1
24.08 Changes
#7521
pvijayakrish
closed
3 months ago
0
How to use StopStream when use AsyncStreamInfer?
#7520
tricky61
opened
3 months ago
0
build: RHEL 8 Compatibility
#7519
nv-kmcgill53
closed
3 months ago
0
triton need api docs like vllm fastapi docs
#7518
kinglion811
opened
3 months ago
1
Stateful decoupled bls model: malloc_consolidate(): unaligned fastbin chunk detected
#7517
007durgesh219
opened
3 months ago
0
High GPU memory use
#7516
cile98
opened
3 months ago
0
SSLEOFError when result from async_infer is not available in http client
#7515
briedel
opened
3 months ago
0
Docker build of Triton Server r24.07 on Ubuntu 22.04/Arm fails
#7513
goetzrieger
opened
3 months ago
6