triton-inference-server / tensorrtllm_backend · The Triton TensorRT-LLM Backend · Apache License 2.0 · 578 stars · 80 forks
Issues
#513 · Accumulation of tokens while beam_width > 1 · wxsms · opened 9 hours ago · 0 comments
#512 · Update TensorRT-LLM backend · kaiyux · closed 1 day ago · 0 comments
#511 · Exception when disabling "inflight_fused_batching" · TheCodeWrangler · opened 5 days ago · 0 comments
#510 · How to solve the problem of errors when loading qwen1.5-7B (using two GPUs) and llama3-8B (using two GPUs) models simultaneously using tritonserver? · ChengShuting · opened 5 days ago · 3 comments
#509 · 3rd Tritonserver fails to respond · njaramish · opened 6 days ago · 0 comments
#508 · Assertion failed: Invalid tensor name: decoder_input_lengths · HowardChenRV · opened 6 days ago · 0 comments
#507 · how to deploy multimodal model like llava on triton server? tensorrtllm_backend does not support multimodal · lss15151161 · closed 2 days ago · 1 comment
#506 · Key 'lora_config' not found · LanceB57 · opened 6 days ago · 0 comments
#505 · how to set `ignore_eos` when benchmark TensorRT LLM · zhyncs · opened 1 week ago · 1 comment
#504 · Update TensorRT-LLM backend · Shixiaowei02 · closed 1 week ago · 0 comments
#503 · No Text Output when hosted in Triton server · Adevils · opened 1 week ago · 0 comments
#502 · "error":"Unable to parse 'data': Shape does not match true shape of 'data' field" · ljm565 · closed 6 days ago · 1 comment
#501 · Failed to read text proto from tensorrtllm_backend/triton_model_repo/tensorrt_llm/config.pbtxt · alokkrsahu · opened 1 week ago · 1 comment
#500 · UNAVAILABLE: Internal: unexpected error when creating modelInstanceState: [json.exception.parse_error.101] parse error at line 1, column 1: syntax error while parsing value - unexpected end of input; expected '[', '{', or a literal · Naphat-Khoprasertthaworn · closed 1 day ago · 1 comment
#499 · Multiple outputs in sampling · tonylek · opened 1 week ago · 1 comment
#498 · speculative decoding performance · biaochen · opened 1 week ago · 1 comment
#497 · Do Triton Containers Support CentOS 7? · XiaoYu2022 · opened 1 week ago · 0 comments
#496 · LTS for this Repo with NIMs on the horizon · IAINATDBI · closed 2 weeks ago · 1 comment
#495 · How to Replicate and Serve Multiple Instances of a Model? · KimMinSang96 · closed 1 week ago · 1 comment
#494 · Update TensorRT-LLM backend · kaiyux · closed 2 weeks ago · 0 comments
#493 · Deepseek model streaming mode with Chinese character �? · activezhao · opened 2 weeks ago · 10 comments
#492 · TensorRT-LLM backend v0.10 update · kaiyux · closed 3 weeks ago · 0 comments
#491 · Update TensorRT-LLM backend · kaiyux · closed 3 weeks ago · 0 comments
#490 · Track input and output token count for every request · brarj413 · closed 2 weeks ago · 1 comment
#489 · 2x docker image size increase for trtllm: from 8.38 GB (24.03) to 18.46 GB (24.04) · lopuhin · opened 3 weeks ago · 3 comments
#488 · Error in streaming mode noting that execute function should return None · kisseternity · closed 3 weeks ago · 2 comments
#487 · Got repeated answer while deploying LLaMA3-Instruct-8B model in triton server · AndyZZt · closed 3 weeks ago · 2 comments
#486 · [Bug] Output generation does not stop at stop token </s> · Hao-YunDeng · closed 3 weeks ago · 2 comments
#485 · Grafana Dashboard (Feature Request) · hestabit-dev · opened 3 weeks ago · 0 comments
#482 · Fixed README.md for broken links · buvnswrn · closed 2 days ago · 1 comment
#480 · Update TensorRT-LLM backend · kaiyux · closed 4 weeks ago · 0 comments
#479 · missing nv_trt_llm_request_metrics from python backend · Hao-YunDeng · opened 4 weeks ago · 0 comments
#478 · [Docs] Fixed inference-request.md dead link · DefTruth · closed 2 days ago · 1 comment
#477 · How to deploy Triton Inference Server Container (tritonserver:24.04-trtllm-python-py3) in K8S without launching Triton Server directly? · Ryan-ZL-Lin · closed 4 weeks ago · 0 comments
#476 · No 24.05-trtllm-python-py3 in NGC Repo · avianion · closed 3 weeks ago · 1 comment
#481 · Triton Launches Model on Incorrect GPU · ethan-digi · closed 3 weeks ago · 6 comments
#475 · [Question] Best practises to track inputs and predictions? · FernandoDorado · opened 1 month ago · 2 comments
#474 · Cannot build Docker container on Grace Hopper · yanncaniouoracle · closed 1 month ago · 2 comments
#473 · Speculative decoding Assertion failed: Number of draft tokens (56) is larger than maximum number of draft tokens (0) · avianion · closed 1 month ago · 0 comments
#472 · tensorrt_llm_bls disregards top_k / temperature setting · janpetrov · opened 1 month ago · 0 comments
#471 · Other backends are missing · Godlovecui · closed 1 month ago · 0 comments
#470 · [Bugfix] Launch Triton server without waiting for a signal · michaelnny · closed 2 weeks ago · 2 comments
#469 · v0.9.0 tensorrt_llm_bls model return error: Model '${tensorrt_llm_model_name}' is not ready. · plt12138 · closed 1 month ago · 3 comments
#468 · `random_seed` seems to be ignored (or at least inconsistent) for inflight_batcher_llm · dyoshida-continua · opened 1 month ago · 4 comments
#467 · unexpected error when creating modelInstanceState: [json.exception.out_of_range.403] key 'name' not found · Godlovecui · opened 1 month ago · 4 comments
#466 · Update TensorRT-LLM backend · kaiyux · closed 1 month ago · 0 comments
#465 · [request] Add example of custom LLM model not based on huggingface · michaelnny · closed 1 month ago · 0 comments
#464 · [Bug] Zero temperature curl request affects non-zero temperature requests · Hao-YunDeng · closed 1 day ago · 5 comments
#463 · Can you provide an example of a visual language model or multimodal model launch by triton server? · lzcchl · opened 1 month ago · 6 comments
#462 · How to deploy one model instance across multiple GPUs to tackle the OOM problem? · shil3754 · opened 1 month ago · 6 comments