triton-inference-server / pytriton
PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
https://triton-inference-server.github.io/pytriton/
Apache License 2.0 · 684 stars · 45 forks
Issues
#79 · [Question] Tensor parallelism for tensorrt_llm · opened by JoeLiu996 1 day ago · 0 comments
#78 · disallow use of numpy 2 for now · closed by catwell 1 week ago · 5 comments
#77 · Model is not initialized to GPU. · opened by jaehyeong-bespin 1 week ago · 0 comments
#76 · [Error following Quick Start] - tritonclient.utils.InferenceServerException: [400] client received an empty response from the server. · closed by sophot 2 weeks ago · 1 comment
#75 · multi-gpu inference with pytriton got worse TPS · opened by lionsheep24 2 weeks ago · 2 comments
#74 · [Question] About the subprocess for multi-instance · closed by leafjungle 3 weeks ago · 4 comments
#73 · [Question] will the server fork several subprocess when infer_func is a list? · closed by leafjungle 1 month ago · 1 comment
#72 · [HELP] what is the problem for this demo code? · closed by leafjungle 1 month ago · 0 comments
#71 · PyTriton produces DEBUG output by default? · closed by JanFSchulte 1 month ago · 1 comment
#70 · [Question] What is the relationship between "model_repository" and "infer_func"? · closed by leafjungle 1 month ago · 0 comments
#69 · Model instances question · closed by tinsss 1 month ago · 3 comments
#68 · Is there a way to run pytriton on glibc2.32? · closed by DZ9 3 weeks ago · 9 comments
#67 · double container in kubernetes · closed by leafjungle 1 month ago · 1 comment
#66 · [Bug] Fail to deploy serving model on the Azure Machine Learning Platform. Exited with failure (confusing error information and exit code) · closed by keli-wen 1 month ago · 3 comments
#65 · Example of TensorRT-LLM Whisper backend for PyTriton · opened by aleksandr-smechov 3 months ago · 5 comments
#64 · fix: Remove duplicated paragraph · closed by getty708 3 months ago · 3 comments
#63 · Python InferenceServerClient issue when call close() from __del__ · closed by lionsheep0724 4 months ago · 7 comments
#62 · Put `pytriton.client` in the separate package/wheel. · opened by flyingleafe 5 months ago · 3 comments
#61 · pytriton use onnx is slower than onnx runtime for tiny bert model · opened by yan123456jie 5 months ago · 1 comment
#60 · how to define a new api andinput like flask · closed by Pobby321 4 months ago · 3 comments
#59 · nav.optimize() bug · closed by Pobby321 4 months ago · 7 comments
#58 · Questions about new feature at 0.5.0 : decoupled model · closed by lionsheep0724 4 months ago · 4 comments
#57 · onnx and tensorrt model supported? · closed by oreo-lp 4 months ago · 3 comments
#56 · ModuleNotFoundError: No module named '_ctypes' error when run pytriton server with 0.5.0 · closed by lionsheep0724 4 months ago · 5 comments
#55 · The content of this document is wrong · closed by HJH0924 5 months ago · 2 comments
#54 · The content of this document is incorrect · closed by HJH0924 5 months ago · 2 comments
#53 · What is the proxy backend in pytriton? · closed by HJH0924 4 months ago · 4 comments
#52 · pytriton is slower than triton · closed by yan123456jie 5 months ago · 6 comments
#51 · AttributeError: '_thread.RLock' object has no attribute '_recursion_count' · closed by dogky123 4 months ago · 4 comments
#50 · How to infer with sequence ? · opened by monsterlyg 6 months ago · 3 comments
#49 · fix boot when allow_http=False · closed by catwell 6 months ago · 3 comments
#48 · [problem]How to allowed multiple models running on same GPU at same time? · closed by Firefly-Dance 5 months ago · 5 comments
#47 · while inference by running server.py and client.py why client is taking gpu memory. · closed by Justsubh01 7 months ago · 0 comments
#46 · Enabling Redis cache throws: Unable to find shared library libtritonserver.so · closed by zbloss 6 months ago · 5 comments
#45 · Error deploying model on Vertex AI · closed by sricke 3 months ago · 16 comments
#44 · Support Mac installation · opened by zbloss 7 months ago · 16 comments
#43 · Streaming and batching · closed by giuseppe915 5 months ago · 6 comments
#42 · How to pass priority level during inference? · opened by jackielam918 7 months ago · 3 comments
#41 · TensorRT-LLM suport? · closed by LouisCastricato 3 months ago · 4 comments
#40 · Pytriton don't nativly support pytorch or tensorflow dtype · closed by dahai331 7 months ago · 3 comments
#39 · tritonclient.grpc doesn't support timeout for other commands than infer. · closed by dogky123 5 months ago · 4 comments
#38 · Client network and/or connection timeout is smaller than requested timeout_s. This may cause unexpected behavior. · closed by lfxx 8 months ago · 2 comments
#37 · Stub process 'REGIS_0' is not healthy · closed by lfxx 8 months ago · 2 comments
#36 · Support for ubuntu20.04 · closed by lfxx 8 months ago · 3 comments
#35 · OUTPUT triton: list or tuple or any kind of Iterables · closed by dogky123 8 months ago · 4 comments
#34 · Binary output truncated... · closed by rilango 8 months ago · 4 comments
#33 · Update megatron example so that it would support latest changes in NeMo · closed by PeganovAnton 9 months ago · 2 comments
#32 · Best practices with ModelClient · closed by markbarna 10 months ago · 4 comments
#31 · Example of (or support for) Inference Callable of Triton ensemble definition · closed by michaelhagel 9 months ago · 11 comments
#30 · How to check if the server-side service is still online through the client-side? · closed by lfxx 11 months ago · 2 comments