-
**Is your feature request related to a problem? Please describe.**
I would like to deploy the same model with different configurations. In my production environment I have different kinds of machin…
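As background for this request: Triton's existing `instance_group` setting (sketched below with purely illustrative values) controls where and how many copies of a model run, but a single `config.pbtxt` applies to every machine serving the model — which is presumably why per-machine configuration is being asked for here.

```
# Sketch only — counts and GPU ids are illustrative, not from the report.
instance_group [
  { count: 2, kind: KIND_GPU, gpus: [ 0, 1 ] },
  { count: 4, kind: KIND_CPU }
]
```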
-
**Is your feature request related to a problem? Please describe.**
This feature request concerns deploying a model with LoRA support.
**Describe the solution you'd like**
I have a UNet model…
-
**Description**
Using `-Werror` causes unnecessary failures when trying to build with a more recent compiler.
**Triton Information**
6a1b5b58910cb64f9f8537f14d0691325da0530c
Container built myse…
-
**Description**
I'm trying multiple `config.pbtxt` files for serving models on local CPUs. All the other options work well, but I've found that `max_batch_size` does not work as expected.
What I expect …
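For reference, a minimal sketch of how `max_batch_size` is usually set in a `config.pbtxt` (the model name, backend, and shapes below are placeholders, not taken from the report):

```
name: "example_model"          # hypothetical model name
backend: "onnxruntime"         # assumption: any backend behaves the same here
max_batch_size: 8              # requests batched up to 8 items
input [
  { name: "INPUT0", data_type: TYPE_FP32, dims: [ 16 ] }
]
output [
  { name: "OUTPUT0", data_type: TYPE_FP32, dims: [ 16 ] }
]
```

Note that with `max_batch_size > 0` Triton adds an implicit leading batch dimension, so `dims` lists only the per-item shape — a frequent source of "does not work as expected" confusion.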
-
cf. the internal [Acid800](https://www.virtualdub.org/downloads/Acid800-1.1.7z) README.html for hints from Avery Lee (the Altirra emulator author) about what each test expects
## CPU
- [x] Basic instruction…
-
Hello! I apologize if this is a topic that has already been covered here (I looked in the closed issues and didn't find exactly what I want), but I would like to know what is the fastest and most effi…
-
Hi Maintainers,
Thanks for the great work. This is probably not a bug report, but more of a question.
I am following https://github.com/triton-inference-server/backend/tree/main/examples#custom…
-
**Is your feature request related to a problem? Please describe.**
We are trying to support larger batches for Triton server (larger than max_batch_size), leveraging instance groups and splitting the…
-
**Description**
Could not load a model using MLflow with MinIO as the model repository. I tried the same with an AWS S3 bucket and it worked as expected. I have followed this article [MLflow Triton Plugin](https://…
-
As it is described here, I do not think this is the right way to approach this, and I am very concerned that not enough data is being captured. Why are we at a stage where this is being defined in…