-
**Describe the bug**
Resource estimation for the vLLM backend is incorrect and ignores quantization.
**Steps to reproduce**
1. On a GPU server with 4 L20 (48G VRAM) cards without any model deploy…
-
https://www.kaggle.com/rtlmhjbn/ip02-dataset
-
**What would you like to be added**:
`model.Deployment.version` is deprecated and should be removed.
ref: https://github.com/pipe-cd/pipecd/blob/master/pkg/model/deployment.proto#L86-L87
**…
-
### Issues Policy acknowledgement
- [X] I have read and agree to submit bug reports in accordance with the [issues policy](https://www.github.com/mlflow/mlflow/blob/master/ISSUE_POLICY.md)
### Where…
-
I've been working with the emotion2vec model and trying to convert it to ONNX format for deployment purposes. The current implementation is great for PyTorch users, but having ONNX support would enabl…
-
Hi,
I am not reporting an issue here, but I have a question (a somewhat vague one). I am deploying a model/adapter that was trained using unsloth. Is there a way around the GPU requirement? Or in oth…
-
Add additional deployment models to the spec, such as aggregators, to address scalability.
-
**What is Needed**
We’ve identified the best regression model for our project, created by @Soumyadeep-Basak. Now, we need contributors to:
1. Run the workflow in the [notebook](https://github.com/…
-
# TensorRT Model Optimizer - Product Roadmap
[TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer) (ModelOpt)’s north star is to be the best-in-class model optimization toolki…
-
### Requested feature
First of all, congrats on the amazing work!
I have two improvement ideas that might help simplify using this library in a wider range of production workloads:
* Supp…