-
### Feature details
**Benefits of deploying and serving the hybrid model using a REST API:**
* End users would be able to get model results on any device, even inside Android/iOS applications if required (see the client sketch below). …
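A minimal sketch of what the client side could look like, assuming a JSON-over-HTTP predict endpoint; the URL, model name, and payload schema below are hypothetical placeholders, not the actual deployment's API:

```python
# Sketch: querying a REST-served model from any HTTP-capable client.
# Endpoint URL, model name, and payload shape are hypothetical.
import requests

def predict(features):
    resp = requests.post(
        "http://model-server.example.com/v1/models/hybrid-model:predict",
        json={"instances": [features]},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

print(predict([1.0, 2.5, 3.7]))
```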
-
### MLRun Version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of the MLRun Kit.
### Reproducible …
-
Update:
* Please see #6801 for major items in performance sprint.
* Please see #8779 for major items in a new architecture aimed at simplicity and performance.
* We are in the feedback gathering pha…
-
/kind bug
**What steps did you take and what happened:**
I'm installing Serverless KServe according to the guide at https://kserve.github.io/website/0.11/admin/serverless/serverless/
All services/pods a…
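For debugging, a quick status check from Python might look like the sketch below, assuming a working kubeconfig, the official `kubernetes` client package, and the default namespaces from the guide:

```python
# Sketch: list the phase of every pod in the namespaces the serverless
# install guide creates. Namespace names are the guide's defaults.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

for ns in ("kserve", "knative-serving", "istio-system"):
    for pod in v1.list_namespaced_pod(ns).items:
        print(f"{ns}/{pod.metadata.name}: {pod.status.phase}")
```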
-
I'm trying to convert this model to the TensorFlow.js format so I can perform in-browser, client-side inference. To do this, the model needs to be in the SavedModel format as specified [here…
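For reference, a minimal sketch of the usual conversion route, assuming a Keras checkpoint and the `tensorflowjs` pip package; the paths and model file are placeholders:

```python
# Sketch: convert a Keras model for in-browser use with TensorFlow.js.
# The checkpoint path and output directory are placeholder assumptions.
import tensorflow as tf
import tensorflowjs as tfjs

model = tf.keras.models.load_model("my_model.h5")  # hypothetical checkpoint
tfjs.converters.save_keras_model(model, "web_model/")

# For a model already exported as a SavedModel, the CLI route is:
#   tensorflowjs_converter --input_format=tf_saved_model saved_model/ web_model/
```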
-
# Outline
This post will elaborate on how to use Machine Learning infrastructure effectively. Namely, the best practices for building, maintaining, and scaling production-ready deep learning systems.
…
-
Hi,
I tried to implement SavedModel export in test.py but couldn't fix the issue.
```python
import tensorflow as tf

sess = tf.Session()
# folder to export the SavedModel to
SavedModel_folder = "SavedModel"
# remove all…
```
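In case it helps, here is a self-contained TF1-style export sketch using `tf.saved_model.simple_save`; the toy graph, tensor names, and export path are placeholder assumptions, not your actual model:

```python
# Sketch: minimal TF1-style SavedModel export via simple_save.
# Graph, tensor names, and export directory are placeholders.
import shutil
import tensorflow as tf

export_dir = "SavedModel"
shutil.rmtree(export_dir, ignore_errors=True)  # simple_save needs a fresh dir

x = tf.placeholder(tf.float32, shape=[None, 2], name="x")
y = tf.layers.dense(x, 1, name="y")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.saved_model.simple_save(sess, export_dir,
                               inputs={"x": x}, outputs={"y": y})
```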
-
Is it planned to support MLflow as an LLM provider?
-
**Is your feature request related to a problem? Please describe.**
Without 4-bit quantization, the batch size is limited.
**Describe the solution you'd like**
Add AWQ support, just like TGI.
**Des…
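For context, this sketch shows how another engine, vLLM, already consumes AWQ checkpoints; the model ID is a placeholder, and the point is that 4-bit weights free GPU memory for larger batches:

```python
# Sketch: loading an AWQ-quantized checkpoint with vLLM for comparison.
# The model ID is a hypothetical placeholder.
from vllm import LLM, SamplingParams

llm = LLM(model="some-org/some-model-AWQ", quantization="awq")
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```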
-
Hello Vetal1977,
I'm wondering whether you have attempted batching with TensorFlow Serving. Any guidance on how to do this would be appreciated.
I tried using https://stackoverflow.com/questions…
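Not sure if this is exactly what you need, but a simple client-side form of batching is to pack several inputs into one REST predict call, as in the sketch below; host, port, model name, and input shapes are placeholder assumptions, and server-side batching additionally requires starting `tensorflow_model_server` with `--enable_batching`:

```python
# Sketch: one predict request carrying a batch of three inputs to
# TensorFlow Serving's REST API. Endpoint details are placeholders.
import requests

instances = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
resp = requests.post(
    "http://localhost:8501/v1/models/my_model:predict",
    json={"instances": instances},
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["predictions"])
```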