-
**What would you like to be added**:
Similar to KServe's parallel model inference: https://kserve.github.io/website/latest/modelserving/v1beta1/custom/custom_model/#parallel-model-inference
**Why is this needed**:
*…
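To illustrate the request, here is a minimal sketch of what parallel model inference can look like: a fixed pool of model replicas, with each request dispatched to whichever replica is currently free. This is a hypothetical toy, not KServe's actual implementation, and the `ReplicaPool` name and dummy `lambda` models are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue

class ReplicaPool:
    """Dispatch requests across a fixed set of model replicas so that
    independent requests run in parallel (illustrative sketch only)."""

    def __init__(self, replicas):
        # Free-list of replicas; a request blocks until one is available.
        self.free = Queue()
        for replica in replicas:
            self.free.put(replica)
        self.pool = ThreadPoolExecutor(max_workers=len(replicas))

    def _infer(self, x):
        model = self.free.get()   # wait for a free replica
        try:
            return model(x)
        finally:
            self.free.put(model)  # return the replica to the pool

    def predict(self, x):
        """Submit a request; returns a Future with the replica's output."""
        return self.pool.submit(self._infer, x)

# Two dummy "replicas" standing in for loaded model copies.
pool = ReplicaPool([lambda x: x + 1, lambda x: x + 1])
futures = [pool.predict(i) for i in range(4)]
print([f.result() for f in futures])  # [1, 2, 3, 4]
```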
-
### Configuration
```hcl
resource "databricks_model_serving" "this" {
  name = "e5_${local.STICKY_RANDOM}"
  config {
    served_entities {
      name = "e5_small_v2"
      enti…
-
Title basically says it: I have trained a model using HorovodAllToAllEmbeddings and saved it by doing:
```
de.keras.models.de_save_model(
    model,
    export_dir,
    overwrit…
-
Ray Serve is a phenomenal serving engine that abstracts away serving and provides throughput optimizations like batching, async execution, and pipelining. It supports PyTorch and other popular frameworks. …
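The dynamic batching mentioned above (which Ray Serve exposes via its `@serve.batch` decorator) can be sketched in plain `asyncio`: concurrent requests are queued, then flushed to the model as one batch once the batch fills up or a short timeout expires. This is an illustrative toy under those assumptions, not Ray Serve's implementation.

```python
import asyncio

class Batcher:
    """Collect concurrent requests and run the model on them as one batch."""

    def __init__(self, model_fn, max_batch=8, timeout_s=0.01):
        self.model_fn = model_fn    # batched model: list of inputs -> list of outputs
        self.max_batch = max_batch
        self.timeout_s = timeout_s
        self.queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, item):
        """Enqueue one request and await its individual result."""
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((item, fut))
        return await fut

    async def run(self):
        """Worker loop: gather up to max_batch items, then run the model once."""
        while True:
            item, fut = await self.queue.get()
            batch, futs = [item], [fut]
            deadline = asyncio.get_running_loop().time() + self.timeout_s
            while len(batch) < self.max_batch:
                remaining = deadline - asyncio.get_running_loop().time()
                if remaining <= 0:
                    break
                try:
                    item, fut = await asyncio.wait_for(self.queue.get(), remaining)
                except asyncio.TimeoutError:
                    break
                batch.append(item)
                futs.append(fut)
            # One batched model call resolves every waiting request.
            for f, out in zip(futs, self.model_fn(batch)):
                f.set_result(out)

async def main():
    batcher = Batcher(lambda xs: [x * 2 for x in xs])  # dummy batched model
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(*(batcher.submit(i) for i in range(5)))
    worker.cancel()
    return results

print(asyncio.run(main()))  # [0, 2, 4, 6, 8]
```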
-
We support lws as the default workload; however, in most cases multi-host is not needed, even with Llama 3.1 405B. So maybe this is a better choice.
-
Opening this issue to track the progress of models supported in candle-vllm.
-
As part of the ongoing development of the meal planner feature, we need to add the ability to display and edit servings and serving units for each food item. This will provide users with more detailed…
-
I'm trying to deploy my SetFit model in TorchServe using a custom handler for this task. The problem is that I'm not able to do this, since I'm getting multiple errors while registering the model on …
-
I realize this is an orthogonal question, but what's a simple way to stand up llama.c model serving so I can access it from LangChain?
-
### Feature request
[Kserve](https://github.com/kserve/kserve) is a Kubernetes-based engine for predictive and generative AI models and provides an abstraction over popular model servers like Huggingface…