[FEATURE] Support ray serve engine

stikkireddy / mlflow-extensions

Deploy models quickly to databricks via mlflow based serving infra.

https://stikkireddy.github.io/mlflow-extensions/

Apache License 2.0


[FEATURE] Support ray serve engine #35

Open stikkireddy opened 2 months ago

stikkireddy commented 2 months ago

Ray Serve is a phenomenal serving engine that abstracts away serving and provides throughput optimizations such as batching, async execution, and pipelining. It supports torch and other popular frameworks. It could be used to serve the following kinds of models:

custom embedding models with post processors
standard embedding models
encoder-decoder models like Whisper
diffusion models
multi-model serving
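
To make the batching point concrete, here is a minimal sketch of dynamic request batching written with only the Python standard library. It does not use Ray Serve's API (Ray Serve ships its own batching support), and `MicroBatcher`, `fake_embed`, `max_batch_size`, and `wait_s` are hypothetical names chosen for illustration. The idea: concurrent single requests are queued, then passed to the model together once the batch is full or a short timer expires.

```python
import asyncio


class MicroBatcher:
    """Hypothetical helper: groups concurrent single requests into one batched call."""

    def __init__(self, batch_fn, max_batch_size=4, wait_s=0.01):
        self.batch_fn = batch_fn              # takes a list of inputs, returns a list of outputs
        self.max_batch_size = max_batch_size  # flush as soon as this many requests are queued
        self.wait_s = wait_s                  # otherwise flush after this many seconds
        self.pending = []                     # queued (item, future) pairs
        self.flush_task = None                # timer task for the current partial batch

    async def submit(self, item):
        """Queue one request and wait for its result."""
        fut = asyncio.get_running_loop().create_future()
        self.pending.append((item, fut))
        if len(self.pending) >= self.max_batch_size:
            self._flush()
        elif self.flush_task is None:
            self.flush_task = asyncio.create_task(self._flush_later())
        return await fut

    async def _flush_later(self):
        await asyncio.sleep(self.wait_s)
        self.flush_task = None
        self._flush()

    def _flush(self):
        # Stop the timer if a full batch arrived before it fired.
        if self.flush_task is not None:
            self.flush_task.cancel()
            self.flush_task = None
        batch, self.pending = self.pending, []
        if not batch:
            return
        results = self.batch_fn([item for item, _ in batch])
        for (_, fut), result in zip(batch, results):
            fut.set_result(result)


batch_sizes = []  # records how many requests each model call received


def fake_embed(texts):
    """Stand-in for a real embedding model: one 1-d 'embedding' per text."""
    batch_sizes.append(len(texts))
    return [[float(len(t))] for t in texts]


async def main():
    batcher = MicroBatcher(fake_embed, max_batch_size=4, wait_s=0.01)
    texts = ["a", "bb", "ccc", "dddd", "eeeee"]
    return await asyncio.gather(*(batcher.submit(t) for t in texts))


embeddings = asyncio.run(main())
```

With five concurrent requests and `max_batch_size=4`, the stand-in model is called twice: once with four inputs when the batch fills, and once with the remaining input when the timer fires. A serving engine that handles this generically would spare each model wrapper from reimplementing it.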

stikkireddy commented 2 months ago

Some common embedding models:

Multimodal text-image embedding: clip-ViT-B-32
Hebrew embeddings: dicta-il/dictabert-joint
French embeddings: almanach/camembert-base