ray-project / ray-llm

RayLLM - LLMs on Ray
https://aviary.anyscale.com
Apache License 2.0

Embedding model support in ray-llm #57

Closed YQ-Wang closed 11 months ago

YQ-Wang commented 11 months ago

At Ray Summit, @pcmoritz talked about embedding models (GTE-base in particular) in the session "Developing and Serving RAG-Based LLM Applications in Production". It would be great to also have a model config in models/continuous_batching for this CPU model, so that developers can host all the models a RAG application needs in ray-llm.

YQ-Wang commented 11 months ago

I hosted the embedding model directly with Ray Serve instead.
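
For anyone landing here with the same need, below is a minimal sketch of what hosting an embedding model on Ray Serve might look like. It is not the code used above. The class name `EmbeddingDeployment`, the `encode_fn` hook, the request shape `{"texts": [...]}`, and the Hugging Face model id `thenlper/gte-base` are all assumptions made for illustration, and it presumes `ray[serve]` and `sentence-transformers` are installed.

```python
import time
from typing import Callable, List, Optional, Sequence

# Hypothetical type alias: any callable that maps texts to embedding vectors.
EncodeFn = Callable[[Sequence[str]], List[List[float]]]


class EmbeddingDeployment:
    """Illustrative wrapper around an embedding model (name is an assumption)."""

    def __init__(self, model_id: str = "thenlper/gte-base",
                 encode_fn: Optional[EncodeFn] = None):
        if encode_fn is None:
            # Imported lazily so the class can be exercised without the model.
            from sentence_transformers import SentenceTransformer

            model = SentenceTransformer(model_id, device="cpu")

            def encode_fn(texts: Sequence[str]) -> List[List[float]]:
                # encode() returns a NumPy array; convert it for JSON output.
                return model.encode(list(texts)).tolist()

        self._encode = encode_fn

    def embed(self, texts: Sequence[str]) -> List[List[float]]:
        # Normalise every value to a plain Python float.
        return [[float(x) for x in vec] for vec in self._encode(texts)]

    async def __call__(self, request):
        # For HTTP traffic, Ray Serve passes in a Starlette Request.
        body = await request.json()
        return {"embeddings": self.embed(body["texts"])}


if __name__ == "__main__":
    from ray import serve

    # One CPU per replica, since the model is assumed to run on CPU.
    app = (
        serve.deployment(EmbeddingDeployment)
        .options(ray_actor_options={"num_cpus": 1})
        .bind()
    )
    serve.run(app)
    try:
        # Keep the driver alive so the deployment keeps serving.
        while True:
            time.sleep(60)
    except KeyboardInterrupt:
        serve.shutdown()
```

With this running, a POST to the Serve HTTP endpoint (by default port 8000) with a JSON body such as `{"texts": ["hello world"]}` should return the embeddings under the `embeddings` key. Taking the encoder as a constructor argument keeps the HTTP handling separate from the model, which makes it easier to swap in a different embedding model later.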