[Open] sungkim11 opened 4 months ago
I think it should be possible to serve GritLM using vLLM or a similar engine, exposing its embedding capability, its language-modeling capability, or both from a single model/endpoint, but I'm not sure about the details of vLLM itself.
Specifically, I would like to run embeddings as a service using something like vLLM in a Docker container on a different host. How would one go about doing this?
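For what it's worth, a minimal sketch of what I had in mind: vLLM ships an OpenAI-compatible server and an official Docker image (`vllm/vllm-openai`), so on the serving host something like the following might work. This assumes the image name, the port, and the `GritLM/GritLM-7B` model id; the embedding-task flag has changed across vLLM versions (`--task embedding` in older releases, `--task embed` in newer ones), so check the docs for the version you pull.

```shell
# Deployment sketch (assumptions: vllm/vllm-openai image, GPU host,
# GritLM/GritLM-7B model id; task-flag spelling depends on vLLM version)
docker run --gpus all \
  -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:latest \
  --model GritLM/GritLM-7B \
  --task embed
```

A client on another host could then call the standard OpenAI embeddings endpoint at `http://SERVING_HOST:8000/v1/embeddings` with the same model name. Whether one container can expose both the embedding task and the generative task at once is exactly the part I'm unsure about.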