Hegghammer closed 10 months ago
Great idea! Thank you, this is an excellent addition.
While testing this, I noticed that it can be achieved with Xinference alone, without the need for LiteLLM, which makes the process more streamlined.
I'll make this minor adjustment and then merge.
Some users might want to use a custom offline embedding model, for example to get multilingual support. These instructions show one way of doing it.
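As a rough illustration of querying a locally served embedding model through Xinference alone, here is a minimal sketch. It assumes a Xinference server is already running on its default local port with an embedding model launched, and that it is reached through the OpenAI-style `/v1/embeddings` route. The model UID `my-multilingual-embedder` and the helper names are hypothetical and not taken from this PR.

```python
import json
import urllib.request

# Assumptions: a local Xinference server on its default port, with an
# embedding model already launched under the hypothetical UID below.
XINFERENCE_URL = "http://localhost:9997"
MODEL_UID = "my-multilingual-embedder"  # hypothetical, replace with your own


def build_embedding_request(texts, base_url=XINFERENCE_URL, model=MODEL_UID):
    """Build (but do not send) a POST request to the OpenAI-style embeddings route."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, **kwargs):
    """Send the request and return one embedding vector per input text."""
    request = build_embedding_request(texts, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style responses keep the vectors under data[i]["embedding"].
    return [item["embedding"] for item in body["data"]]
```

With the server running, `embed(["Bonjour le monde", "Hello world"])` should return two vectors. Because everything stays on localhost, no text leaves the machine.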