getzep / zep

Zep | The Memory Foundation For Your AI Stack

[FEAT] Define model for Embedding #276

Open netandreus opened 11 months ago

netandreus commented 11 months ago

Problem

I am using LocalAI with Zep.

llm:
  service: "openai"
  model: "gpt-3.5-turbo-1106"
  openai_endpoint: "http://host.docker.internal:8080/v1"

I can define the model for the LLM itself, but I also need to be able to define the model used for embeddings, because it seems the embedding model is currently hardcoded to text-embedding-ada-002.

Possible solution

Add and use a model key in the embeddings options, like this:

    embeddings:
      enabled: true
      chunk_size: 200
      dimensions: 384
      service: "openai"
      model: "some-custom-model"
danielchalef commented 9 months ago

We're refactoring how LLMs work and separating generation/completion from embeddings, which will address the above. We'll be releasing this in the new year.

danielchalef commented 2 weeks ago

Other inference providers are now supported via a proxy such as LiteLLM.
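
For example, a LiteLLM proxy config along these lines remaps the OpenAI model names Zep requests onto a local backend (a sketch only; the Ollama models and ports below are placeholders, not a supported configuration):

    model_list:
      # Zep asks for OpenAI model names; LiteLLM remaps them to local models.
      - model_name: gpt-3.5-turbo-1106
        litellm_params:
          model: ollama/llama3
          api_base: http://host.docker.internal:11434
      - model_name: text-embedding-ada-002
        litellm_params:
          model: ollama/nomic-embed-text
          api_base: http://host.docker.internal:11434

Zep's llm.openai_endpoint is then pointed at the LiteLLM proxy (e.g. http://host.docker.internal:4000/v1, LiteLLM's default port) rather than at the backend directly.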

t41372 commented 18 hours ago

Would it be possible to allow users to define the model name they'd like to use, rather than being limited to "gpt-4o-mini" and OpenAI's embeddings, without the need for a proxy server like LiteLLM? Many inference backends, such as Ollama, are compatible with OpenAI's chat and embeddings endpoints. However, the inability to modify the model name prevents us from utilizing this compatibility.
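
For instance, something along these lines is what I have in mind (purely illustrative; the embeddings model key is exactly the feature being requested and does not exist today, and the Ollama endpoint and model names are just examples):

    llm:
      service: "openai"
      model: "llama3"                                          # any chat model served by Ollama
      openai_endpoint: "http://host.docker.internal:11434/v1"  # Ollama's OpenAI-compatible API
    embeddings:
      enabled: true
      service: "openai"
      model: "nomic-embed-text"   # user-defined embedding model (the requested feature)
      dimensions: 768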

While, as you mentioned, we could use LiteLLM as a proxy server to reroute requests and override the model name, this approach adds significant complexity. I'm currently exploring long-term memory integration for my project, Open-LLM-VTuber, and setting up LiteLLM with rerouting for the LLM and embeddings can present a serious challenge for many of my users.

Given how beneficial the ability to change the model name would be, I strongly recommend considering allowing users to set custom model names for both the LLM and embeddings.