patterns-ai-core / langchainrb

Build LLM-powered applications in Ruby
https://rubydoc.info/gems/langchainrb
MIT License

OpenAI can't get default_dimensions when I specify a custom embedding_model #598

Closed lukefan closed 6 months ago

lukefan commented 6 months ago

Many services are now compatible with the OpenAI API, including Ollama. So when I use the OpenAI class as my LLM, I'd like to be able to specify a custom embedding_model. You could adopt the approach used for Ollama and simply run one embedding call to determine the length of the vector the current embedding model outputs, just like in lib/langchain/llm/ollama.rb.
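
Something along these lines is what I have in mind. This is just a sketch, not the library's actual code; it reuses the EMBEDDING_SIZES table and defaults reader the LLM classes already have:

# Sketch only: fall back to probing the endpoint when the model name is
# missing from the hard-coded size table, the way ollama.rb handles it.
def default_dimensions
  @default_dimensions ||=
    EMBEDDING_SIZES.fetch(defaults[:embeddings_model_name]) do
      # Embed a short probe string once and measure the vector length.
      embed(text: "test").embedding.size
    end
end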

andreibondarev commented 6 months ago

@lukefan Could you please show me the code and the error you're seeing?

lukefan commented 6 months ago

config/initializers/langchainrb_rails.rb:

LangchainrbRails.configure do |config|
  config.vectorsearch = Langchain::Vectorsearch::Pgvector.new(
    llm: Langchain::LLM::OpenAI.new(
      api_key: 'ollama',
      llm_options: { uri_base: 'http://localhost:11434' },
      default_options: {
        embeddings_model_name: 'chevalblanc/dmeta-embedding-zh:latest',
        n: 1,
        temperature: 0.8,
        chat_completion_model_name: 'Llama3-8B-Chinese-Chat:latest'
      }
    )
  )
end

So I got this error: key not found: "chevalblanc/dmeta-embedding-zh:latest"
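
That message is a KeyError from Hash#fetch. As far as I can tell, the lookup inside the OpenAI class is roughly this (paraphrased, not the exact source; the table only knows OpenAI's own models):

# Paraphrased: default_dimensions resolves the model name against a
# hard-coded table, and Hash#fetch raises for anything not in it.
EMBEDDING_SIZES = {
  "text-embedding-ada-002" => 1536,
  "text-embedding-3-small" => 1536,
  "text-embedding-3-large" => 3072
}.freeze

EMBEDDING_SIZES.fetch("chevalblanc/dmeta-embedding-zh:latest")
# => KeyError: key not found: "chevalblanc/dmeta-embedding-zh:latest"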

andreibondarev commented 6 months ago

@lukefan I don't see this model on Ollama: https://ollama.com/search?q=chevalblanc&p=1

andreibondarev commented 6 months ago

But if you wanted to use the chevalblanc/embedding model, for example, make sure you've pulled the model down first:

ollama pull chevalblanc/embedding
pulling manifest
pulling 7c43e3a2e21a... 100% ▕████████████████▏ 651 MB
pulling 4964a5df96b1... 100% ▕████████████████▏  260 B
verifying sha256 digest
writing manifest
removing any unused layers
success

and then use the Ollama LLM:

llm = Langchain::LLM::Ollama.new(
  url: ENV["OLLAMA_URL"],
  default_options: { embeddings_model_name: "chevalblanc/embedding" }
)

llm.embed text: "..."
=>
#<Langchain::LLM::OllamaResponse:0x00000001253740d0
 @model="chevalblanc/embedding",
 @prompt_tokens=nil,
 @raw_response=
  {"embedding"=>
    [-0.40095722675323486,
     -0.019067473709583282,
     -0.2779462933540344,
     ...

lukefan commented 6 months ago

They changed the URL: https://ollama.com/milkey/dmeta-embedding-zh. There are many services compatible with the OpenAI API, such as together.ai or groq.com; I just used Ollama as an example.

In lib/langchain/llm/ollama.rb, the case where the embedding model is not found in EMBEDDING_SIZES is handled; in the OpenAI class, the sizes are hard-coded.

andreibondarev commented 6 months ago

@lukefan Then...

ollama pull milkey/dmeta-embedding-zh:f16 # or :f32

llm = Langchain::LLM::Ollama.new(
  url: ENV["OLLAMA_URL"],
  default_options: { embeddings_model_name: "milkey/dmeta-embedding-zh:f16" }
)

lukefan commented 6 months ago

I want to be able to use the OpenAI class to call various compatible services, not just Ollama.

andreibondarev commented 6 months ago

@lukefan This library is built a bit differently, though. While I understand that some LLM providers expose OpenAI-like interfaces, not all of them do: Google Gemini, Anthropic, and Cohere, for example.

lukefan commented 6 months ago

I don't expect to use OpenAI's API to call every service, but I do try to choose services that are compatible with OpenAI, so I hope the OpenAI class can take compatibility into account more. Doing so would also make your project work with more service platforms.

lukefan commented 6 months ago

@andreibondarev Sorry, there was a problem with my test. Ollama does not implement OpenAI-compatible embeddings, but Together does implement that part of the functionality. I should have used together.xyz's embeddings for testing.

andreibondarev commented 6 months ago

@lukefan Right, the embeddings are different. Should we close this issue now?

lukefan commented 6 months ago

@andreibondarev

llm = Langchain::LLM::OpenAI.new(
  api_key: my_together_key,
  llm_options: { uri_base: 'https://api.together.xyz' },
  default_options: {
    n: 1,
    temperature: 1,
    chat_completion_model_name: "Qwen/Qwen1.5-72B",
    embeddings_model_name: "WhereIsAI/UAE-Large-V1"
  }
)

This code runs fine:

llm.chat(messages: [{role: "user", content: "What is the meaning of life?"}]).completion

but this code raises an error:

llm.embed(text: "foo bar").embedding

error message:

~/.rvm/gems/ruby-3.1.4/gems/langchainrb-0.11.4/lib/langchain/utils/token_length/openai_validator.rb:73:in `token_length': undefined method `encode' for nil:NilClass (NoMethodError)

      encoder.encode(text).length
             ^^^^^^^

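If I read the trace right, the validator can't resolve a tokenizer for a non-OpenAI model name. A minimal reproduction of where the nil comes from, assuming the tiktoken_ruby gem the validator relies on:

require "tiktoken_ruby"

# OpenAI model names resolve to an encoder object...
Tiktoken.encoding_for_model("text-embedding-ada-002") # => an encoder

# ...but third-party model names do not, so the validator ends up
# calling #encode on nil, which raises the NoMethodError above.
Tiktoken.encoding_for_model("WhereIsAI/UAE-Large-V1") # => nil
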
andreibondarev commented 6 months ago

@lukefan I think we can probably remove this max token validation: https://github.com/patterns-ai-core/langchainrb/blob/main/lib/langchain/llm/openai.rb#L76
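
In the meantime, a hypothetical stopgap (not the actual fix; it assumes the class-method signature the stack trace suggests) would be to monkey-patch the validator to skip the check for model names tiktoken doesn't know:

# Hypothetical workaround only: report a token length of 0 for models
# tiktoken has no encoder for, so the max-token check always passes.
module Langchain
  module Utils
    module TokenLength
      class OpenAIValidator
        class << self
          alias_method :original_token_length, :token_length

          def token_length(text, model_name, options = {})
            return 0 if Tiktoken.encoding_for_model(model_name).nil?

            original_token_length(text, model_name, options)
          end
        end
      end
    end
  end
end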

lukefan commented 6 months ago

Yes, removing it should solve the problem nicely.