brainlid / langchain

Elixir implementation of a LangChain style framework.
https://hexdocs.pm/langchain/

Verify new tools api code with GoogleAI #107

Open brainlid opened 2 months ago

brainlid commented 2 months ago

The PR #105 was merged to main, and the tests pass, but I didn't run it against a GoogleAI LLM. I would appreciate help validating that.

@jadengis, this impacts your code the most. Just a heads-up.

jadengis commented 2 months ago

@brainlid Sure, I should be able to run main against my setup and see if it breaks anything. Are there any breaking changes to the API I'd need to integrate?

brainlid commented 2 months ago

The big change is around function calls and function results.

An assistant message can contain zero or more ToolCalls. A tool message contains one or more ToolResults.
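As a rough sketch of those two shapes (field and constructor names taken from the structs and stacktraces later in this thread; treat the exact signatures as assumptions, not confirmed API):

```elixir
# Sketch of the new tool-call message shapes. Assumes the langchain dep is
# installed; constructor names are inferred from this thread and may differ.
alias LangChain.Message
alias LangChain.Message.ToolCall
alias LangChain.Message.ToolResult

# An assistant message may carry zero or more ToolCalls:
call =
  ToolCall.new!(%{
    call_id: "call_123",
    name: "calculator",
    arguments: %{"expression" => "100 + 300 - 200"}
  })

assistant_msg = Message.new_assistant!(%{tool_calls: [call]})

# A tool message carries one or more ToolResults:
result = ToolResult.new!(%{tool_call_id: "call_123", content: "200"})
tool_msg = Message.new_tool_result!(%{tool_results: [result]})
```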

nileshtrivedi commented 3 weeks ago

Looking forward to this as the Gemini Pro APIs have become publicly available. This will also allow me to use this project as an LLM abstraction in my Elixir port of Autogen, a multi-agent framework from Microsoft.

Here are the Function Calling docs for Gemini models: https://ai.google.dev/gemini-api/docs/function-calling

jadengis commented 3 weeks ago

@nileshtrivedi if you're interested, I would appreciate some help testing the above change. I tried it in my project but ran into some errors with the new code and couldn't get it working. I haven't had time to go back and fix the issues, however.

nileshtrivedi commented 3 weeks ago

@jadengis I thought of modifying the existing tools/calculator_test for Google Gemini models, but even the existing test for OpenAI seems to fail for me at this line, because message.content is nil. This is the message object when the assertion fails:

%LangChain.Message{
  content: nil,
  processed_content: nil,
  index: 0,
  status: :complete,
  role: :assistant,
  name: nil,
  tool_calls: [
    %LangChain.Message.ToolCall{
      status: :complete,
      type: :function,
      call_id: "call_9Fq4ZN3U4Ln4D52Kwg1DXj0G",
      name: "calculator",
      arguments: %{"expression" => "100 + 300 - 200"},
      index: nil
    }
  ],
  tool_results: nil
}

I don't know whether it's a bug in the test code itself or something else. Unable to test further. 🫤
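A guess (not confirmed): that nil content may be expected rather than a bug. When the model decides to call a tool, the reply has role: :assistant with content: nil and the payload in tool_calls, so an assertion at that point would need to target the tool call instead, roughly like this (ExUnit sketch, using the fields from the struct dump above):

```elixir
# Sketch: when the LLM responds with a tool call, content is nil, so assert
# on tool_calls rather than content. Field names come from the struct dump
# in this thread.
assert message.role == :assistant
assert message.content == nil

assert [%LangChain.Message.ToolCall{name: "calculator", arguments: args}] =
         message.tool_calls

assert args["expression"] == "100 + 300 - 200"
```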

For my own work, even Autogen itself currently seems broken for Gemini models. I ended up using Python example code using Google's own SDK.

brainlid commented 3 weeks ago

@nileshtrivedi I updated and fixed the Calculator tool and tests. Thanks for pointing that out!

https://github.com/brainlid/langchain/pull/132

brainlid commented 3 weeks ago

@jadengis please let me know what errors you're getting! We can hopefully get it ironed out quickly.

brainlid commented 3 weeks ago

For migrating, I tried to document what would be needed in the CHANGELOG. Let me know if you're finding gaps!

nileshtrivedi commented 3 weeks ago

I submitted #135 as a failing test. While mix test test/chat_models/chat_google_ai_test.exs --include live_call passes, mix test test/tools/calculator_gemini_test.exs --include live_call fails with "Unexpected response" from the LLM.

nileshtrivedi commented 2 weeks ago

I think there are multiple errors in how the Gemini model APIs are being called, and there may be more issues.

EDITED: I noticed this open PR for fixing the endpoint: https://github.com/brainlid/langchain/pull/118. It seems there are subtle differences between the Gemini API and the VertexAI API that are causing these.

brainlid commented 1 week ago

@nileshtrivedi We split out ChatVertexAI from ChatGoogleAI because the differences were subtle but throughout. In the published RC, the callbacks have been updated as well. Thank you for looking at it before. How does it look now?

nileshtrivedi commented 1 week ago

The Gemini API still seems to fail in my testing.

I tested after making this change in test/tools/calculator_test.exs:

--- a/test/tools/calculator_test.exs
+++ b/test/tools/calculator_test.exs
@@ -5,7 +5,7 @@ defmodule LangChain.Tools.CalculatorTest do
   doctest LangChain.Tools.Calculator
   alias LangChain.Tools.Calculator
   alias LangChain.Function
-  alias LangChain.ChatModels.ChatOpenAI
+  alias LangChain.ChatModels.ChatGoogleAI
   alias LangChain.Message.ToolCall
   alias LangChain.Message.ToolResult

@@ -80,7 +80,7 @@ defmodule LangChain.Tools.CalculatorTest do
         end
       }

-      model = ChatOpenAI.new!(%{seed: 0, temperature: 0, stream: false, callbacks: [llm_handler]})
+      model = ChatGoogleAI.new!(%{model: "gemini-1.5-pro", api_key: System.fetch_env!("GEMINI_API_KEY"), seed: 0, temperature: 0, stream: false, callbacks: [llm_handler]})

This is the test failure I get:

% mix test test/tools/calculator_test.exs --include live_call
Compiling 1 file (.ex)
Including tags: [:live_call]

.

  1) test live test performs repeated calls until complete with a live LLM (LangChain.Tools.CalculatorTest)
     test/tools/calculator_test.exs:68
     ** (MatchError) no match of right hand side value: {:error, %LangChain.Chains.LLMChain{llm: %LangChain.ChatModels.ChatGoogleAI{endpoint: "https://generativelanguage.googleapis.com/v1beta", api_version: "v1beta", model: "gemini-1.5-pro", api_key: "REMOVED_FOR_SAFETY", temperature: 0.0, top_p: 1.0, top_k: 1.0, receive_timeout: 60000, stream: false, callbacks: [%{on_llm_new_message: #Function<1.19352243/2 in LangChain.Tools.CalculatorTest."test live test performs repeated calls until complete with a live LLM"/1>}]}, verbose: false, verbose_deltas: false, tools: [%LangChain.Function{name: "calculator", description: "Perform basic math calculations or expressions", display_text: nil, function: #Function<0.76765322/2 in LangChain.Tools.Calculator.execute>, async: true, parameters_schema: %{type: "object", required: ["expression"], properties: %{expression: %{type: "string", description: "A simple mathematical expression"}}}, parameters: []}], _tool_map: %{"calculator" => %LangChain.Function{name: "calculator", description: "Perform basic math calculations or expressions", display_text: nil, function: #Function<0.76765322/2 in LangChain.Tools.Calculator.execute>, async: true, parameters_schema: %{type: "object", required: ["expression"], properties: %{expression: %{type: "string", description: "A simple mathematical expression"}}}, parameters: []}}, messages: [%LangChain.Message{content: "Answer the following math question: What is 100 + 300 - 200?", processed_content: nil, index: nil, status: :complete, role: :user, name: nil, tool_calls: [], tool_results: nil}], custom_context: nil, message_processors: [], max_retry_count: 3, current_failure_count: 0, delta: nil, last_message: %LangChain.Message{content: "Answer the following math question: What is 100 + 300 - 200?", processed_content: nil, index: nil, status: :complete, role: :user, name: nil, tool_calls: [], tool_results: nil}, needs_response: true, callbacks: [%{on_tool_response_created: #Function<2.19352243/2 
in LangChain.Tools.CalculatorTest."test live test performs repeated calls until complete with a live LLM"/1>}]}, "Unexpected response"}
     code: {:ok, updated_chain, %Message{} = message} =
     stacktrace:
       test/tools/calculator_test.exs:85: (test)

     The following output was logged:

     14:01:18.910 [error] Trying to process an unexpected response. ""

     14:01:18.910 [error] Error during chat call. Reason: "Unexpected response"

......
Finished in 1.8 seconds (0.00s async, 1.8s sync)
8 tests, 1 failure

Randomized with seed 98474

To be clear, my actual api_key was printed where it now says REMOVED_FOR_SAFETY; I redacted it before posting. I also tried model: "gemini-1.5-flash" but got the same error.

I think it might be easier if you sign up on https://ai.google.dev/ to get an API key and help with testing.

ljgago commented 5 days ago

Hello @nileshtrivedi,

If you change these lines: https://github.com/brainlid/langchain/blob/d63e11a3e926450bcb2e983ae64c135ccb52c822/lib/chat_models/chat_google_ai.ex#L27-L29

to

  @default_endpoint "https://generativelanguage.googleapis.com"
  @default_api_version "v1beta"

Do the tests pass?

nileshtrivedi commented 5 days ago

@ljgago No, it fails but with a different error:

% mix test test/tools/calculator_test.exs --include live_call              
Compiling 39 files (.ex)
Generated langchain app
Including tags: [:live_call]

....

  1) test live test performs repeated calls until complete with a live LLM (LangChain.Tools.CalculatorTest)
     test/tools/calculator_test.exs:68
     ** (LangChain.LangChainError) content: is invalid for role tool
     code: |> LLMChain.run(mode: :while_needs_response)
     stacktrace:
       (langchain 0.3.0-rc.0) lib/message.ex:408: LangChain.Message.new_tool_result!/1
       (langchain 0.3.0-rc.0) lib/chains/llm_chain.ex:692: LangChain.Chains.LLMChain.execute_tool_calls/2
       (langchain 0.3.0-rc.0) lib/chains/llm_chain.ex:321: LangChain.Chains.LLMChain.run_while_needs_response/1
       test/tools/calculator_test.exs:95: (test)

     The following output was logged:

     06:58:59.287 [debug] Executing function "calculator"

...
Finished in 2.6 seconds (0.00s async, 2.6s sync)
8 tests, 1 failure

Randomized with seed 604620
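For what it's worth, the raise originates in Message.new_tool_result!/1, so one plausible reading (my assumption, not confirmed against the source) is that the Google adapter is putting content on the :tool message, while validation requires tool output to live in tool_results with content left nil:

```elixir
# Guess at the failing validation, assuming a :tool role message must keep
# content nil and carry its output in tool_results instead. Constructor
# names are taken from the stacktrace above; signatures are assumptions.
alias LangChain.Message
alias LangChain.Message.ToolResult

result = ToolResult.new!(%{tool_call_id: "call_123", content: "200"})

# This shape should be valid:
Message.new_tool_result!(%{tool_results: [result]})

# Whereas something like this would raise "content: is invalid for role tool":
# Message.new_tool_result!(%{content: "200", tool_results: [result]})
```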

I am happy to get on a call with any devs to work this out.