brainlid / langchain

Elixir implementation of a LangChain style framework.
https://hexdocs.pm/langchain/

Add support for Google AI / Gemini Pro model #59

Closed jadengis closed 5 months ago

jadengis commented 6 months ago

Summary

This PR adds a ChatGoogleAI model that wraps interactions with the Google AI Rest APIs for the purposes of integrating with langchain, thus closing #6.

This change supports the full set of Gemini Pro features, including non-streamed responses, streamed responses and function calling.

Details

Differences with OpenAI

Quirks

arbaaz commented 6 months ago

Nice!!

brainlid commented 6 months ago

@jadengis from what I've seen in the code so far, it appears that the Google AI server only returns a single version of the assistant's message. For instance, the OpenAI API has the n parameter for the number of output versions the server should generate. Think of it as running the same step in the conversation with different seeds and seeing how differently each version comes out.

Anyway, I've never seen another model do that and it complicates the return types. From what you've seen of the Google API, does it have that capability?

I'm considering removing support for that and cleaning up the return type for ChatOpenAI.call

  @type call_response :: {:ok, Message.t() | [Message.t()]} | {:error, String.t()}

It would just be {:ok, Message.t()} instead of an optional array of messages.

What are your thoughts?
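To make the trade-off concrete, here is a rough sketch (not from the PR) of what callers face under each spec. The `handle_one/1`, `handle_many/1`, and `handle_error/1` helpers are hypothetical, and the exact `ChatOpenAI.call` arguments are assumed for illustration:

```elixir
# With the current union type, callers must branch on both shapes:
case ChatOpenAI.call(chat_model, messages, functions) do
  {:ok, %Message{} = message} -> handle_one(message)
  {:ok, messages} when is_list(messages) -> handle_many(messages)
  {:error, reason} -> handle_error(reason)
end

# If multi-version support is dropped, i.e.
#   @type call_response :: {:ok, Message.t()} | {:error, String.t()}
# the caller collapses to a single success clause:
case ChatOpenAI.call(chat_model, messages, functions) do
  {:ok, %Message{} = message} -> handle_one(message)
  {:error, reason} -> handle_error(reason)
end
```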

jadengis commented 6 months ago

@brainlid I think the Google AI API actually does support returning multiple versions of the message. The response JSON for the generation method in the docs contains a candidates array, which I believe should contain all the versions that were generated. By default it seems to generate only 1 version.

The option for setting this seems undocumented, however. It doesn't appear in the model parameters documentation, but I did find a candidateCount option in the official JavaScript SDK, so I think it should work. That is, sending a request like

{
  "contents": [
    { "parts": [{ "text": "User message" }] }
  ],
  "generationConfig": {
    "candidateCount": 2
  }
}

should return 2 candidates in the response.
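For reference, a minimal sketch of that request using Req (the HTTP client this repo uses) might look like the following. The endpoint path, model name, and candidateCount field are assumptions based on the public docs and SDK, not code from this PR:

```elixir
# Hypothetical sketch: request two candidates from the Gemini REST API.
# Endpoint, model name, and field names are assumptions for illustration.
api_key = System.fetch_env!("GOOGLE_API_KEY")

body = %{
  contents: [%{parts: [%{text: "User message"}]}],
  generationConfig: %{candidateCount: 2}
}

resp =
  Req.post!(
    "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent",
    params: [key: api_key],
    json: body
  )

# Each element of "candidates" should be one generated version of the reply.
candidates = resp.body["candidates"]
```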

I personally don't have a use case for returning multiple versions of a message, but in the interest of keeping things flexible, and since two major LLM providers support it, it probably makes sense to leave the type as is. I can think of use cases where generating multiple candidates would be useful.

jadengis commented 5 months ago

@brainlid Hey, is there any way I can help get this PR into main? Willing to pitch in if there is any preliminary work required. :pray:

medoror commented 5 months ago

@brainlid interested in your thoughts here! I have been looking into integrating Ollama chat, and if this PR is merged, it creates a nice seam for me to start.

Do you have any hesitations on implementation?

brainlid commented 5 months ago

@medoror: Have you used this PR? Have you tested with it at any level?

brainlid commented 5 months ago

@jadengis I'm not currently set up for testing/verifying the Google endpoint. However, if you're able to help support/fix issues with the integration, then I'm okay with merging it in.

jadengis commented 5 months ago

@brainlid I'm using the Google engine currently in an application, so I've got no problem supporting / fixing issues with the integration. I'll more likely than not need those fixes anyway. As written, it's been working in production without issue for 3 - 4 weeks.

There are a few merge conflicts it looks like. Are there any big changes I should pay attention to in resolving these conflicts? :pray:

brainlid commented 5 months ago

@jadengis Sounds good! The merge conflicts should be pretty clean. Can you merge them into your branch?

jadengis commented 5 months ago

@brainlid I've updated the PR to be in line with the current main. Updated the Google model to use the same `into` Req trick that you added to OpenAI for streaming, and factored the shared chunk-processing code out into one place. That's pretty much all the changes. Let me know if it looks good to you :pray:

brainlid commented 5 months ago

@jadengis Thanks for all the work you've put into this!

❤️💛💙💜