jadengis closed this 5 months ago
Nice!!
@jadengis from what I've seen in the code so far, it appears that the Google AI server only returns a single version of the assistant's message. For instance, the OpenAI API has the `n` parameter for the number of output versions the server should generate. Think of it like running the same step in the conversation with different seeds and seeing how differently each version is generated.
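As a concrete illustration of the `n` parameter mentioned above (a minimal sketch: the request fields follow the public Chat Completions API, but the response dict here is mocked, not a real server reply):

```python
# Minimal sketch of OpenAI's "n" parameter. Field names follow the public
# Chat Completions API; no network call is made here.
payload = {
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}],
    "n": 3,  # ask the server for 3 alternative completions
}

# The response then carries one entry per completion in "choices" (mocked
# below), which is why the return type allows a list of messages.
mock_response = {
    "choices": [
        {"index": i, "message": {"role": "assistant", "content": f"variant {i}"}}
        for i in range(payload["n"])
    ]
}
messages = [choice["message"] for choice in mock_response["choices"]]
print(len(messages))  # one message per requested candidate -> 3
```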
Anyway, I've never seen another model do that and it complicates the return types. From what you've seen of the Google API, does it have that capability?
I'm considering removing support for that and cleaning up the return type for `ChatOpenAI.call`:

```elixir
@type call_response :: {:ok, Message.t() | [Message.t()]} | {:error, String.t()}
```

It would just be `{:ok, Message.t()}` instead of an optional array of messages.
What are your thoughts?
@brainlid I think the Google AI API actually does support returning multiple versions of the message. The response JSON for the generation method in the docs contains a `candidates` array, which I believe should contain all the versions that were generated. By default it seems to generate only 1 version.
The option for setting this seems undocumented, however. It doesn't appear in the model parameters documentation, but I did find a `candidatesCount` option in the official JavaScript SDK, so I think it should work. That is, sending a request like
```json
{
  "contents": [
    { "parts": [{ "text": "User message" }] }
  ],
  "generationConfig": {
    "candidatesCount": 2
  }
}
```
should return 2 candidates in the response.
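For what it's worth, consuming such a response would look something like this sketch (the `candidates` → `content` → `parts` nesting matches the documented response schema; the literal JSON below is fabricated for illustration):

```python
import json

# Fabricated example of a multi-candidate Gemini response. The
# candidates/content/parts/text nesting follows the documented schema;
# the text values are made up.
response = json.loads("""
{
  "candidates": [
    {"content": {"role": "model", "parts": [{"text": "Version A"}]}},
    {"content": {"role": "model", "parts": [{"text": "Version B"}]}}
  ]
}
""")

# One generated message per candidate, mirroring OpenAI's "choices" array.
texts = [c["content"]["parts"][0]["text"] for c in response["candidates"]]
print(texts)  # -> ['Version A', 'Version B']
```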
I personally don't have a use case for returning multiple versions of a message, but in the interest of keeping things flexible, and since there are two big LLM providers that support it, it probably makes sense to leave the type as is. I can also imagine scenarios where generating multiple candidates would be useful.
@brainlid Hey, is there any way I can help to get this PR into main? Willing to pitch in if there is any preliminary work required. :pray:
@brainlid interested in your thoughts here! I have been looking into integrating Ollama chat, and if this PR is merged, it creates a nice seam for me to start.
Do you have any hesitations on implementation?
@medoror: Have you used this PR? Have you tested with it at any level?
@jadengis I'm not currently setup for testing/verifying the Google endpoint. However, if you're able to help support/fix issues with the integration, then I'm okay to merge it in.
@brainlid I'm using the Google engine currently in an application, so I've got no problem supporting/fixing issues with the integration. I'll more likely than not need those fixes anyway. As written, it's been working in production without issue for 3–4 weeks.
There are a few merge conflicts it looks like. Are there any big changes I should pay attention to in resolving these conflicts? :pray:
@jadengis Sounds good! The merge conflicts should be pretty clean. Can you merge them into your branch?
@brainlid I've updated the PR to be in line with the current `main`. Updated the Google model to use the same `into` Req trick that you added to OpenAI for streaming, and factored the chunk-processing code out into a shared location. That's pretty much all the changes. Let me know if it looks good to you :pray:
@jadengis Thanks for all the work you've put into this!
❤️💛💙💜
Summary
This PR adds a `ChatGoogleAI` model that wraps interactions with the Google AI REST APIs for the purposes of integrating with LangChain, thus closing #6. This change supports the full set of Gemini Pro features, including non-streamed responses, streamed responses, and function calling.
Details
Differences with OpenAI
- Uses the `"model"` role instead of `"assistant"`.
- Implemented the `for_api/` behaviour using plain old pattern matching. The protocol approach seems a little roundabout.

Quirks
- Streaming requires an `alt=sse` query param added to the URL. This is undocumented, but I noticed it being used in the official SDKs.
- The API returns `finishReason: "STOP"` for basically everything, including message deltas. This doesn't jive well with some of the existing logic for tracking when, e.g., streaming deltas complete. This behaviour is faked in the `ChatGoogleAI` module.
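To illustrate the `finishReason` quirk described above (a sketch only: the chunk dicts below are simplified stand-ins, not the exact wire format, and the workaround shown is one possible strategy rather than the module's literal implementation):

```python
# Google sends finishReason "STOP" on every streamed delta, so it cannot be
# used to detect the final chunk. Simplified, fabricated chunks:
chunks = [
    {"candidates": [{"content": {"parts": [{"text": "Hel"}]}, "finishReason": "STOP"}]},
    {"candidates": [{"content": {"parts": [{"text": "lo!"}]}, "finishReason": "STOP"}]},
]

# Naive approach: stop at the first finishReason == "STOP" -> truncates output.
naive = []
for chunk in chunks:
    cand = chunk["candidates"][0]
    naive.append(cand["content"]["parts"][0]["text"])
    if cand.get("finishReason") == "STOP":
        break
print("".join(naive))  # -> "Hel" (truncated)

# Workaround in the spirit described above: consume the whole SSE stream and
# only mark the message complete when the stream itself closes.
full = "".join(c["candidates"][0]["content"]["parts"][0]["text"] for c in chunks)
print(full)  # -> "Hello!"
```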