Support for Google Gemini OOTB

mhebrard-bigid commented 6 months ago

LiteLLM already support Gemini so it's probably already doable. Would be nice to support it OOTB as Gemini has a large context window

zarlor commented 6 months ago

It is doable now, but I find I have to add an extra step to it, after you set up the custom LLM for Gemini, and as long as the docker containers are running, you then have to run the following two commands (assuming you're using docker for this):

docker exec -it danswer-stack-api_server-1 pip install -q google-generativeai docker exec -it danswer-stack-background-1 pip install -q google-generativeai

You'll have to run them everytime you restart Danswer. If you're running from source then you can just run the "pip install googe-generativeai" command locally and you should be good to go. But agreed, it would be nice to not have to do that and just have Gemini "work" right out of box. I've been using Gemini 1.5 pro mostly, lately, and it's does seem to do a pretty decent job (and it doesn't hurt that it's free for the moment! :D)

gmoulard commented 5 months ago

I can run this 2 command. I don't have error message. But I don't know how to setup and use Gemini. I don't view any changes on admin interface.

docker exec -it danswer-stack-api_server-1 pip install -q google-generativeai WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

docker exec -it danswer-stack-backgdocker exec -it danswer-stack-background-1 pip install -q google-generativeai WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

zarlor commented 5 months ago

I can run this 2 command. I don't have error message. But I don't know how to setup and use Gemini. I don't view any changes on admin interface.

docker exec -it danswer-stack-api_server-1 pip install -q google-generativeai WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

docker exec -it danswer-stack-backgdocker exec -it danswer-stack-background-1 pip install -q google-generativeai WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

So that warning isn't a problem, it's just indicating there is a newer version of pip available within those containers. You didn't get any real errors so you are all set for Gemini to work. However, did you configure a Gemini LLM in Danswer? You still need to go to the Admin Panel and under Model Configs select LLM. From there you need to hit the button at the bottom for "Add Custom LLM Provider" because Gemini is not configured by default as an available LLM, that's why you wouldn't see any changes. Fill in Display name of gemini, Provider is also gemini, put in your API key in the field for that. Then all the way down under Model Names use whichever (or all) of the Model Names listed near the bottom of https://docs.litellm.ai/docs/providers/gemini. Also use one of those names in the Default Model Name field. You don't really need anything more than that so just hit "Test" to verify it's all good so you can save it.

mhebrard-bigid commented 5 months ago

Works nicely with @zarlor instructions. I'm not sure danswer is optimized for leveraging the 1M token window. The best I could configure was to increase the number of chunks to 20 instead of 10 when configuring a new assistant. So at most we use 8K tokens for context injection.

zarlor commented 5 months ago

I did the same, but haven't seen any issues on the sending front in terms of context. I have seen what seem to be limitations on the receiving side, though. I was doing a small project here trying to see what Danswer might be able to do for creating code for a Discord connector for Danswer and when responding with larger responses it would get cut off. There's probably someplace to change the incoming and outgoing context windows but I'm not sure where or what that would be, personally.

It also doesn't (the last time I checked) accept uploads of things like images within Danswer where it was just saying this isn't whatever it's called for being an image accepting/processing model, even though it is (I guess it just assumes any custom LLM can't handle it). So there are definitely some extra limitations for now but I don't find them too horrible. Even with the code thing I was able to tell Gemini to do things like no include any extra explanatory text, just the code, to get it all, or something like "I received up to the last two lines that say 'blah-blah'. Was that the end of your response and, if not, would you please send the rest of the response starting with those two lines" as well as things like copy/paste sending the entire code snippets back for verification (which, come to think of it, was a decently large context and Danswer didn't complain about sending it and it seemed like Gemini got the whole thing, so...)

danswer-ai / danswer

Support for Google Gemini OOTB #1405