jordantgh opened 1 year ago
Also part of the reason I was thinking about this is it would possibly make it easier to implement #2.
LiteLLM looks great. The manual abstraction is definitely annoying, and that looks like a better solution. I also tested some Llama base models on Replicate; I'm going to push some of the results later today. It looks like LiteLLM supports Replicate as well.
Nice, I'll probably be able to look at submitting a PR in the coming weekend.
Hi, I'm the maintainer of LiteLLM - happy to make the PR too
It certainly won't bother me haha, and I'd guess it'll take you about 5% of the time it would take me 😁
On it! Would you be open to hopping on a quick call (10 mins)?
I'd love to understand how LiteLLM can be better for you + chess_gpt_eval
sharing my calendly here for your convenience: https://calendly.com/ishaan-berri/30min?month=2023-09
Not the maintainer of this repo btw, although we did happen to chat on a repo of mine 😁 In my case I don't have much to say beyond what I wrote already.
@adamkarvonen I haven't had time this weekend to do anything on this. I took a look at the code and it's hard for me to untangle. If you want to discuss, we can; otherwise it'll take longer.
Sure. To replace my manual abstraction with LiteLLM, you should be able to replace these lines: https://github.com/adamkarvonen/chess_gpt_eval/blob/local_llama/gpt_query.py#L68-L79 with a call to LiteLLM. Based on the snippet above, that could be something like `response = completion(model="gpt-4", messages=message_input)`.

The `messages` list is currently a list of OpenAI-style input dicts, where the "role" key can be "system", "user", or "assistant". Because there isn't a back-and-forth with the model, this structure is somewhat unnecessary; I was just reusing code from an earlier project.
Base models and completion models don't need a system message, but chat models like Llama2-chat or GPT-4 need one to produce the proper output format: https://github.com/adamkarvonen/chess_gpt_eval/blob/local_llama/gpt_query.py#L23
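For concreteness, here's a minimal sketch of what that replacement could look like. The wrapper name `get_response` and the system-prompt text are illustrative stand-ins, not names from the repo; the `completion` call itself follows the snippet quoted above.

```python
from litellm import completion

def get_response(model: str, message_input: list[dict]) -> str:
    # message_input is a list of OpenAI-style dicts; the "role" key is
    # "system", "user", or "assistant".
    response = completion(model=model, messages=message_input)
    # LiteLLM returns an OpenAI-style response object.
    return response.choices[0].message.content

# Chat models (GPT-4, Llama2-chat) need the system message to keep the
# expected output format; base/completion models can omit it.
# The prompt text below is a placeholder, not the repo's actual prompt.
message_input = [
    {"role": "system", "content": "You are a chess engine. Continue the game."},
    {"role": "user", "content": "1. e4 e5 2. Nf3"},
]
print(get_response("gpt-4", message_input))
```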
Does that help? I can take a look at this later as well; currently I'm trying to add support for the nanoGPT repository. I'll document my changes and clean up the code once Llama and nanoGPT support is added and working.
I'd be willing to work on this but would like your opinion before submitting any PR.
From a quick scan of your `gpt_query.py`, it looks like you're doing more work than necessary to support different APIs. LiteLLM provides a nice abstraction layer (I guess there may be others out there). You would just write `response = completion(model="gpt-4", messages=message_input)` for OAI or `response = completion(model="openrouter/openai/gpt-4", messages=message_input)` for OpenRouter, etc. (Azure, Anthropic, Cohere, and HF models are also supported), and set the appropriate environment variables for the chosen model. We could provide a `.env_example` with all the API keys that the user can copy.
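To make the idea concrete, here's a sketch of how the same call could target different providers. The environment-variable names follow LiteLLM's documented conventions and the key values are placeholders; treat both as assumptions to check against your LiteLLM version.

```python
import os
from litellm import completion

# In practice the keys would come from a .env file (per the proposed
# .env_example); they're set inline here only to keep the sketch
# self-contained. Variable names follow LiteLLM's conventions.
os.environ["OPENAI_API_KEY"] = "sk-..."         # placeholder OpenAI key
os.environ["OPENROUTER_API_KEY"] = "sk-or-..."  # placeholder OpenRouter key

message_input = [{"role": "user", "content": "1. e4 e5 2. Nf3"}]

# The call shape stays the same; only the model string selects the provider.
openai_response = completion(model="gpt-4", messages=message_input)
openrouter_response = completion(
    model="openrouter/openai/gpt-4", messages=message_input
)
```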