Significant-Gravitas / AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
https://agpt.co
MIT License

Support using other/local LLMs #25

Closed DataBassGit closed 1 month ago

DataBassGit commented 1 year ago

You can modify the code to accept a config file as input, and read the Chosen_Model flag to select the appropriate AI model. Here's an example of how to achieve this:

Create a sample config file named config.ini:

[AI]
Chosen_Model = gpt-4

Offload call_ai_function from ai_functions.py into a separate library, and modify it to read the model from the config file:

import configparser
import openai

def call_ai_function(function, args, description, config_path="config.ini"):
    # Load the configuration file
    config = configparser.ConfigParser()
    config.read(config_path)

    # Get the chosen model from the config file
    model = config.get("AI", "Chosen_Model", fallback="gpt-4")

    # Parse args to comma separated string
    args = ", ".join(args)
    messages = [
        {
            "role": "system",
            "content": f"You are now the following python function: ```# {description}\n{function}```\n\nOnly respond with your `return` value.",
        },
        {"role": "user", "content": args},
    ]

    # Use different AI APIs based on the chosen model
    if model == "gpt-4":
        response = openai.ChatCompletion.create(
            model=model, messages=messages, temperature=0
        )
    elif model == "some_other_api":
        # Add code to call another AI API with the appropriate parameters
        response = some_other_api_call(parameters)
    else:
        raise ValueError(f"Unsupported model: {model}")

    return response.choices[0].message["content"]

In this modified version, call_ai_function takes an additional parameter, config_path, which defaults to "config.ini". The function reads the config file, retrieves the Chosen_Model value, and uses it as the model for the OpenAI API call. If the Chosen_Model flag is not found in the config file, it defaults to "gpt-4".

The if/elif structure is used to call different AI APIs based on the chosen model from the configuration file. Replace some_other_api with the name of the API you'd like to use, and replace parameters with the appropriate parameters required by that API. You can extend the if/elif structure to include more AI APIs as needed.
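
For illustration, here is a minimal usage sketch of the modified function. The analyze_code signature, sample argument, and description are assumptions made up for this example, not code taken from the repository:

# Hypothetical usage of the modified call_ai_function (illustrative only).
function = "def analyze_code(code: str) -> list[str]:"
args = ["'def add(a, b): return a + b'"]
description = "Analyzes the given code and returns a list of suggested improvements."

suggestions = call_ai_function(function, args, description, config_path="config.ini")
print(suggestions)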

keldenl commented 1 year ago

I made a drop-in replacement for OpenAI via llama.cpp (had some issues with the Python binding llama-cpp-python). Almost got it working with AutoGPT; I just need to fully support same-dimension embeddings, so if anybody has any pointers please let me know. I got Auto-GPT working for a couple of cycles, but it's really inconsistent.

https://github.com/keldenl/gpt-llama.cpp

Update: I JUST got AutoGPT working with llama.cpp! See https://github.com/keldenl/gpt-llama.cpp/issues/2#issuecomment-1514353829

I'm using Vicuna for embeddings and generation, but it's struggling a bit to generate proper commands and not fall into an infinite loop of attempting to fix itself X( Will look into this tomorrow, but super exciting because I got the embeddings working! (Turns out it was a bug on my end, lol)

I had to make some changes to AutoGPT (added a configurable base URL for the OpenAI API and adjusted the dimensions of the embedding vector), but otherwise left it alone.
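
For readers wondering what that kind of change looks like, here is a minimal sketch, assuming the pre-1.0 openai Python SDK where the base URL is configured via openai.api_base; the URL and key below are placeholders, not gpt-llama.cpp's actual defaults:

import os
import openai

# Point the OpenAI client at a local OpenAI-compatible server (e.g. gpt-llama.cpp)
# instead of api.openai.com. The URL and key are placeholders for illustration.
openai.api_base = os.getenv("OPENAI_API_BASE", "http://localhost:8000/v1")
openai.api_key = os.getenv("OPENAI_API_KEY", "dummy-key")  # most local servers ignore the key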

Vitorhsantos commented 1 year ago

There is this one too: https://github.com/rhohndorf/Auto-Llama-cpp

We face the same problem here: Vicuna understands the commands given via the prompt differently from GPT-4, so the commands need to be reworked.

keldenl commented 1 year ago

I wonder if there's another similar agent-based package, designed with GPT-3 and shorter prompts in mind, that could help with Vicuna generation.

MarkSchmidty commented 1 year ago

StableLM was released today. It has a 4096-token context window, and StableLM 15B will be trained on 1.5T tokens (more than LLaMA).

InfernalDread commented 1 year ago

> I made a drop-in replacement for OpenAI via llama.cpp (had some issues with the Python binding llama-cpp-python). Almost got it working with AutoGPT; I just need to fully support same-dimension embeddings, so if anybody has any pointers please let me know. I got Auto-GPT working for a couple of cycles, but it's really inconsistent. https://github.com/keldenl/gpt-llama.cpp
>
> Update: I JUST got AutoGPT working with llama.cpp! See keldenl/gpt-llama.cpp#2 (comment)
>
> I'm using Vicuna for embeddings and generation, but it's struggling a bit to generate proper commands and not fall into an infinite loop of attempting to fix itself X( Will look into this tomorrow, but super exciting because I got the embeddings working! (Turns out it was a bug on my end, lol)
>
> I had to make some changes to AutoGPT (added a configurable base URL for the OpenAI API and adjusted the dimensions of the embedding vector), but otherwise left it alone.

Will you be sharing your fork of AutoGPT so that we can test it out? Thank you by the way for all the work you have done!

keldenl commented 1 year ago

@InfernalDread I'll do it after work today! I'll post it in the issue in that repo, but I'll post it here too.

InfernalDread commented 1 year ago

@keldenl Fantastic! Once again, thank you for all your efforts in making this dream a reality!

DGdev91 commented 1 year ago

Thanks to @keldenl for his work!

I made a pull request for the changes mentioned by him #2594

keldenl commented 1 year ago

Awesome stuff @DGdev91, I'll write up a quick guide tonight on using that fork + gpt-llama.cpp.

keldenl commented 1 year ago

✨ FULL GUIDE POSTED on how to get Auto-GPT running with llama.cpp via gpt-llama.cpp in https://github.com/keldenl/gpt-llama.cpp/issues/2#issuecomment-1515738173.

Huge shoutout to @DGdev91 for the PR and hope it gets merged soon!

keldenl commented 1 year ago

Quick follow-up: although gpt-llama.cpp worked with Auto-GPT at the time of posting the guide, there were various bugs that kept it from working consistently. I've gone ahead and fixed all the ones I could find yesterday and today, and it's running quite well! (Full details here, but it runs continuously forever: https://github.com/keldenl/gpt-llama.cpp/issues/2#issuecomment-1519287093)

Now it's down to specifically getting better responses from Vicuna :')

MarkSchmidty commented 1 year ago

@keldenl Have you heard of SuperCOT? It's a LoRA made for use with LangChain. It may work well merged with Vicuna as an Auto-GPT backend.

keldenl commented 1 year ago

@MarkSchmidty I have not! I'm going to go ahead and try it tonight and report the results here~

MarkSchmidty commented 1 year ago

@keldenl The merged versions you can find on HuggingFace are merged with LLaMA. They may perform well. But you can merge with a Vicuna model yourself or keep your eye on this Vicuna/SuperCOT finetune currently in the works: https://huggingface.co/reeducator/vicuna-13b-free/discussions/11#64453a62f993c804b0338fa8

Right now they're merging the two datasets and working on removing things like "As an AI language model, I can't do X" from the Vicuna dataset.

keldenl commented 1 year ago

I'm gonna try out the LLaMA SuperCOT version first, and I'll keep an eye on the Vicuna version 👀. Thank you! I hadn't even heard about SuperCOT before this.

MarkSchmidty commented 1 year ago

I forgot, llama.cpp supports applying LoRAs as of a few days ago. There's a GGML version of the SuperCOT LoRA which can be applied to Vicuna-13B at run time here: https://huggingface.co/kaiokendev/SuperCOT-LoRA/tree/main/13b/ggml/cutoff-2048

You might try that as well as the LLaMA merged version.

9cento commented 1 year ago

> I forgot, llama.cpp supports applying LoRAs as of a few days ago. There's a GGML version of the SuperCOT LoRA which can be applied to Vicuna-13B at run time here: https://huggingface.co/kaiokendev/SuperCOT-LoRA/tree/main/13b/ggml/cutoff-2048
>
> You might try that as well as the LLaMA merged version.

Hi there, quick question: does this also work with more demanding models like vicuna-13B-4bits-128 and gpt4-x-alpaca, or is it specifically tailored to llama.cpp / .cpp models in general? Thanks

MarkSchmidty commented 1 year ago

There are SuperCOT LoRAs for both GPU and CPU in 30B, 13B, and 7B sizes, as well as merged models for GPU. Some of them are listed in the readme here: https://huggingface.co/kaiokendev/SuperCOT-LoRA You can use the LoRA with any finetune. But they'll work best with finetunes trained with the same prompting format.

Others are buried in the files here: https://huggingface.co/kaiokendev/SuperCOT-LoRA/tree/main

For GPU 13B you would want one of these two LoRAs, depending on the cutoff length of the finetune you're using it with: https://huggingface.co/kaiokendev/SuperCOT-LoRA/tree/main/13b/gpu (If you don't know, just try both.)

9cento commented 1 year ago

> There are SuperCOT LoRAs for both GPU and CPU in 30B, 13B, and 7B sizes, as well as merged models for GPU. Some of them are listed in the readme here: https://huggingface.co/kaiokendev/SuperCOT-LoRA You can use the LoRA with any finetune. But they'll work best with finetunes trained with the same prompting format.
>
> Others are buried in the files here: https://huggingface.co/kaiokendev/SuperCOT-LoRA/tree/main
>
> For GPU 13B you would want one of these two LoRAs, depending on the cutoff length of the finetune you're using it with: https://huggingface.co/kaiokendev/SuperCOT-LoRA/tree/main/13b/gpu (If you don't know, just try both.)

Sorry, but I'm relatively new to this stuff, so I'll ask you two more questions. First, I'd just like to know if your method works for virtually every model (not .cpp only, just to be clear), and second, whether a LoRA is mandatory or optional. Again, forgive my confusion.

MarkSchmidty commented 1 year ago

LoRAs are always optional. They're small files which modify a model. In this case, SuperCOT is a LoRA which modifies a model to make it work with LangChain (and Auto-GPT) better. Any model will work with Auto-GPT. But some work better than others. You can use the SuperCOT LoRAs with any LLaMA based model (including the two you listed). But they're in different formats if they're for cpp or for GPU models. (The different formats are clearly labeled at those links.)

Boostrix commented 1 year ago

Also see: #2158 and #25 or #348 / #347

manyMachines commented 1 year ago

Documentation for Google's PaLM APIs:

https://developers.generativeai.google/api

It would be nice to optionally use their embeddings in Auto-GPT as well.

Currently, for preview users of text-bison-001, the input token limit is 8196, the output limit is 1024, and requests are rate-limited to 30 per minute.
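
A hedged sketch of calling that API with the google.generativeai client. The text-bison-001 model name and limits come from the comment above, while the embedding model name and response fields are assumptions to check against the PaLM docs:

import google.generativeai as palm

palm.configure(api_key="YOUR_PALM_API_KEY")  # placeholder key

# Text generation with text-bison-001 (preview limits per the comment above:
# ~8196 input tokens, 1024 output tokens, 30 requests per minute).
completion = palm.generate_text(
    model="models/text-bison-001",
    prompt="Summarize the goal of an autonomous agent in one sentence.",
    temperature=0,
)
print(completion.result)

# Embeddings (model name is an assumption; adjust to whatever the docs list).
embedding = palm.generate_embeddings(
    model="models/embedding-gecko-001",
    text="Hello world",
)
print(len(embedding["embedding"]))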

lc0rp commented 1 year ago

Linking similar LLM-related comments here and closing them:

GPT4ALL:

mudler commented 1 year ago

For anyone who wants to try AutoGPT locally, I've created an example for LocalAI that you can run easily with docker-compose in just one command: https://github.com/go-skynet/LocalAI/tree/master/examples/autoGPT

There is no need to make any changes to AutoGPT; it is enough to set OPENAI_API_BASE as an environment variable pointing to the LocalAI instance.
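
A minimal sketch of the resulting round trip, assuming the pre-1.0 openai Python SDK; the URL and model name are placeholders for whatever your LocalAI instance actually exposes:

import os
import openai

# AutoGPT only needs OPENAI_API_BASE in its environment; the same mechanism
# can be exercised directly like this.
openai.api_base = os.environ.get("OPENAI_API_BASE", "http://localhost:8080/v1")  # placeholder URL
openai.api_key = os.environ.get("OPENAI_API_KEY", "not-needed-locally")

response = openai.ChatCompletion.create(
    model="ggml-gpt4all-j",  # placeholder: use the model name your LocalAI instance serves
    messages=[{"role": "user", "content": "Say hello"}],
    temperature=0,
)
print(response.choices[0].message["content"])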

alkeryn commented 1 year ago

@mudler neat, someone also mentioned basaran earlier: https://github.com/hyperonym/basaran. So how does it perform in your experience?

Boostrix commented 1 year ago

The ini file is a bit funny given how everything else already uses YAML 👍😉

mudler commented 1 year ago

> @mudler neat, someone also mentioned basaran earlier: https://github.com/hyperonym/basaran. So how does it perform in your experience?

I haven't tried basaran; I've only tried ggml models with LocalAI. WizardLM and vicuna-cot seem promising. Not at OpenAI levels, but definitely a step in the right direction!

mudler commented 1 year ago

> The ini file is a bit funny given how everything else already uses YAML 👍😉

The env file, you mean? You can also put the env variables in the docker-compose file and make it one file only :)

xloem commented 1 year ago

It’s great to learn of SuperCOT, but of course any efforts to collect and/or curate data specifically for Auto-GPT will produce an even more powerful model.

isaaclepes commented 12 months ago

I still dream of a day when we can use Petals.dev's API for distributed processing. It even allows you to make your own private swarm. I am picking up old, free computers off Craigslist and adding cheap/free GPUs to them for my private swarm.

haochuan-li commented 10 months ago

I'm new to this. I'm wondering if AutoGPT is able to call a custom URL (other than the OpenAI API) to get responses, so that we can use other serving systems like TGI or vLLM to serve our own LLM?

DGdev91 commented 10 months ago

> I'm new to this. I'm wondering if AutoGPT is able to call a custom URL (other than the OpenAI API) to get responses, so that we can use other serving systems like TGI or vLLM to serve our own LLM?

Oh, is this thread still open?

Well, it's possible now. Just set the OPENAI_API_BASE variable and you can use any service which is compliant with OpenAI's API.

...But local LLMs aren't as good as GPT-4, and I never got very far, even though it is technically possible to use them. So I gave up some time ago.

Maybe some recent long-context LLMs like llongma and so on can actually work, but I never tried them.

9cento commented 10 months ago

> > I'm new to this. I'm wondering if AutoGPT is able to call a custom URL (other than the OpenAI API) to get responses, so that we can use other serving systems like TGI or vLLM to serve our own LLM?
>
> Oh, is this thread still open?
>
> Well, it's possible now. Just set the OPENAI_API_BASE variable and you can use any service which is compliant with OpenAI's API.
>
> ...But local LLMs aren't as good as GPT-4, and I never got very far, even though it is technically possible to use them. So I gave up some time ago.
>
> Maybe some recent long-context LLMs like llongma and so on can actually work, but I never tried them.

Did you give Llama 2 a try? CodeLlama?

DataBassGit commented 10 months ago

The issue with open source models is that they are trained differently. Getting a response in a specific format requires either fine tuning of the model or modification of the prompts. AutoGPT wasn't designed to make it easy to edit the prompts, and fine tuning is expensive. Eventually, I just built my own agent framework.

chymian commented 10 months ago

One Proxy to rule them all!

https://github.com/BerriAI/litellm/

It's an API proxy with a vast choice of backends, like Replicate, OpenAI, Petals, ... and it works like a charm. Please implement!
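
For context, a hedged sketch of what litellm's unified interface looks like; the model identifiers and local URL follow litellm's provider/model convention but are illustrative examples, not a tested configuration:

from litellm import completion

messages = [{"role": "user", "content": "List three uses for an autonomous agent."}]

# Same call shape for a hosted and a local backend (identifiers are illustrative);
# responses follow the OpenAI-style shape.
hosted = completion(model="gpt-3.5-turbo", messages=messages)
local = completion(model="ollama/llama2", messages=messages, api_base="http://localhost:11434")

print(hosted.choices[0].message.content)
print(local.choices[0].message.content)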

Wladastic commented 6 months ago

> One Proxy to rule them all!
>
> https://github.com/BerriAI/litellm/
>
> It's an API proxy with a vast choice of backends, like Replicate, OpenAI, Petals, ... and it works like a charm. Please implement!

Thank you for your suggestion, I will take a look at this.

alkeryn commented 6 months ago

@Wladastic just so you know, text-generation-webui has an OpenAI-compatible API now; you can look in the wiki for how to make the openai client use its API.

You can see how to set it up here: https://github.com/oobabooga/text-generation-webui/wiki/12-%E2%80%90-OpenAI-API

You will want to do something like: export OPENAI_API_BASE=http://localhost:5000/v1

Wladastic commented 6 months ago

I know about that one, I was talking about the project mentioned above

Pwuts commented 6 months ago

Coming soon... :)

impredicative commented 5 months ago

This request is more relevant if Claude 3 Opus is actually better than GPT-4, at least for some types of tasks.

github-actions[bot] commented 4 months ago

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

GoZippy commented 4 months ago

I got your activity

alkeryn commented 4 months ago

I mean, at that point it is pretty easy to tell it to use other OpenAI-compatible backends with the env variables. And there must be proxies for other providers too, i.e. ones that convert a non-OpenAI API into an OpenAI-compatible one; if not, it's pretty trivial to build.
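
As a rough illustration of how small such a shim can be, here is a minimal sketch of an OpenAI-compatible proxy. FastAPI is chosen only for the example, and call_my_backend is a hypothetical stand-in for whatever provider you actually forward to:

from typing import Dict, List

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    model: str
    messages: List[Dict[str, str]]
    temperature: float = 0.0

def call_my_backend(prompt: str) -> str:
    # Hypothetical: forward the prompt to any non-OpenAI provider and return its text.
    return "stub response"

@app.post("/v1/chat/completions")
def chat_completions(req: ChatRequest):
    # Flatten the chat history into one prompt and wrap the reply in the
    # minimal OpenAI-style response shape that clients such as Auto-GPT expect.
    prompt = "\n".join(m.get("content", "") for m in req.messages)
    text = call_my_backend(prompt)
    return {
        "id": "chatcmpl-local",
        "object": "chat.completion",
        "model": req.model,
        "choices": [
            {"index": 0, "message": {"role": "assistant", "content": text}, "finish_reason": "stop"}
        ],
        "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
    }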

ntindle commented 3 months ago

We've made lots of progress on this front recently with @Pwuts's work on additional providers. We can now begin the work on more open providers. We will likely be starting with llamafile and then moving out from there.

Cattacker commented 1 month ago

This is an automated vacation reply from QQ Mail. I have received your email.