Closed: kenfink closed this issue 1 year ago.
Here's a workaround that seems to work for now. It's totally inappropriate for future growth, so I'm not creating a PR for this. But as they say, a stupid idea that works isn't stupid.
In /llms/openai.py, line 23 is: openai.api_key = api_key. Add line 24 beneath it: openai.api_base = api_key
Then, in the web UI, set the OpenAI API Key to http://YOUR.HOST.IP:PORT/v1. Be sure to leave off the trailing slash at the end of the endpoint; the backend server adds it back in.
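For clarity, here is a minimal sketch of what those two lines end up doing. The apply_workaround helper is hypothetical (in the real file the assignments live inside the OpenAi class's __init__); it just shows how the value entered as the "API key" in the web UI gets reused as the base URL:

import openai  # the pre-1.0 openai SDK, which exposes module-level api_key/api_base

def apply_workaround(api_key):
    # Existing behaviour: the value from the web UI is stored as the API key.
    openai.api_key = api_key
    # Workaround: reuse the same value as the base URL, so entering
    # http://YOUR.HOST.IP:PORT/v1 as the "API key" redirects all requests there.
    # Local OpenAI-compatible servers typically ignore the key anyway.
    openai.api_base = api_key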
I tried this and it doesn't work; it just keeps "thinking" for an extremely long time with no response. Message me on Discord: Kita#7214
I'm still having trouble running the project, but I thought a simple solution would be:
import os  # needed for os.getenv, if not already imported in llms/openai.py

def __init__(self, api_key, image_model=None, model="gpt-4", temperature=0.6, max_tokens=4032, top_p=1,
             frequency_penalty=0,
             presence_penalty=0, number_of_results=1):
    openai.api_base = os.getenv("OPENAI_API_BASE", default="https://api.openai.com/v1")
Then one could simply export an environment variable to set the base URL; if the variable isn't set, the default URL is used.
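As a quick illustration of the fallback behaviour (the URL below is just a placeholder for a local OpenAI-compatible endpoint; in practice you would export the variable in your shell or compose file before launching the backend):

import os

# Illustrative only: simulate exporting the variable before startup.
os.environ["OPENAI_API_BASE"] = "http://localhost:5001/v1"

print(os.getenv("OPENAI_API_BASE", default="https://api.openai.com/v1"))
# -> http://localhost:5001/v1; prints the api.openai.com default if the variable is unset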
I added a static method to the AgentExecutor class (superagi/jobs/agent_executor.py, line 91):
@staticmethod
def get_model_api_base_url():
    base_url = get_config("OPENAI_API_BASE_URL")
    # shell_url = os.getenv("OPENAI_API_BASE_URL")
    return base_url
Then I updated the OpenAi class's initialization function to include the api_base parameter (superagi/llms/openai.py, line 11):
def __init__(self, api_key, api_base="https://api.openai.com/v1", image_model=None, model="gpt-4", temperature=0.6, max_tokens=4032, top_p=1,
             frequency_penalty=0,
             presence_penalty=0, number_of_results=1):
    openai.api_base = api_base
Then, when calling the executor agent, just pass in the parameter (agent_executor.py, around line 151):
spawned_agent = SuperAgi(ai_name=parsed_config["name"], ai_role=parsed_config["description"],
                         llm=OpenAi(api_base=AgentExecutor.get_model_api_base_url(), model=parsed_config["model"], api_key=model_api_key),
                         tools=tools, memory=memory,
                         agent_config=parsed_config)
Finally, in the config.yaml file (line 5):
OPENAI_API_BASE_URL: https://api.openai.com/v1
I'm still trying to run the project, but that's an improvement, because now one can set an OPENAI_API_BASE_URL entry in config.yaml to change the target URL of the OpenAI API.
Okay, I got it to work with text-generation-webui. The solution is a bit hacky at the moment, but the agent is using a local GGML model that is being executed across multiple GPUs. The above solution definitely works. In order to get it to work with TGWUI, though, I had to make the OpenAI API run on my computer's LAN interface, since getting the Docker image to access port 5001 on the host machine's loopback interface was difficult. To make text-generation-webui run on the LAN interface, I edited the extensions/openai/script.py file and added the following:
Below the import statements, at line 17, I added:
# Requires "import socket" if the script does not already import it.
ipsocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
ipsocket.connect(("8.8.8.8", 80))  # no packets are sent; this just selects the outbound interface
localip = ipsocket.getsockname()[0]
This creates a variable containing the IP address of the host machine's primary network interface. On or around line 762 you will find the following line:
server_addr = ('0.0.0.0' if shared.args.listen else '127.0.0.1', params['port'])
Change it to use the localip variable instead of the static string '127.0.0.1':
server_addr = ('0.0.0.0' if shared.args.listen else localip, params['port'])
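Once the extension is bound to the LAN address, a quick way to confirm the endpoint is reachable from another machine (or from inside the SuperAGI container) is to hit the models route, assuming the extension implements the standard /v1/models path and is listening on port 5001; YOUR.HOST.IP is a placeholder for the LAN address discovered above:

import requests  # assumes the requests package is available

# Replace YOUR.HOST.IP with the LAN address printed from localip above.
resp = requests.get("http://YOUR.HOST.IP:5001/v1/models", timeout=10)
print(resp.status_code, resp.json())  # a 200 with a model list means the endpoint is reachable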
Please note that this is a quick-and-dirty solution for using local LLMs. A much better solution would be to add llama-cpp-python functionality to the app and to create a settings interface for use with llama-cpp-python. I guess I can work on that next. For now, though, this is a quick way to get SuperAGI to use local LLMs.
Okay, using text-generation-webui seems to be running into errors parsing JSON from SuperAGI. I'm going to need to do some more investigating here. That error, though, is off topic for this issue. As it stands, local LLMs can be used by editing the aforementioned files in the SuperAGI project.
Okay, I created a PR that merges Text Generation Web UI to manage locally hosted language models. The PR creates a Docker image for TGWUI and adds settings to use it in the configuration file. Local LLMs are a go!
Here's another option: The Fastchat folks published this today https://lmsys.org/blog/2023-06-09-api-server/
I'd recommend we stick with the name OPENAI_API_BASE rather than OPENAI_API_BASE_URL, because the former is the standard for Langchain.
Absolutely, I'll post that change on my next commit.
Maybe it would be worth opening a separate, smaller PR than #289 so people can use this base URL change sooner? I'm happy to do that. I just applied @sirajperson's patches from https://github.com/TransformerOptimus/SuperAGI/issues/243#issuecomment-1583275247 locally and they work great!
@alexkreidler On my fork I have begun to implement locally run LLMs. The fork is currently under development and is not ready to be merged yet. It would be great if you could create a separate PR. Thanks for the help!
Please consider the following: add Django to the end of the requirements.txt file:
Django==4.2.2
Add import statements at line 6 of agent_executor.py:
from django.core.validators import URLValidator
from django.core.exceptions import ValidationError
Allow the get_agent_api_base() method to validate the supplied URL, or return the default OpenAI API base if validation fails:
@staticmethod
def get_agent_api_base():
    base_url = get_config("OPENAI_API_BASE")
    # shell_url = os.getenv("OPENAI_API_BASE")
    url_validator = URLValidator()  # verify_exists is no longer an accepted argument in Django 4.x
    try:
        url_validator(base_url)
    except ValidationError:
        return "https://api.openai.com/v1"
    return base_url
Finally, modify the call in the execute_next_action function, on or around line 160, to:
spawned_agent = SuperAgi(ai_name=parsed_config["name"], ai_role=parsed_config["description"],
                         llm=OpenAi(api_base=AgentExecutor.get_agent_api_base(), model=parsed_config["model"], api_key=model_api_key),
                         tools=tools, memory=memory,
                         agent_config=parsed_config)
The following lines in the OpenAI class in openai.py can remain the same:
def __init__(self, api_key, api_base="https://api.openai.com/v1", image_model=None, model="gpt-4", temperature=0.6, max_tokens=4032, top_p=1,
             frequency_penalty=0,
             presence_penalty=0, number_of_results=1):
    openai.api_base = api_base
This will make the use of a custom base URL more robust.
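As an illustration of the intended fallback behaviour, here is a small standalone version of the same logic. The resolve_api_base helper is hypothetical and only for demonstration; it assumes Django is installed as suggested above:

from django.core.exceptions import ValidationError
from django.core.validators import URLValidator

def resolve_api_base(configured_value, default="https://api.openai.com/v1"):
    # Return the configured URL if it parses as a valid URL, otherwise the default.
    try:
        URLValidator()(configured_value)
    except ValidationError:
        return default
    return configured_value

print(resolve_api_base("not-a-url"))                 # -> https://api.openai.com/v1
print(resolve_api_base("http://localhost:5001/v1"))  # -> http://localhost:5001/v1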
@sirajperson Hello Jonathan! I hope you're doing well. I'm sorry for using this issue instead of creating a new one, but I have searched high and low and can't find a solution... I have tried using SuperAGI + Oobabooga on the backend with the dockerized version by Atinoda, as you've pointed out here.
However, no matter what I do, whether I build the image by cloning their repo or copy "text-generation-webui" and try building the image locally, I always get the same error in the SuperAGI PowerShell window:
"(host='super__tgwui', port=5001): Max retries exceeded with url: /v1/chat/completions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7f4de27160>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))"
If I use the command "Test-NetConnection 127.0.0.1 -p 5001" in PowerShell, it returns true, which means the port is open.
I have even uninstalled Docker and installed it again, but I'm still facing this... Do you have any idea of what I might be doing wrong?
Thanks a lot!
@eriksonssilva If you place the models that you would like to use under:
SuperAGI/tgui/config/models/
they will be copied to the container to be used in TGWUI. Presently, only GPTQ models with a context length greater than 4096 tokens are working.
The line "(host='super__tgwui', port=5001)" tells SuperAGI to use the Docker bridge and hostname resolution. In order to get GPU support, you will need to follow the Docker instructions for setting up the target machine to use the Docker image. Those instructions can be found here.
@sirajperson Thanks for the quick answer! So I've been messing around here and I KINDA made it "work"... I am using Oobabooga but without Docker... Basically, I've used the openai extension and, after a lot of trial and error, using my IPv4 address instead of localhost or 127.0.0.1, it stopped giving that error. However, now something weirder happens... When I start the SuperAGI agent, it will only repeat the same thing over and over... I have used the models that are available at the link you mentioned, and I can chat with the model through the web UI without issues. On the other hand, SuperAGI does not go anywhere: the output shows things like: "Exception: When loading characters/instruction-following/None.yaml: FileNotFoundError(2, 'No such file or directory')"
"Warning: Loaded default instruction-following template for model. Warning: Ignoring max_new_tokens (3250), too large for the remaining context. Remaining tokens: 1168 Warning: Set max_new_tokens = 1168"
And in SuperAGI the answers are always "vague".
Also, each command takes A LOT of time...
As a test I've set the goal "List 10 mind-boggling movies" and the instructions "Use google to find the movies.".
This might not be 100% related to SuperAGI, but could you (or perhaps anyone) give me a hint?
@eriksonssilva Bravo Erik, that's definitely progress. What model are you using? Also, are you using llama-cpp to offload layers to your GPU? I can tell you that some of the LLaMA models are just not that great yet. I have been working on getting MPT-30B as the brain behind the drop-in API endpoint, because its instruct capabilities are starting to deliver the quality of responses that would make it usable.
@sirajperson If I try using MPT-30B I think my computer will stand up and walk out of the room! "Nah, dude. You're expecting too much from me" lol. I have tried llama-7b-4bit and TheBloke_open-llama-7b-open-instruct-GPTQ, but both produce similar results. I can only use llama.cpp with llama-7b-4bit; for some reason the other one does not allow me to use it... It's funny that not even GPT-3.5 Turbo is giving me satisfying results (for more complex tasks), but I must admit that each time I refresh the usage page and see that the cost is increasing, I start sweating! haha
@alexkreidler Yeah, those models don't have a very high perplexity score. If you are able to use GPTQ models you should try MPT-30B GPTQ. What I've been doing, because my two old M40s can't run GPTQ models, is renting cheaper GPU instances at runpod.io and running them there. But please try out MPT-30B and share your results. Also be aware that MPT-30B has special message-termination characters; those will have to be configured in the constraints section of the agent.
@alexkreidler This also happened on the 20th, so it may be possible to use this to run inference on MPT GGML-based models from the GPU: https://postgresml.org/blog/announcing-gptq-and-ggml-quantized-llm-support-for-huggingface-transformers
The discussion on this issue has drifted from being able to use a different endpoint for the API to how to get a local LLM working for task agents. Please refer to #542 for discussions on task-agent functionality.
As it stands, the use of a different API endpoint is working correctly. Since one can point the task agent at any API endpoint, whether or not the selected endpoint works with the task agent is beyond the scope of this issue. I'm hoping this issue will be closed soon, since the alternative-endpoint improvement is working great.
@TransformerOptimus I was wondering if you could close this issue, since the OPENAI_API_BASE implementation is working without fault.
Awesome @sirajperson . Closing this.
Can we further support configuring this in the web GUI? Restarting the app whenever you want to change OPENAI_API_BASE is very time-consuming.
Please add better documentation for this in the help.
Add support for the OPENAI_API_BASE endpoint environment variable. Ideally, add an input for "OpenAI API Endpoint" in the GUI top bar / Settings, under "OpenAI API Key".
This is important right now because it will allow us to point to any OpenAI-API-compatible drop-in.
Use-case examples: ChatGPT-to-API for those of us who don't have GPT-4 API access or want to use the Plus membership instead of per-token costs for 3.5-Turbo; llama-cpp-python provides a drop-in OpenAI-compatible API endpoint; Oobabooga provides an OpenAI-compatible API endpoint plugin.
Realizing that this project will likely have direct support for all sorts of local models and various APIs in the future, this will enable a lot of flexible testing until then.
Related feature request: Each agent should have its own OPENAI_API_KEY and OPENAI_API_BASE. This may already be baked into the plans for enabling various LLMs, since each will have its own settings. But here's a currently useful use case: Agent 1 points at localhost:PORT1 for Gorilla with key="model-name", Agent 2 points at localhost:PORT2 for StarCoder with key="other-model", and Agent 3 points at api.openai.com for paid inference.
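A rough sketch of what per-agent endpoint settings could look like. All names, ports, and keys below are illustrative placeholders (PORT1/PORT2 as in the use case above), not an existing SuperAGI API; the import assumes the module path discussed earlier in this thread:

from superagi.llms.openai import OpenAi  # assumed path based on superagi/llms/openai.py

# Hypothetical per-agent LLM configuration.
AGENT_LLM_SETTINGS = {
    "agent_1": {"api_base": "http://localhost:PORT1/v1", "api_key": "model-name"},      # Gorilla
    "agent_2": {"api_base": "http://localhost:PORT2/v1", "api_key": "other-model"},     # StarCoder
    "agent_3": {"api_base": "https://api.openai.com/v1", "api_key": "OPENAI_API_KEY"},  # paid inference
}

def build_llm(agent_name, model):
    # Sketch only: construct the OpenAi wrapper shown earlier in the thread
    # with the agent's own endpoint and key.
    cfg = AGENT_LLM_SETTINGS[agent_name]
    return OpenAi(api_base=cfg["api_base"], api_key=cfg["api_key"], model=model)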