Closed InfernalDread closed 1 year ago
After utilizing the frontend/backend, nothing happens but these outputs, no errors thankfully:
(venv) C:\Users\Mike's PC\Documents\transfer_to_external_storage\Agent_LLM\Agent-LLM>python app.py
Using embedded DuckDB with persistence: data will be stored in: memories/Agent-LLM
TASK LIST
127.0.0.1 - - [19/Apr/2023 00:56:24] "GET /api/execute_next_task HTTP/1.1" 200 -
127.0.0.1 - - [19/Apr/2023 00:58:17] "OPTIONS /api/instruct/Agent-LLM/true HTTP/1.1" 404 -
Same here. I have tried switching to ooba llama but the first command immediately uses 3000 tokens.
Try deleting the content of your memories folder.
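If you're not sure how to clear it safely, here is a minimal sketch; the `memories/` path matches the logs above, but the helper name is purely illustrative:

```python
import shutil
from pathlib import Path

def clear_memories(memories_dir: str = "memories") -> None:
    """Delete stored agent memories but keep the top-level folder itself."""
    root = Path(memories_dir)
    if not root.exists():
        return
    for child in root.iterdir():
        # Each agent gets its own subfolder; remove files and folders alike.
        if child.is_dir():
            shutil.rmtree(child)
        else:
            child.unlink()
```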
this happens when running llamacpp from ooba as well
Ok, a few issues here.
First, there are too many commands enabled by default, so when the {commands} get sent it instantly blows past the token limit (which should be set to 2048 max for oobabooga or llama.cpp runs). You need to selectively remove any commands you don't need from the commands folder before running the script.
Try running with just "file_operations" to start with. Also remember that most local models will simply not understand how to use the tools. To put it bluntly, they are toddlers beside even GPT-3.5. Perhaps a LoRA can help solve that, but that is another topic.
The second major point is that the prompts for Vicuna are currently not written in the correct format. Each model has its own prompting format that it was trained on. For Vicuna this is:
### Human: What is 2+2?
### Assistant:
So you will need to edit the prompts to reflect this.
E.g. the execute prompt should probably be something like:
### Human: You are an AI who confidently performs one task based on the following objective: {objective}.
Take into account these previously completed tasks: {context}.
Your task to perform confidently: {task}.
This is not a conversation, perform the task and return the results.
### Assistant:
Other models follow different prompt formats, so if you use Koala it becomes:
BEGINNING OF CONVERSATION: USER: You are an AI who confidently performs one task based on the following objective: {objective}.
Take into account these previously completed tasks: {context}.
Your task to perform confidently: {task}.
This is not a conversation, perform the task and return the results.
GPT:
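The two formats above can be captured in a small template table. This helper is purely illustrative and not part of Agent-LLM; the template strings are taken directly from the examples above:

```python
# Hypothetical helper: wrap a raw instruction in the prompt format a
# given model was trained on. Neither the dict nor the function exist
# in Agent-LLM; they just illustrate the per-model formats above.
PROMPT_TEMPLATES = {
    "vicuna": "### Human: {instruction}\n### Assistant:",
    "koala": "BEGINNING OF CONVERSATION: USER: {instruction} GPT:",
}

def format_prompt(model: str, instruction: str) -> str:
    template = PROMPT_TEMPLATES.get(model)
    if template is None:
        return instruction  # fall back to the raw prompt
    return template.format(instruction=instruction)

print(format_prompt("vicuna", "What is 2+2?"))
# ### Human: What is 2+2?
# ### Assistant:
```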
And finally, I don't think the llama.cpp or oobabooga providers are configured to strip and format responses properly (and this also changes with the type of model loaded, I believe), so you may need to investigate setting those up properly once you have the first two issues sorted.
Eventually people will create refined prompts for each model and provider, but for now you may have to do it manually and once you have it working consider doing a pull request.
I added a default model prompt directory in the latest release, which should help with this. Prompts will need fine-tuning per model; all of the current prompts were tested with GPT-3.5 and GPT-4.
You can also set the COMMANDS_ENABLED environment variable to False to stop the commands prompt from filling up tokens. You may also need to clear your memories or use a new agent.
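If you're wondering how such a flag is usually read, here is a hedged sketch; the variable name COMMANDS_ENABLED comes from the comment above, but the parsing helper is hypothetical and not Agent-LLM's actual Config code:

```python
import os

def env_flag(name: str, default: bool = True) -> bool:
    """Interpret an environment variable like COMMANDS_ENABLED as a boolean."""
    value = os.getenv(name)
    if value is None:
        return default
    # Treat "False", "0", "no", and empty string as off; anything else as on.
    return value.strip().lower() not in ("false", "0", "no", "")

commands_enabled = env_flag("COMMANDS_ENABLED")
```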
Long term, I'm planning to add the ability to toggle commands on a per-command basis, but I'm not there just yet. Some time in the next few days.
Checking in - is this still an issue on the latest version?
Oh, I was about to sleep now lol. I'll check as soon as I can and report back to you
After all the changes, is the installation process still the same? Just wanted to make sure there wasn't a major change in installation requirements before testing.
I updated the documentation today with a quick start to help with this. Take a look at that and see if that solves your issue.
Which version of Python should be used in the Conda environment?
An update: ooba works perfectly fine now! Encouraging everyone else to try it. The built-in llamacpp seems to have some hardcoded 512-token limit (maybe because of the new message chunks) but works as intended via ooba on both GPU and CPU models. From my testing, llamacpp and bing are the two broken ones at the moment. Unsure about fastchat.
Thanks for the update! That is the best update I've had yet, I love when things work!
I just pushed an update that should fix the llamacpp token limit issue. I'm still planning on fixing bing soon too, thanks again!
You're very welcome! Testing out both as we speak. Just got done with llama, got this:
File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\AgentLLM.py", line 72, in run
self.response = self.instruct(prompt)
File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\provider\llamacpp.py", line 12, in instruct
output = self.llamacpp(f"Q: {prompt}", max_tokens=int(CFG.MAX_TOKENS), stop=["Q:", "\n"], echo=True)
File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 662, in __call__
return self.create_completion(
File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 623, in create_completion
completion: Completion = next(completion_or_chunks) # type: ignore
File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 376, in _create_completion
prompt_tokens: List[llama_cpp.llama_token] = self.tokenize(
File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 137, in tokenize
raise RuntimeError(f'Failed to tokenize: text="{text}" n_tokens={n_tokens}')
RuntimeError: Failed to tokenize: text="b' Q: Task: You are an AI
who performs one task based on the following objective: Create a
As for Bing, I'll report in a moment in the other thread to keep everything ordered.
@Josh-XT I humbly apologize, it seems the last error log was caused by a tweak I made that I totally forgot about. I'm new to GitHub and programming, just very excited about this stuff.
Update after restore and latest pull:
Exception in thread Thread-17 (run):
Traceback (most recent call last):
  File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\babyagi.py", line 176, in run
    task = self.execute_next_task()
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\babyagi.py", line 150, in execute_next_task
    self.response = self.execution_agent(self.primary_objective, this_task_name, this_task_id)
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\babyagi.py", line 117, in execution_agent
    self.response = self.prompter.run(prompt)
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\AgentLLM.py", line 72, in run
    self.response = self.instruct(prompt)
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\provider\llamacpp.py", line 12, in instruct
    output = self.llamacpp(f"Q: {prompt}", n_ctx=CFG.MAX_TOKENS, stop=["Q:", "\n"], echo=True)
TypeError: Llama.__call__() got an unexpected keyword argument 'n_ctx'
Pull the latest and try again. I passed that parameter in the wrong place, sorry about that! Should be fixed now.
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\app.py", line 113, in post
    agent_instances[agent_name] = AgentLLM(agent_name)
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\AgentLLM.py", line 43, in __init__
    self.ai_instance = ai_module.AIProvider()
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\provider\llamacpp.py", line 9, in __init__
    self.llamacpp = Llama(model_path=CFG.MODEL_PATH, n_ctx=CFG.MAX_TOKENS)
  File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 76, in __init__
    self.params.n_ctx = n_ctx
TypeError: 'str' object cannot be interpreted as an integer
Silly types! Okay, I just updated it to force an integer.
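For reference, "forcing an integer" usually amounts to a defensive coercion like this sketch; the helper name and the 2000-token fallback are illustrative, not the actual Agent-LLM code:

```python
def to_int(value, fallback: int = 2000) -> int:
    """Coerce a config value (often a string from the environment) to int.

    MAX_TOKENS arrives as a string from .env files, which is exactly what
    tripped the 'str' object cannot be interpreted as an integer error.
    """
    try:
        return int(value)
    except (TypeError, ValueError):
        return fallback
```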
That's okay! I tested it again. Seems to be working but now the model itself apparently seems invalid? It's the same I've used in all my tests so far. Got that same model running with ooba until it hit 3000 tokens and crashed.
Using embedded DuckDB with persistence: data will be stored in: agents/default/memories
llama_model_load: loading model from 'C:\Users\Stephan\Desktop\Vicuna\Agent Server\text-generation-webui\models\vicuna-for-agi\ggml-vicuna-13b-1.1-q4_1.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx = 2000
llama_model_load: n_embd = 5120
llama_model_load: n_mult = 256
llama_model_load: n_head = 40
llama_model_load: n_layer = 40
llama_model_load: n_rot = 128
llama_model_load: f16 = 5
llama_model_load: n_ff = 13824
llama_model_load: n_parts = 2
llama_model_load: type = 2
llama_model_load: invalid model file 'C:\Users\Stephan\Desktop\Vicuna\Agent Server\text-generation-webui\models\vicuna-for-agi\ggml-vicuna-13b-1.1-q4_1.bin' (bad f16 value 5)
llama_init_from_file: failed to load model
llama_generate: seed = 1682156537
Also trying to get it working with vicuna. No luck so far. Trying again with the latest.
edit: It doesn't seem to do anything. I give it an objective and nothing seems to happen. I don't see any LLMs starting, just messages about the watchdog restarting stuff. Then the app.py seems to crash out and stop.
Hello! This issue came up right after we pushed our new front end due to NextJS server behavior. I've replaced Flask with FastAPI today which solves the issue at the same time. If you can pull the latest, that should be fixed.
Yes, getting a model loading error now. I'll see what I can do on my end. Likely because pyllamacpp is the nomic version vs the original llama-cpp-python.
Ditto. Now it won't even load a Vicuna model that worked before; it thinks it's invalid.
Trying to use llama_cpp instead with:
from Config import Config
from llama_cpp import Llama

CFG = Config()

class AIProvider:
    def __init__(self):
        if CFG.MODEL_PATH:
            try:
                self.max_tokens = int(CFG.MAX_TOKENS)
            except:
                self.max_tokens = 2000
            self.model = Llama(model_path=CFG.MODEL_PATH,
                               n_ctx=self.max_tokens,
                               n_threads=8)

    def new_text_callback(self, text: str):
        print(text, end="", flush=True)

    def instruct(self, prompt):
        print(f"###Prompt: {prompt}")
        output = yield from self.model.generate(f"{prompt}",
                                                top_k=40,
                                                top_p=0.95,
                                                temp=CFG.AI_TEMPERATURE,
                                                repeat_penalty=1.1)
        print(f"###Output: {output}")
        return output
Gives this error:
  File "chromadb/api/types.py", line 98, in validate_metadata
    raise ValueError(f"Expected metadata value to be a str, int, or float, got {value}")
ValueError: Expected metadata value to be a str, int, or float, got <generator object AIProvider.instruct at 0x7fb04c10c660>
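That ValueError happens because `yield from` turns `instruct` itself into a generator object, and that generator (rather than a string) is what ends up passed to chromadb as metadata. A hedged sketch of a fix, using the `Llama` completion call style seen earlier in this thread instead of the token-level `generate`; the standalone function signature is illustrative, not the provider's actual shape:

```python
# Sketch: return a plain string, never a generator.
# `model` is assumed to be a llama_cpp.Llama instance, whose __call__
# runs a full completion and returns an OpenAI-style dict.
def instruct(model, prompt: str, max_tokens: int = 2000) -> str:
    result = model(prompt,
                   max_tokens=max_tokens,
                   top_k=40,
                   top_p=0.95,
                   repeat_penalty=1.1)
    # Extract the generated text so callers (and chromadb) see a str.
    return result["choices"][0]["text"]
```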
This issue has become a little out of control and not on topic anymore. babyagi.py no longer exists. Please create new issues for your issues.
So, I was wondering why I was getting empty responses after launching everything, so I ran "python babyagi.py" on its own to see what was going on (I was using llamacpp). This is the resulting output:
(venv) C:\Users\Mike's PC\Documents\transfer_to_external_storage\Agent_LLM\Agent-LLM>python babyagi.py "find information about ChatGPT and summarize the information in a new text file"
Using embedded DuckDB with persistence: data will be stored in: memories/Agent-LLM
OBJECTIVE
find information about ChatGPT and summarize the information in a new text file
Initial task: Develop an initial task list.
TASK LIST
TASK LIST
NEXT TASK
1: Develop an initial task list.
RESULT
ALL TASKS COMPLETE
(venv) C:\Users\Mike's PC\Documents\transfer_to_external_storage\Agent_LLM\Agent-LLM>
The program doesn't actually utilize anything, at least for me. Not sure why though.