Closed InfernalDread closed 1 year ago
After utilizing the frontend/backend, nothing happens but these outputs, no errors thankfully:
(venv) C:\Users\Mike's PC\Documents\transfer_to_external_storage\Agent_LLM\Agent-LLM>python app.py
Using embedded DuckDB with persistence: data will be stored in: memories/Agent-LLM
TASK LIST
127.0.0.1 - - [19/Apr/2023 00:56:24] "GET /api/execute_next_task HTTP/1.1" 200 -
127.0.0.1 - - [19/Apr/2023 00:58:17] "OPTIONS /api/instruct/Agent-LLM/true HTTP/1.1" 404 -
Same here. I have tried switching to ooba llama but the first command immediately uses 3000 tokens.
Try deleting the content of your memories folder.
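If you're not sure how to clear it safely, here is a minimal sketch; the `memories/` path matches the logs above, but the helper name is purely illustrative:

```python
import shutil
from pathlib import Path

def clear_memories(memories_dir: str = "memories") -> None:
    """Delete stored agent memories but keep the top-level folder itself."""
    root = Path(memories_dir)
    if not root.exists():
        return
    for child in root.iterdir():
        # Each agent gets its own subfolder; remove files and folders alike.
        if child.is_dir():
            shutil.rmtree(child)
        else:
            child.unlink()
```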
this happens when running llamacpp from ooba as well
Ok, a few issues here.
First, there are too many commands enabled by default, so when the {commands} get sent it instantly blows past the token limit (which should be set to 2048 max for oobabooga or llama.cpp runs). You need to selectively remove any commands you don't need from the commands folder before running the script.
Try running with just "file_operations" to start with. Also remember that most local models will simply not understand how to use the tools. To put it bluntly, they are toddlers beside even GPT-3.5. Perhaps a LoRA can help solve that, but that is another topic.
The second major point is that the prompts for Vicuna are currently not written in the correct format. Each model has its own prompting format that it was trained on. For Vicuna this is:
### Human: What is 2+2?
### Assistant:
So you will need to edit the prompts to reflect this.
E.g. the execute prompt should probably be something like:
### Human: You are an AI who confidently performs one task based on the following objective: {objective}.
Take into account these previously completed tasks: {context}.
Your task to perform confidently: {task}.
This is not a conversation, perform the task and return the results.
### Assistant:
Other models follow different prompt formats, so if you use Koala it becomes:
BEGINNING OF CONVERSATION: USER: You are an AI who confidently performs one task based on the following objective: {objective}.
Take into account these previously completed tasks: {context}.
Your task to perform confidently: {task}.
This is not a conversation, perform the task and return the results.
GPT:
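The two formats above can be captured in a small template table. This helper is purely illustrative and not part of Agent-LLM; the template strings are taken directly from the examples above:

```python
# Hypothetical helper: wrap a raw instruction in the prompt format a
# given model was trained on. Neither the dict nor the function exist
# in Agent-LLM; they just illustrate the per-model formats above.
PROMPT_TEMPLATES = {
    "vicuna": "### Human: {instruction}\n### Assistant:",
    "koala": "BEGINNING OF CONVERSATION: USER: {instruction} GPT:",
}

def format_prompt(model: str, instruction: str) -> str:
    template = PROMPT_TEMPLATES.get(model)
    if template is None:
        return instruction  # fall back to the raw prompt
    return template.format(instruction=instruction)

print(format_prompt("vicuna", "What is 2+2?"))
# ### Human: What is 2+2?
# ### Assistant:
```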
And finally, I don't think the llama.cpp or oobabooga providers are configured to strip and format responses properly (and this also changes with the type of model loaded, I believe), so you may need to investigate setting those up properly once you have the first two issues sorted.
Eventually people will create refined prompts for each model and provider, but for now you may have to do it manually and once you have it working consider doing a pull request.
I added a default model prompt directory in the latest release, which should help with this. Prompts will need fine-tuning per model; all of the current prompts were tested with GPT-3.5 and GPT-4.
You can also set the COMMANDS_ENABLED environment variable to False to stop the commands prompt from filling up tokens. You may also need to clear your memories or use a new agent.
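If you're wondering how such a flag is usually read, here is a hedged sketch; the variable name COMMANDS_ENABLED comes from the comment above, but the parsing helper is hypothetical and not Agent-LLM's actual Config code:

```python
import os

def env_flag(name: str, default: bool = True) -> bool:
    """Interpret an environment variable like COMMANDS_ENABLED as a boolean."""
    value = os.getenv(name)
    if value is None:
        return default
    # Treat "False", "0", "no", and empty string as off; anything else as on.
    return value.strip().lower() not in ("false", "0", "no", "")

commands_enabled = env_flag("COMMANDS_ENABLED")
```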
Long term, I'm planning to add the ability to toggle commands on a per-command basis, but I'm not there just yet. Some time in the next few days.
Checking in - is this still an issue on the latest version?
Oh, I was about to sleep now lol. I'll check as soon as I can and report back to you
After all the changes, is the installation process still the same? Just wanted to make sure there wasn't a major change in installation requirements before testing.
I updated the documentation today with a quick start to help with this. Take a look at that and see if that solves your issue.
Which version of Python should be used in the Conda environment?
An update: ooba works perfectly fine now! Encouraging everyone else to try it. The built-in llamacpp seems to have some hardcoded 512-token limit (maybe because of the new message chunks) but works as intended via ooba on both GPU and CPU models. From my testing, llamacpp and bing are the two broken ones at the moment. Unsure about fastchat.
Thanks for the update! That is the best update I've had yet, I love when things work!
I just pushed an update that should fix the llamacpp token limit issue. I'm still planning on fixing bing soon too, thanks again!
You're very welcome! Testing out both as we speak. Just got done with llama, got this:
File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\AgentLLM.py", line 72, in run
self.response = self.instruct(prompt)
File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\provider\llamacpp.py", line 12, in instruct
output = self.llamacpp(f"Q: {prompt}", max_tokens=int(CFG.MAX_TOKENS), stop=["Q:", "\n"], echo=True)
File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 662, in __call__
return self.create_completion(
File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 623, in create_completion
completion: Completion = next(completion_or_chunks) # type: ignore
File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 376, in _create_completion
prompt_tokens: List[llama_cpp.llama_token] = self.tokenize(
File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 137, in tokenize
raise RuntimeError(f'Failed to tokenize: text="{text}" n_tokens={n_tokens}')
RuntimeError: Failed to tokenize: text="b' Q: Task: You are an AI
who performs one task based on the following objective: Create a
As for Bing, I'll report in a moment in the other thread to keep everything ordered.
@Josh-XT I humbly apologize, it seems the last error log was caused by a tweak I made that I totally forgot about. I'm new to GitHub and programming, just very excited about this stuff.
Update after restore and latest pull:
Exception in thread Thread-17 (run):
Traceback (most recent call last):
  File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\babyagi.py", line 176, in run
    task = self.execute_next_task()
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\babyagi.py", line 150, in execute_next_task
    self.response = self.execution_agent(self.primary_objective, this_task_name, this_task_id)
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\babyagi.py", line 117, in execution_agent
    self.response = self.prompter.run(prompt)
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\AgentLLM.py", line 72, in run
    self.response = self.instruct(prompt)
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\provider\llamacpp.py", line 12, in instruct
    output = self.llamacpp(f"Q: {prompt}", n_ctx=CFG.MAX_TOKENS, stop=["Q:", "\n"], echo=True)
TypeError: Llama.__call__() got an unexpected keyword argument 'n_ctx'
Pull the latest and try again. I passed that parameter in the wrong place, sorry about that! Should be fixed now.
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\app.py", line 113, in post
    agent_instances[agent_name] = AgentLLM(agent_name)
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\AgentLLM.py", line 43, in __init__
    self.ai_instance = ai_module.AIProvider()
  File "C:\Users\Stephan\Desktop\Vicuna\Agent-LLM\provider\llamacpp.py", line 9, in __init__
    self.llamacpp = Llama(model_path=CFG.MODEL_PATH, n_ctx=CFG.MAX_TOKENS)
  File "C:\Users\Stephan\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 76, in __init__
    self.params.n_ctx = n_ctx
TypeError: 'str' object cannot be interpreted as an integer
Silly types! Okay, I just updated it to force an integer.
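For reference, "forcing an integer" usually amounts to a defensive coercion like this sketch; the helper name and the 2000-token fallback are illustrative, not the actual Agent-LLM code:

```python
def to_int(value, fallback: int = 2000) -> int:
    """Coerce a config value (often a string from the environment) to int.

    MAX_TOKENS arrives as a string from .env files, which is exactly what
    tripped the 'str' object cannot be interpreted as an integer error.
    """
    try:
        return int(value)
    except (TypeError, ValueError):
        return fallback
```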
That's okay! I tested it again. Seems to be working but now the model itself apparently seems invalid? It's the same I've used in all my tests so far. Got that same model running with ooba until it hit 3000 tokens and crashed.
Using embedded DuckDB with persistence: data will be stored in: agents/default/memories
llama_model_load: loading model from 'C:\Users\Stephan\Desktop\Vicuna\Agent Server\text-generation-webui\models\vicuna-for-agi\ggml-vicuna-13b-1.1-q4_1.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx = 2000
llama_model_load: n_embd = 5120
llama_model_load: n_mult = 256
llama_model_load: n_head = 40
llama_model_load: n_layer = 40
llama_model_load: n_rot = 128
llama_model_load: f16 = 5
llama_model_load: n_ff = 13824
llama_model_load: n_parts = 2
llama_model_load: type = 2
llama_model_load: invalid model file 'C:\Users\Stephan\Desktop\Vicuna\Agent Server\text-generation-webui\models\vicuna-for-agi\ggml-vicuna-13b-1.1-q4_1.bin' (bad f16 value 5)
llama_init_from_file: failed to load model
llama_generate: seed = 1682156537
Also trying to get it working with vicuna. No luck so far. Trying again with the latest.
edit: It doesn't seem to do anything. I give it an objective and nothing seems to happen. I don't see any LLMs starting, just messages about the watchdog restarting stuff. Then the app.py seems to crash out and stop.
Hello! This issue came up right after we pushed our new front end due to NextJS server behavior. I've replaced Flask with FastAPI today which solves the issue at the same time. If you can pull the latest, that should be fixed.
Yes, getting a model loading error now. I'll see what I can do on my end. Likely because pyllamacpp is the nomic version vs the original llama-cpp-python.
Ditto. Now it won't even load a Vicuna model that worked before; it thinks it's invalid.
Trying to use llama_cpp instead with:
from Config import Config
from llama_cpp import Llama

CFG = Config()

class AIProvider:
    def __init__(self):
        if CFG.MODEL_PATH:
            try:
                self.max_tokens = int(CFG.MAX_TOKENS)
            except:
                self.max_tokens = 2000
            self.model = Llama(model_path=CFG.MODEL_PATH,
                               n_ctx=self.max_tokens,
                               n_threads=8)

    def new_text_callback(self, text: str):
        print(text, end="", flush=True)

    def instruct(self, prompt):
        print(f"###Prompt: {prompt}")
        output = yield from self.model.generate(f"{prompt}",
                                                top_k=40,
                                                top_p=0.95,
                                                temp=CFG.AI_TEMPERATURE,
                                                repeat_penalty=1.1)
        print(f"###Output: {output}")
        return output
Gives this error:
  File "chromadb/api/types.py", line 98, in validate_metadata
    raise ValueError(f"Expected metadata value to be a str, int, or float, got {value}")
ValueError: Expected metadata value to be a str, int, or float, got <generator object AIProvider.instruct at 0x7fb04c10c660>
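That ValueError happens because `yield from` turns `instruct` itself into a generator object, and that generator (rather than a string) is what ends up passed to chromadb as metadata. A hedged sketch of a fix, using the `Llama` completion call style seen earlier in this thread instead of the token-level `generate`; the standalone function signature is illustrative, not the provider's actual shape:

```python
# Sketch: return a plain string, never a generator.
# `model` is assumed to be a llama_cpp.Llama instance, whose __call__
# runs a full completion and returns an OpenAI-style dict.
def instruct(model, prompt: str, max_tokens: int = 2000) -> str:
    result = model(prompt,
                   max_tokens=max_tokens,
                   top_k=40,
                   top_p=0.95,
                   repeat_penalty=1.1)
    # Extract the generated text so callers (and chromadb) see a str.
    return result["choices"][0]["text"]
```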
This issue has become a little out of control and not on topic anymore. babyagi.py no longer exists. Please create new issues for your issues.
So, I was wondering why I was getting empty responses after launching everything, so I ran "python babyagi.py" on its own to see what was going on (I was using llamacpp). This is the resulting output:
(venv) C:\Users\Mike's PC\Documents\transfer_to_external_storage\Agent_LLM\Agent-LLM>python babyagi.py "find information about ChatGPT and summarize the information in a new text file"
Using embedded DuckDB with persistence: data will be stored in: memories/Agent-LLM
OBJECTIVE
find information about ChatGPT and summarize the information in a new text file
Initial task: Develop an initial task list.
TASK LIST
TASK LIST
NEXT TASK
1: Develop an initial task list.
RESULT
ALL TASKS COMPLETE
(venv) C:\Users\Mike's PC\Documents\transfer_to_external_storage\Agent_LLM\Agent-LLM>
The program doesn't actually utilize anything, at least for me. Not sure why though.