yoheinakajima / babyagi

https://babyagi.org/

Errors every time when using llama #233

Closed AdamJWood closed 6 days ago

AdamJWood commented 1 year ago

Windows 11

I am trying to use llama, but the app crashes within a couple of minutes regardless of which model I use.

I have tried:

- alpaca-33b-ggml-q4_0-lora-merged
- ggml-vicuna-13b-4bit
- gpt4-x-alpaca-13b-ggml-q4_1-from-gptq-4bit-128g
- llamacpp-7b

I am using the original objective/task:

You are an AI who performs one task based on the following objective: Solve world hunger
.    Take into account these previously completed tasks: []
.    Your task: Develop a task list
Response:
1. Research and gather data on global hunger statistics.
2. Analyze the root causes of hunger, such as poverty, climate change, and political instability.
3. Identify potential solutions, including agricultural advancements, social programs, and economic growth strategies.
4. Brainstorm innovative methods to improve food production, distribution, and accessibility.
5. Collaborate with experts in relevant fields (agriculture, economics, politics) to formulate effective plans.
6. Develop a global strategy to implement and monitor progress towards the goal of eradic

After several hundred iterations of llama_print_timings output, I get this error:

llama_print_timings:        load time =   325.67 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time =   325.62 ms /     2 tokens (  162.81 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time =   325.77 ms
Traceback (most recent call last):
  File "C:\AI\babyagi\babyagi.py", line 468, in <module>
    main()
  File "C:\AI\babyagi\babyagi.py", line 447, in main
    results_storage.add(task, result, result_id, vector)
  File "C:\AI\babyagi\babyagi.py", line 189, in add
    self.collection.add(
  File "C:\Users\Adam\AppData\Local\Programs\Python\Python310\lib\site-packages\chromadb\api\models\Collection.py", line 93, in add
    raise ValueError(
ValueError: Number of embeddings 765 must match number of ids 1

My .env file:

# cp .env.example .env
# Edit your .env file with your own values
# Don't commit your .env file to git/push to GitHub!
# Don't modify/delete .env.example unless adding extensions to the project
# which require new variable to be added to the .env file

# API CONFIG
# OPENAI_API_MODEL can be used instead
# Special values:
# human - use human as intermediary with custom LLMs
# llama - use llama.cpp with Llama, Alpaca, Vicuna, GPT4All, etc
LLM_MODEL=llama
#gpt-3.5-turbo # alternatively, gpt-4, text-davinci-003, etc

LLAMA_MODEL_PATH=C:\AI\oobabooga_windows\text-generation-webui\models\gpt4-x-alpaca-13b-ggml-q4_1-from-gptq-4bit-128g\ggml-model-q4_1.bin

# OPENAI_API_KEY=
# OPENAI_TEMPERATURE=0.0

# STORE CONFIG
# TABLE_NAME can be used instead
RESULTS_STORE_NAME=baby-agi-test-table

# Pinecone config
# Uncomment and fill these to switch from local ChromaDB to Pinecone
# PINECONE_API_KEY=
# PINECONE_ENVIRONMENT=

# COOPERATIVE MODE CONFIG
# BABY_NAME can be used instead
INSTANCE_NAME=BabyAGI
COOPERATIVE_MODE=none # local

# RUN CONFIG
OBJECTIVE=Solve world hunger
# For backwards compatibility
# FIRST_TASK can be used instead of INITIAL_TASK
INITIAL_TASK=Develop a task list

# Extensions
# List additional extension .env files to load (except .env.example!)
DOTENV_EXTENSIONS=
# Set to true to enable command line args support
ENABLE_COMMAND_LINE_ARGS=false
another-ai commented 1 year ago

I had the same error, but only with ChromaDB; with Pinecone I don't get that error, but everything is very, very slow. It would be nice to be able to use the GPU instead of the CPU.

AdamJWood commented 1 year ago

Thank you very much, I'll give Pinecone a try. I completely agree about using the GPU over the CPU; my GPU processes almost 10x as many tokens as my CPU does (4090 vs. 13900K).

AdamJWood commented 1 year ago

Bummer: Pinecone is waitlisting new users, and standard pricing is $70/month, ouch. I guess I'll wait. :/

another-ai commented 1 year ago

> Bummer: Pinecone is waitlisting new users, and standard pricing is $70/month, ouch. I guess I'll wait. :/

I'm on the free tier... but I imagine that in the future it will be paid for everyone.

AdamJWood commented 1 year ago

I am getting good results with OpenAI and Chroma, but not with Llama and Chroma. It seems the initial task results are good, but every subsequent task fails. The llama print timings appear to indicate that only 2 tokens are being processed each time.

The initial task llama print timings:

*****USING LLAMA.CPP. POTENTIALLY SLOW.*****

*****OBJECTIVE*****

Solve world hunger

Initial task: Develop a task list

*****TASK LIST*****

 • Develop a task list

*****NEXT TASK*****

Develop a task list

llama_print_timings:        load time =  1736.88 ms
llama_print_timings:      sample time =    13.08 ms /    70 runs   (    0.19 ms per run)
llama_print_timings: prompt eval time =  8812.00 ms /    51 tokens (  172.78 ms per token)
llama_print_timings:        eval time = 52795.69 ms /    69 runs   (  765.15 ms per run)
llama_print_timings:       total time = 61626.16 ms

*****TASK RESULT*****

You are an AI who performs one task based on the following objective: Solve world hunger
.
    Take into account these previously completed tasks: []
.
    Your task: Develop a task list
Response:
Task List:
1. Research current food production methods and their limitations.
2. Identify potential solutions to increase food production.
3. Analyze the feasibility of each solution.
4. Implement the most promising solution.
5. Monitor the effectiveness of the implemented solution.

llama_print_timings:        load time =  2050.49 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time =  2050.44 ms /     2 tokens ( 1025.22 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time =  2050.61 ms

llama_print_timings:        load time =  2050.49 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time =   852.96 ms /     2 tokens (  426.48 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time =   853.10 ms

This then repeats 100+ times before erroring out.

alexl83 commented 1 year ago

Same issue for me with the Koala and Vicuna models, both 7B and 13B variants; tried quantizations q4_0 / q4_1 / q4_3.

llama_print_timings:        load time =   189.12 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time =   150.69 ms /     2 tokens (   75.35 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time =   150.82 ms

and then error:

Traceback (most recent call last):
  File "/home/alex/AI/babyagi/babyagi.py", line 468, in <module>
    main()
  File "/home/alex/AI/babyagi/babyagi.py", line 447, in main
    results_storage.add(task, result, result_id, vector)
  File "/home/alex/AI/babyagi/babyagi.py", line 189, in add
    self.collection.add(
  File "/home/alex/oobabooga/installer_files/conda/envs/bAGI/lib/python3.10/site-packages/chromadb/api/models/Collection.py", line 93, in add
    raise ValueError(
ValueError: Number of embeddings 761 must match number of ids 1

Is there anything I can provide as a user to support bugfixing? Thank you for your astounding work!

francip commented 1 year ago

I apologize, folks; my default setup runs with Pinecone, so I didn't try the llama+chroma setup. The most likely issue is in how the embeddings are calculated, with a mismatch in the size of the vector. I'll take a look this weekend, but if someone else wants to take a stab at fixing it... I'd probably start with the llama_embed function.
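
For anyone digging in before then, a quick sanity check is to confirm that one call to the embedding function returns one flat vector. A minimal sketch, assuming llm_embed.embed takes a string and returns a flat list of floats (as in llama-cpp-python):

embedding = llm_embed.embed(result)  # one text in, one vector out (assumed behavior)
# A flat list of floats is a single embedding; a list of lists would be counted
# by Chroma as many embeddings, producing the "must match number of ids" error.
assert embedding and isinstance(embedding[0], float), "expected one flat embedding vector"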

jmtatsch commented 1 year ago

I have almost the same issue with vicuna 1.1:

babyagi    | *****TASK RESULT*****
babyagi    | 
babyagi    | You are an AI who performs one task based on the following objective: Solve world hunger
babyagi    | .
babyagi    |     Take into account these previously completed tasks: []
babyagi    | .
babyagi    |     Your task: Develop a task list
babyagi    | Response: Task List
babyagi    | ===============
babyagi    | 
babyagi    | Develop a task list for the next 3 months to help achieve the goal of solving world hunger. The task list should include the following:
babyagi    | - Identify and prioritize regions with high levels of food insecurity
babyagi    | - Develop and implement sustainable agriculture practices in these regions
babyagi    | - Coordinate with local organizations and governments to ensure successful implementation
babyagi    | - Monitor progress and adjust plans as necessary
babyagi    | - Share progress and results with the global community to encourage collaboration and support
babyagi    | ```
babyagi    | 
babyagi    | Here is a sample task list for the next 3 months:
babyagi    | - Identify
babyagi    | Traceback (most recent call last):
babyagi    |   File "/app/./babyagi.py", line 496, in <module>
babyagi    |     main()
babyagi    |   File "/app/./babyagi.py", line 474, in main
babyagi    |     results_storage.add(task, result, result_id, vector)
babyagi    |   File "/app/./babyagi.py", line 198, in add
babyagi    |     self.collection.add(
babyagi    |   File "/usr/local/lib/python3.11/site-packages/chromadb/api/models/Collection.py", line 93, in add
babyagi    |     raise ValueError(
babyagi    | ValueError: Number of embeddings 803 must match number of ids 1
DYSIM commented 1 year ago

While working on this error, I made a few observations and found a few bugs when using llama and chroma:

  1. The add function has wrong type hints:

def add(self, task: Dict, result: Dict, result_id: int, vector: List):

should actually be

def add(self, task: Dict, result: str, result_id: str, vector: str):

vector and result are actually the same input, so from here we should not iterate over vector and embed each item; it is the whole string that should be embedded:

embeddings = llm_embed.embed(vector) if LLM_MODEL.startswith("llama") else None

instead of

embeddings = [llm_embed.embed(item) for item in vector] if LLM_MODEL.startswith("llama") else None

  2. Custom embedding function. Even after fixing the above, you would still get an error, as self.collection is not initialized with a llama embedding function:

embedding_function = OpenAIEmbeddingFunction(api_key=OPENAI_API_KEY)

self.collection = chroma_client.get_or_create_collection(
            name=RESULTS_STORE_NAME,
            metadata={"hnsw:space": metric},
            embedding_function=embedding_function,
        )

So my solution here is to use a custom embedding function for Llama and declare it when creating the collection:

class LlamaEmbeddingFunction(EmbeddingFunction):
    def __init__(self):
        return

    def __call__(self, texts: Documents) -> Embeddings:
        embeddings = []
        for t in texts:
            e = llm_embed.embed(t)
            embeddings.append(e)
        return embeddings

...

self.collection = chroma_client.get_or_create_collection(
            name=RESULTS_STORE_NAME,
            metadata={"hnsw:space": metric},
            embedding_function=LlamaEmbeddingFunction(),
        )

This solves the issue for me.
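
For context on the original error: when vector is a string, iterating over it yields individual characters, so the old list comprehension built one embedding per character. That is presumably where "Number of embeddings 765 must match number of ids 1" came from: a 765-character result string. A minimal illustration:

vector = "some task result text"                       # really the whole result string
per_char = [llm_embed.embed(item) for item in vector]  # one embedding per character!
whole = llm_embed.embed(vector)                        # one embedding for the whole string
print(len(per_char), "embeddings vs. 1 id")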

Adamlangc commented 1 year ago

Any comments on that error?

class LlamaEmbeddingFunction(EmbeddingFunction):
NameError: name 'EmbeddingFunction' is not defined. Did you mean: 'OpenAIEmbeddingFunction'?

DYSIM commented 1 year ago

Please add the following imports:

from chromadb.api.types import Documents, EmbeddingFunction, Embeddings

Adamlangc commented 1 year ago

Thanks a lot, well done.

dibrale commented 1 year ago

Thank you, @DYSIM - this seems to resolve the issue on my end. I've taken the liberty of turning your suggestion into a pull request.

alexl83 commented 1 year ago

Hi @dibrale, I get this error consistently every time I use llama.cpp with chromadb (I pulled your PR):

*****TASK RESULT*****

You are an AI who performs one task based on the following objective: Find richest the most innovative home appliance manufacturer
.
    Take into account these previously completed tasks: ['Develop a task list']
.
    Your task: \\n    Take into account these previously completed tasks: [\'Develop a task list\']\\n.\\n    Your task: Develop a task list\\nResponse:\\n1. Research and gather information about the top 50 home appliance manufacturers in terms of revenue, including their product lines and innovation strategies.\\n2. Analyze the data to determine which manufacturer has the highest revenue and most innovative products.\\n3. Develop a task list for the selected manufacturer that includes specific tasks related to improving their product line and innovation strategies.\\n4. Prioritize the tasks based on their potential impact on the company\'s revenue and competitive advantage.\\n5. Create a timeline for completing each task, taking into"}.', '    This result was based on this task description: Develop a task list. These are incomplete tasks: .', '    Based on the result, create new tasks to be completed by the AI system that do not overlap with incomplete tasks.', '    Return the tasks as an array.', '```', "const { readLine, writeLine } = require('fs');", "const { parse } = require('csv-parser');", '', 'async function createNewTasks(result) {', '  const tasks = [];', "  let task = '';", '  ', '  for (let i = 0; i < result.length; i++) {', '    if (!result[i].task) {', '      continue;', '    }', '    ', '    task += `Task ${i+1}: ${result[i].task}\\n`;', '    tasks.push(task);', '    ', '    const subtasks ='].
Response:
1. Research and gather information about the top 50 home appliance manufacturers in terms of revenue, including their product lines and innovation strategies.
2. Analyze the data to determine which manufacturer has the highest revenue and most innovative products.
3. Develop a task list for the selected manufacturer that includes specific tasks related to improving their product line and innovation strategies.
4. Prioritize the tasks based on their potential impact on the company's revenue and competitive advantage.
5. Create a timeline for completing each task, taking into

llama_print_timings:        load time =  1278.77 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time =  4741.76 ms /   559 tokens (    8.48 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time =  4742.65 ms
Traceback (most recent call last):
  File "/home/alex/AI/bAGI-temp/babyagi.py", line 490, in <module>
    main()
  File "/home/alex/AI/bAGI-temp/babyagi.py", line 469, in main
    results_storage.add(task, result, result_id, vector)
  File "/home/alex/AI/bAGI-temp/babyagi.py", line 202, in add
    len(self.collection.get(ids=[result_id], include=[])["ids"]) > 0
  File "/home/alex/oobabooga/installer_files/conda/envs/bAGI/lib/python3.10/site-packages/chromadb/api/models/Collection.py", line 137, in get
    return self._client._get(
  File "/home/alex/oobabooga/installer_files/conda/envs/bAGI/lib/python3.10/site-packages/chromadb/api/local.py", line 189, in _get
    db_result = self._db.get(
  File "/home/alex/oobabooga/installer_files/conda/envs/bAGI/lib/python3.10/site-packages/chromadb/db/clickhouse.py", line 423, in get
    val = self._get(where=where_str, columns=columns)
  File "/home/alex/oobabooga/installer_files/conda/envs/bAGI/lib/python3.10/site-packages/chromadb/db/duckdb.py", line 239, in _get
    val = self._conn.execute(
duckdb.ParserException: Parser Error: syntax error at or near "You"
LINE 1: ...e-prioritizing the following tasks: [\'You are a task creation AI that uses th...
dibrale commented 1 year ago

@alexl83 - I've also been encountering this issue, irrespective of whether I used code from @DYSIM or alternative embedding functions in chroma. I therefore don't think it's unique to my PR or a problem with @DYSIM's fix.

Part of the problem appears to be a compounding of markup elements (i.e., quotes, newlines, backslashes and square brackets) that neither get correctly parsed nor stripped after each step. Another part of the problem seems to be self.collection.get somehow inducing duckdb to attempt execution of the entry it retrieves.

As of this writing, I'm not sure why either of those things are happening, but I'll be taking a look at it to the best of my ability.
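
In the meantime, one defensive option is to normalize ids before they reach Chroma, so they can never smuggle quotes or brackets into the SQL that duckdb generates. A hypothetical helper (safe_id is my name for it, not from the repo), applied at the existing call site:

import re

def safe_id(raw) -> str:
    # Keep only word characters and hyphens; everything else becomes '_'
    return re.sub(r"[^\w\-]", "_", str(raw))[:64]

results_storage.add(task, result, safe_id(result_id), vector)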

alexl83 commented 1 year ago

@dibrale: agreed. I think it's a matter of improperly escaped characters that bleed into the output and the DB. Hopefully @francip can have a look at it :)

alexl83 commented 1 year ago

I have a strong hunch that this section is involved:

def task_creation_agent(
    objective: str, result: Dict, task_description: str, task_list: List[str]
):
    prompt = f"""
    You are a task creation AI that uses the result of an execution agent to create new tasks with the following objective: {objective},
    The last completed task has the result: {result}.   
    This result was based on this task description: {task_description}. These are incomplete tasks: {', '.join(task_list)}.
    Based on the result, create new tasks to be completed by the AI system that do not overlap with incomplete tasks.
    Return the tasks as an array."""
    response = openai_call(prompt)
    new_tasks = response.split("\n") if "\n" in response else [response]
    return [{"task_name": task_name} for task_name in new_tasks]

especially this line:

This result was based on this task description: {task_description}. These are incomplete tasks: {', '.join(task_list)}.

Having a look at https://github.com/dory111111/babyagi-streamlit, the text there is more "isolated":

        """Get the response parser."""
        task_creation_template = (
            "You are an task creation AI that uses the result of an execution agent"
            " to create new tasks with the following objective: {objective},"
            " The last completed task has the result: {result}."
            " This result was based on this task description: {task_description}."
            " These are incomplete tasks: {incomplete_tasks}."
            " Based on the result, create new tasks to be completed"
            " by the AI system that do not overlap with incomplete tasks."
            " Return the tasks as an array."
        )
        prompt = PromptTemplate(
            template=task_creation_template,
            partial_variables={"objective": objective},
            input_variables=["result", "task_description", "incomplete_tasks"],
        )

and the prioritization part is stripped/split before being submitted

        """Get the response parser."""
        task_prioritization_template = (
            "You are an task prioritization AI tasked with cleaning the formatting of and reprioritizing"
            " the following tasks: {task_names}."
            " Consider the ultimate objective of your team: {objective}."
            " Do not remove any tasks. Return the result as a numbered list, like:"
            " #. First task"
            " #. Second task"
            " Start the task list with number {next_task_id}."
        )
        prompt = PromptTemplate(
            template=task_prioritization_template,
            partial_variables={"objective": objective},
            input_variables=["task_names", "next_task_id"],
        )
        return cls(prompt=prompt, llm=llm, verbose=verbose)

Just a hunch; pardon me for the non-dev language, my coding skills are nonexistent :) Hope it helps!

dibrale commented 1 year ago

Thank you, @alexl83 - I'll have a look at that next. In the meantime, I seem to have gotten the script to run indefinitely by doing some rather nasty string manipulation inside prioritization_agent():

new_tasks_list = []
for task_string in new_tasks:
    task_parts = task_string.strip().split(".", 1)
    if len(task_parts) == 2:
        task_id = ''.join(s for s in task_parts[0] if s.isnumeric())
        # Whitelist: keep alphanumerics, whitespace and common punctuation
        task_name = ''.join(s for s in task_parts[1]
                            if s.isalnum() or s.isspace() or s in '!@#$%^&*-+=<>/,.')
        new_tasks_list.append({"task_id": task_id, "task_name": task_name})
tasks_storage.replace(new_tasks_list)

This is obviously terrible. I only post it to illustrate the direction of my testing and to highlight that cleaning up the strings seems to fix the issue.
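
For anyone adapting it, essentially the same whitelist can be written as a single regular expression (a sketch of equivalent intent, not the PR code; note that \w also admits underscores):

import re

ALLOWED = re.compile(r"[^\w\s!@#$%^&*\-+=<>/,.]")

def clean_task_name(raw: str) -> str:
    # Drop quotes, backslashes, brackets and other markup that otherwise
    # bleeds into the vector store and the next prompt
    return ALLOWED.sub("", raw).strip()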

alexl83 commented 1 year ago

Great, testing to see how generation with Vicuna performs. Let's hope this gets fixed ASAP.

Thanks @dibrale !

dibrale commented 1 year ago

By way of an update, I've had to do a considerable amount of editing to make the existing script play nicely with llama. I hope I did not break anything on the OpenAI end. Here is my current version in a *.zip file. This version seems to produce legible results that are mostly limited by the quality of my local model, but I'm not quite ready to turn it into a pull request. Feel free to play around with it and turn it into one if you feel it appropriate. In the absence of a more competent fix, I'll continue to play with this script over the next day or two.

dibrale commented 1 year ago

I've added a pull request with a refined version of the file from my last comment. It allows the script to operate with llama-based models to the extent these models can remain on task. Please have a look if you are interested and feel free to suggest any changes.

AllyourBaseBelongToUs commented 4 months ago

Might be a related issue:

Traceback (most recent call last):
  File "C:\Users\EddyE\Desktop\babyagi-main\babyagi-main\babyagi.py", line 627, in <module>
    main()
  File "C:\Users\EddyE\Desktop\babyagi-main\babyagi-main\babyagi.py", line 582, in main
    result = execution_agent(OBJECTIVE, str(task["task_name"]))
  File "C:\Users\EddyE\Desktop\babyagi-main\babyagi-main\babyagi.py", line 527, in execution_agent
    context = context_agent(query=objective, top_results_num=5)
  File "C:\Users\EddyE\Desktop\babyagi-main\babyagi-main\babyagi.py", line 551, in context_agent
    results = results_storage.query(query=query, top_results_num=top_results_num)
  File "C:\Users\EddyE\Desktop\babyagi-main\babyagi-main\babyagi.py", line 258, in query
    results = self.collection.query(
  File "C:\Users\EddyE\miniconda3\lib\site-packages\chromadb\api\models\Collection.py", line 220, in query
    return self._client._query(
  File "C:\Users\EddyE\miniconda3\lib\site-packages\chromadb\api\segment.py", line 422, in _query
    self._validate_dimension(coll, len(embedding), update=False)
  File "C:\Users\EddyE\miniconda3\lib\site-packages\chromadb\api\segment.py", line 533, in _validate_dimension
    raise InvalidDimensionException(
chromadb.errors.InvalidDimensionException: Embedding dimension 5 does not match collection dimensionality 4096

ABZ187 commented 2 months ago

Solved it for Chromadb. metadatas should be a list of dictionaries and ids should be a list of strings:

metadatas: list[dict]
ids: list[str]

Otherwise you get:

ValueError: Number of metadatas 1 must match number of ids 51
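
In other words, every per-record argument to Chroma's collection.add must be a list of the same length. A minimal sketch of a well-formed call (illustrative names):

collection.add(
    ids=["result_1"],                 # list[str], one id per record
    embeddings=[embedding_vector],    # one list[float] per id (optional)
    documents=[result_text],          # one document string per id
    metadatas=[{"task": task_name}],  # one dict per id
)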