garyfeng / Auto-GPT

An experimental open-source attempt to make GPT-4 fully autonomous.
MIT License

Q: How does AutoGPT use the embedding vectors? #9

Open garyfeng opened 1 year ago

garyfeng commented 1 year ago

AutoGPT stores GPT output and other items in 'memory' as embedding vectors at each iteration. Depending on the memory backend you use (local file, Redis, etc.), the embedding vectors are stored differently.

The question is: what do the embedding vectors actually do? I presume they help keep track of the context, but how does AutoGPT use the context/memory? It does not send embedding vectors to GPT, does it?

garyfeng commented 1 year ago

From memory/base.py we know it uses text-embedding-ada-002:

return openai.Embedding.create(input=[text], model="text-embedding-ada-002")["data"][0]["embedding"]
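
That call is wrapped in a small helper, roughly like the sketch below (the helper name get_ada_embedding and the newline handling are what the snippet suggests, but treat the details as assumptions):

    import openai

    def get_ada_embedding(text: str) -> list[float]:
        # text-embedding-ada-002 returns a 1536-dimensional vector
        text = text.replace("\n", " ")  # newlines can degrade embedding quality
        return openai.Embedding.create(
            input=[text], model="text-embedding-ada-002"
        )["data"][0]["embedding"]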
garyfeng commented 1 year ago

In chat.py, the function generate_context() uses relevant_memory as part of the context:

def generate_context(prompt, relevant_memory, full_message_history, model):
    current_context = [
        create_chat_message(
            "system", prompt),
        create_chat_message(
            "system", f"The current time and date is {time.strftime('%c')}"),
        create_chat_message(
            "system", f"This reminds you of these events from your past:\n{relevant_memory}\n\n")]

    # Add messages from the full message history until we reach the token limit
    next_message_to_add_index = len(full_message_history) - 1
    insertion_index = len(current_context)
    # Count the currently used tokens
    current_tokens_used = token_counter.count_message_tokens(current_context, model)
    return next_message_to_add_index, current_tokens_used, insertion_index, current_context
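
The caller (chat_with_ai) then uses these return values to walk backwards through full_message_history, inserting older messages at insertion_index until the token budget is spent. Roughly like the sketch below (send_token_limit and the exact bookkeeping are assumptions):

    # Back-fill history after generate_context(), newest message first.
    # send_token_limit is assumed to be the request budget minus space for the reply.
    while next_message_to_add_index >= 0:
        message_to_add = full_message_history[next_message_to_add_index]
        tokens_to_add = token_counter.count_message_tokens([message_to_add], model)
        if current_tokens_used + tokens_to_add > send_token_limit:
            break  # no room left; older messages are simply dropped
        # Inserting at a fixed index keeps the kept messages in chronological order
        current_context.insert(insertion_index, message_to_add)
        current_tokens_used += tokens_to_add
        next_message_to_add_index -= 1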

Where do we get relevant_memory? In chat.py, around line 72, we have:

            relevant_memory = permanent_memory.get_relevant(str(full_message_history[-9:]), 10)

where memory.get_relevant(text, n_memories) is essentially a semantic memory retrieval: text is the probe against the memory (an array or database of embedding vectors), and we retrieve the top-k most similar items. See get_relevant(self, text: str, k: int) -> List[Any] in memory/local.py for reference.
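
A minimal sketch of that retrieval (dot-product scoring over the stored embeddings, then top-k; the actual memory/local.py code may differ in details):

    import numpy as np

    def get_relevant(probe_text, texts, embeddings, k=10):
        # probe_text: query string; texts: stored memories; embeddings: matrix of their vectors
        probe = np.array(get_ada_embedding(probe_text))  # embed the probe (helper sketched above)
        scores = embeddings @ probe                      # one similarity score per stored memory
        top_k = np.argsort(scores)[-k:][::-1]            # indices of the k best matches, best first
        return [texts[i] for i in top_k]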

In the above case, we get the top 10 most relevant memories (as text), based on the most recent 9 items from the full history.

Isn't this a flaw, where the context is whatever is most relevant to the last 9 items, plus however much of full_message_history fits in the remaining space? That means we may forget earlier memories?