Hattoff / Raven-LK

MIT License
2 stars 0 forks

Need feature that will start a new chat window inside the API #2

Open jmanhype opened 1 year ago

jmanhype commented 1 year ago

```
There is enough space in the cache...
indexing memory (a72c1e82-9082-4beb-944b-eb261b3ba948)
saving memory locally...
Parenting child memories...
Saving vector to pinecone.
Error communicating with OpenAI: This model's maximum context length is 8192 tokens. However, your messages resulted in 8219 tokens. Please reduce the length of the messages.
Retrying in 20...
Error communicating with OpenAI: This model's maximum context length is 8192 tokens. However, your messages resulted in 8219 tokens. Please reduce the length of the messages.
Retrying in 30...
Error communicating with OpenAI: This model's maximum context length is 8192 tokens. However, your messages resulted in 8219 tokens. Please reduce the length of the messages.
Retrying in 40...
Error communicating with OpenAI: This model's maximum context length is 8192 tokens. However, your messages resulted in 8273 tokens. Please reduce the length of the messages.
Retrying in 20...
Error communicating with OpenAI: This model's maximum context length is 8192 tokens. However, your messages resulted in 8273 tokens. Please reduce the length of the messages.
Retrying in 30...
Error communicating with OpenAI: This model's maximum context length is 8192 tokens. However, your messages resulted in 8273 tokens. Please reduce the length of the messages.
Retrying in 40...
Generating eidetic memory...
Error communicating with OpenAI: 'completion_tokens'
Error communicating with OpenAI: 'completion_tokens'
Error communicating with OpenAI: 'completion_tokens'
Error communicating with OpenAI: 'completion_tokens'
adding memory to cache (0)
There is enough space in the cache...
indexing memory (6758f277-5b83-42ad-b9d2-13b629bf1e2a)
saving memory locally...
Parenting child memories...
Saving vector to pinecone.
Saving state...
State saved...
```

Hattoff commented 1 year ago

@jmanhype Sorry for the delay. I will be sure to check my notifications more frequently. Your token settings are just slightly too high and you are running over a bit. The token count may be slightly off depending on the prompt contents and how the API tokenizes them, so as a safe bet I would reduce your token parameters by about 10%.

Error communicating with OpenAI: This model's maximum context length is 8192 tokens. However, your messages resulted in 8219 tokens. Please reduce the length of the messages.

You are over by just a few tokens, so the API refuses the request. What do you have set up for max_token_input in the config file?

```ini
[open_ai]
api_key=api_keys/key_openai.txt
model=gpt-3.5-turbo-0301
input_engine=text-embedding-ada-002
max_token_input=4097
...
[memory_management]
# The maximum number of tokens per cache, leaving room remaining for prompts, instructions, and responses
cache_token_limit=1000
```

The max token input is the number of tokens the API endpoint will accept, and it seems you already have that set. The cache token limit determines how many tokens any given cache can hold before it needs to compress. Reducing that from what you have now may help.
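
If you want to check how close a prompt is to that limit before sending it, here is a minimal sketch, not part of Raven-LK, that uses the tiktoken package to estimate the token count of a chat payload; the helper name and the per-message overhead constants are illustrative assumptions:

```python
# Illustrative sketch, not Raven-LK code: estimate how many tokens a chat
# payload will use so the total can be compared against max_token_input.
import tiktoken

def count_chat_tokens(messages, model="gpt-3.5-turbo-0301"):
    """Rough token count for a list of {"role": ..., "content": ...} dicts.
    The +4 per message and +2 reply priming are approximations of the
    chat-format overhead, so treat the result as an estimate."""
    encoding = tiktoken.encoding_for_model(model)
    total = 0
    for message in messages:
        total += 4  # approximate per-message framing overhead
        for value in message.values():
            total += len(encoding.encode(value))
    return total + 2  # approximate priming for the assistant reply

max_token_input = 4097
messages = [{"role": "user", "content": "Hello, Raven."}]
print(count_chat_tokens(messages), "of", max_token_input, "tokens used")
```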

And if you look near the top of the ConversationManagement.py file, there is an eidetic memory log and an episodic memory log which look like this:

```python
class ConversationManager:
    def __init__(self):
        self.__config = configparser.ConfigParser()
        self.__config.read('config.ini')
        self.__memory_manager = MemoryManager()
        self.__eidetic_memory_log = self.MemoryLog(1000, 4)
        self.__episodic_memory_log = self.MemoryLog(1000, 4)
```

The two numbers passed to those MemoryLog objects are the maximum number of tokens the conversation will hold before removing old messages and the minimum number of chat messages that will display. You would want to reduce the maximum number of tokens (but keep it equal to or larger than the cache_token_limit in the config file) and potentially reduce the minimum messages.
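
To make those two parameters concrete, here is a rough sketch of that behaviour; it is not the actual MemoryLog class, just an illustration of a token-capped log that drops the oldest messages once the cap is exceeded while keeping a minimum number of messages:

```python
# Hypothetical sketch of the behaviour described above; the real MemoryLog
# in ConversationManagement.py may differ in its details.
class TokenCappedLog:
    def __init__(self, max_tokens=1000, min_messages=4):
        self.max_tokens = max_tokens      # cap before old messages are dropped
        self.min_messages = min_messages  # never trim below this many messages
        self.messages = []                # list of (text, token_count) pairs

    def add(self, text, token_count):
        self.messages.append((text, token_count))
        self._trim()

    def _trim(self):
        # Drop the oldest messages until the total is under the token cap,
        # but always keep at least min_messages entries visible.
        while (self._total_tokens() > self.max_tokens
               and len(self.messages) > self.min_messages):
            self.messages.pop(0)

    def _total_tokens(self):
        return sum(tokens for _, tokens in self.messages)
```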

I will work on consolidating and calculating these tokens more precisely so you don't run into the issue as often. I will also implement some failsafe measures to make sure that the conversation log doesn't eat up more than it should if certain messages are particularly long. Apologies. When I get some time this weekend I will prioritize this. Let me know if you are still having issues and I will reach out when I have a few fixes in place.
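
As one example of the kind of failsafe meant here (purely a sketch under assumed names, not the planned implementation), an overly long individual message could be clamped to a per-message token budget before it enters the conversation log:

```python
# Illustrative failsafe sketch: clamp any single message to a per-message
# token budget before it is added to the conversation log.
import tiktoken

def clamp_message(text, max_tokens=500, model="gpt-3.5-turbo-0301"):
    encoding = tiktoken.encoding_for_model(model)
    tokens = encoding.encode(text)
    if len(tokens) <= max_tokens:
        return text
    # Keep the first max_tokens tokens and mark the cut so it is visible.
    return encoding.decode(tokens[:max_tokens]) + " [truncated]"
```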

jmanhype commented 1 year ago

Thank you so very much. Yes, I set my config to GPT-4 and 8192 for the token limit. I am actively trying to store my ecommerce site on the bot so I can have it in its memory; that way it can update the site without losing track of certain functions. Question: when it gets to that limit, should we open a new chat instance so that the limit can be bypassed?
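
For reference, one rough sketch of that idea (the function names and flow here are hypothetical, not Raven-LK's API) is to roll the conversation over into a fresh message list once a token threshold is approached, carrying only a short summary forward:

```python
# Hypothetical sketch of "start a new chat instance when the limit is hit":
# roll over to a fresh message list and carry a short summary forward.
def maybe_rollover(messages, count_tokens, summarize,
                   token_limit=8192, headroom=1000):
    """If the conversation is close to the model's context limit, start a
    new message list seeded with a summary of the old one.
    count_tokens and summarize are caller-supplied functions (assumptions)."""
    if count_tokens(messages) < token_limit - headroom:
        return messages  # still enough room, keep the current chat
    summary = summarize(messages)  # e.g. ask the model for a recap
    return [{"role": "system",
             "content": "Summary of the previous chat: " + summary}]
```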


jmanhype commented 1 year ago

lmk