Closed debuggerone closed 2 months ago
Some kind of data storage is needed to keep track of rate limits.
Before an agent starts to run, we use tiktoken to measure the size of the prompt and add the desired max_tokens, then we look up the tokens and requests used in the current minute.
If there is at least one request left, we add the calculated token count of the current request to the tokens used in the current minute; if that total is below the rate limit, we can start the process.
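A minimal sketch of the pre-check described above, with the token counting left to the caller (e.g. `len(tiktoken.encoding_for_model(model).encode(prompt))`). The `MinuteBudget` class and the limit values are illustrative, not from any real config:

```python
import time

class MinuteBudget:
    """Hypothetical per-minute token/request budget (illustrative limits)."""

    def __init__(self, tpm: int = 200_000, rpm: int = 500):
        self.tpm, self.rpm = tpm, rpm
        self.start = time.monotonic()
        self.tokens = 0
        self.requests = 0

    def can_start(self, prompt_tokens: int, max_tokens: int) -> bool:
        now = time.monotonic()
        if now - self.start >= 60:        # new minute: reset the window
            self.start, self.tokens, self.requests = now, 0, 0
        needed = prompt_tokens + max_tokens
        # at least one request left, and the added tokens stay under the limit
        if self.requests + 1 > self.rpm or self.tokens + needed > self.tpm:
            return False
        self.requests += 1
        self.tokens += needed
        return True
```

Note this counts max_tokens up front, so it over-reserves when completions come back shorter than the maximum.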
I mentioned this in the discussion thread. I don't think you want to overthink this. For rate limiting, it's honestly better to let the error happen and then back off for the period mentioned in the response header, and I believe OpenAI's library does that automatically. I'm happy to discuss the various cons of trying to track this stuff in a database.
Nah, you're right. I've removed the database.
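The "let the error happen and back off" approach from the comment above can be sketched as follows. The `RateLimited` exception and its `retry_after` attribute are assumed shapes for illustration; with the real OpenAI Python client you would normally just rely on its built-in retries (configurable via `max_retries`) instead of writing this yourself:

```python
import time

class RateLimited(Exception):
    """Hypothetical rate-limit error carrying the server's Retry-After value."""

    def __init__(self, retry_after=None):
        self.retry_after = retry_after  # seconds, or None if header absent

def with_backoff(call, max_attempts: int = 5):
    """Call `call()`, retrying on RateLimited with header-driven backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise                       # out of attempts: surface the error
            # prefer the server-suggested wait, else exponential backoff
            delay = err.retry_after if err.retry_after is not None else 2 ** attempt
            time.sleep(delay)
```

The main advantage over a database is that the server is the single source of truth: there is no local counter that can drift from the real quota.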
Subtask Overview
This subtask involves implementing concurrency management using Python’s asyncio to handle parallel tasks effectively during the AgentM migration.
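The concurrency management this subtask describes can be sketched as below, assuming agents are plain async callables: a semaphore caps how many run at once while `asyncio.gather` drives them in parallel. The `run_agent` coroutine and the limit of 5 are illustrative stand-ins:

```python
import asyncio

async def run_agent(name: str) -> str:
    await asyncio.sleep(0.01)          # stand-in for a real LLM call
    return f"{name}: done"

async def run_all(names, limit: int = 5):
    sem = asyncio.Semaphore(limit)     # cap concurrently running agents

    async def bounded(name):
        async with sem:
            return await run_agent(name)

    # gather returns results in the order the coroutines were passed in
    return await asyncio.gather(*(bounded(n) for n in names))

results = asyncio.run(run_all(["a", "b", "c"]))
```

Bounding concurrency this way also plays well with the backoff approach above, since fewer simultaneous requests means fewer 429s to retry.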
Tasks
- Implement concurrency management using the asyncio framework.

Acceptance Criteria
- Parallel tasks are managed with asyncio.

Additional Info
Some kind of data storage is needed to keep track of rate limits.
Before an agent starts to run, we use tiktoken to measure the size of the prompt and add the desired max_tokens, then we look up the tokens and requests used in the current minute.
If there is at least one request left, we add the calculated token count of the current request to the tokens used in the current minute; if that total is below the rate limit, we can start the process.