Stevenic / agentm-py

A library of "Micro Agents" that make it easy to add reliable intelligence to any application.
MIT License

[Subtask] Implement concurrency management with asyncio #8

Closed: debuggerone closed this issue 2 months ago

debuggerone commented 2 months ago

Subtask Overview

This subtask involves implementing concurrency management using Python’s asyncio to handle parallel tasks effectively during the AgentM migration.
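One common way to manage parallel tasks with asyncio is to cap how many run at once with a semaphore. The following is a minimal sketch of that pattern, not code from the agentm-py repository; `run_bounded`, `fake_agent`, and the `max_concurrency` parameter are hypothetical names chosen for illustration.

```python
import asyncio


async def run_bounded(coros, max_concurrency=3):
    """Run awaitables in parallel, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(coro):
        # Each task waits here until one of the slots is free
        async with semaphore:
            return await coro

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(guarded(c) for c in coros))


async def fake_agent(name, delay=0.01):
    # Hypothetical stand-in for a micro agent's model call
    await asyncio.sleep(delay)
    return f"{name}: done"


def demo():
    agents = [fake_agent(f"agent-{i}") for i in range(5)]
    return asyncio.run(run_bounded(agents, max_concurrency=2))
```

Calling `demo()` runs five stand-in agents with no more than two in flight at a time. A semaphore keeps the calling code simple, since callers still just await a list of results.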

Tasks

Acceptance Criteria

Additional Info

Some kind of data storage is needed to keep track of rate limits.

Before an agent starts to run, we use tiktoken to measure the size of the prompt in tokens and add the desired maximum number of completion tokens. Then we look up the tokens and requests already used in the current minute.

If at least one request is left, we add the calculated tokens of the current request to the tokens used in the current minute. If the total is below the rate limit, we can start the process.
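The bookkeeping proposed above can be sketched with in-memory counters for one minute-long window. This is only an illustration under stated assumptions, not code from the repository: `MinuteBudget` and `try_reserve` are hypothetical names, the token count is passed in rather than computed (in practice it might come from something like `len(tiktoken.encoding_for_model(model).encode(prompt))`), and a request that lands exactly on the limit is allowed.

```python
import time
from dataclasses import dataclass, field


@dataclass
class MinuteBudget:
    """In-memory counters for one rate-limit window (hypothetical helper)."""
    max_requests: int
    max_tokens: int
    window_start: float = field(default_factory=time.monotonic)
    used_requests: int = 0
    used_tokens: int = 0

    def try_reserve(self, prompt_tokens, max_completion_tokens, now=None):
        """Reserve budget for one request; return True if it may start."""
        now = time.monotonic() if now is None else now
        if now - self.window_start >= 60:
            # A new minute has begun: reset the counters
            self.window_start = now
            self.used_requests = 0
            self.used_tokens = 0
        # Worst-case cost: prompt size plus the requested completion cap
        needed = prompt_tokens + max_completion_tokens
        if self.used_requests + 1 > self.max_requests:
            return False  # no requests left in this minute
        if self.used_tokens + needed > self.max_tokens:
            return False  # would exceed the token limit
        self.used_requests += 1
        self.used_tokens += needed
        return True
```

A caller would check `budget.try_reserve(prompt_tokens, max_completion_tokens)` before starting an agent and wait or queue the request when it returns `False`. The counters live in one process only, so several workers would each need shared state, which is part of the cost of tracking limits on the client side.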

Stevenic commented 2 months ago

> Some kind of data storage is needed to keep track of rate limits.
>
> Before an agent starts to run, we use tiktoken to measure the size of the prompt in tokens and add the desired maximum number of completion tokens. Then we look up the tokens and requests already used in the current minute.
>
> If at least one request is left, we add the calculated tokens of the current request to the tokens used in the current minute. If the total is below the rate limit, we can start the process.

I mentioned this in the discussion thread. I don't think you want to overthink this. For rate limiting, it's honestly better to let the error happen and then back off for the period given in the response header, and I believe OpenAI's library does that automatically. I'm happy to discuss the various downsides of trying to track this stuff in a database.
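For readers who want to see what "let the error happen and then back off" looks like, here is a minimal sketch. It assumes a hypothetical `RateLimitError` that carries the wait time from the response header as `retry_after`; it is not OpenAI's exception class, and `call_with_backoff` is an illustrative name rather than part of any library. When the server gives no hint, the sketch falls back to exponential backoff.

```python
import asyncio


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 error raised by an API client."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        # Seconds to wait, taken from the response header when present
        self.retry_after = retry_after


async def call_with_backoff(make_call, max_retries=3, base_delay=1.0,
                            sleep=asyncio.sleep):
    """Await make_call(); on RateLimitError wait, then try again."""
    for attempt in range(max_retries + 1):
        try:
            return await make_call()
        except RateLimitError as err:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            if err.retry_after is not None:
                delay = err.retry_after  # trust the server's hint
            else:
                delay = base_delay * 2 ** attempt  # exponential fallback
            await sleep(delay)
```

Here `make_call` is a zero-argument function returning a fresh awaitable for each attempt, for example `lambda: agent.run(prompt)` with a hypothetical `agent`. If the client library already retries on its own, as suggested above for OpenAI's, a wrapper like this is unnecessary.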

debuggerone commented 2 months ago

Nah, you are right. I've removed the database.