There is no clean way of doing it now; it relies on the default retry logic provided by LangChain. Maybe we could add something to support that. What would the ideal behavior be: read the retry instruction, wait that long, and then pick the task back up? Also, what version of crewAI are you using? The new v0.1.14 should use the newest version of the OpenAI lib; I don't know whether they have any handling around this themselves, but they might.
One simple way to deal with this would be to add a random 5-10 second wait somewhere in the list of tasks that crewAI runs; that would hold off until OpenAI allows a bigger limit. The pro way to deal with it would be to catch the RateLimitError exception and do exponential backoff, but as you wrote, this is LangChain logic, so it might be harder. I'm also wondering whether any LangChain params could be tweaked for this.
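For reference, a minimal sketch of what that "pro way" could look like, assuming the openai>=1.0 client (where RateLimitError is exported at the package top level); the helper name is just illustrative, not anything crewAI ships:

import random
import time

import openai  # RateLimitError is top-level in openai>=1.0


def call_with_backoff(fn, max_retries=5, base_delay=2.0):
    """Retry fn() with exponential backoff plus jitter on rate limits."""
    for attempt in range(max_retries):
        try:
            return fn()
        except openai.RateLimitError:
            if attempt == max_retries - 1:
                raise
            # 2s, 4s, 8s, ... plus up to 1s of jitter
            time.sleep(base_delay * 2 ** attempt + random.random())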
@joaomdmoura can you see if you could use the LimitAwaitChatOpenAI method from https://github.com/alex4321/langchain-openai-limiter? Would that make sense?
My only problem with a fixed wait is that it would impact everyone, even people who don't run into rate limiting, but it could be an option. I'll check out the project you shared; it definitely looks interesting. I was assuming that was something LangChain would have by default, but we could mimic that, yeah.
One suggestion would be to parse the error message itself and extract how long OpenAI is telling the user to wait. Example error message:
Rate limit reached for gpt-4 in organization org-XXXXXXX on tokens_usage_based per min: Limit 10000, Used 6256, Requested 4254. Please try again in 3.06s. Visit https://platform.openai.com/account/rate-limits to learn more.
So in this case the next request should come after 3.06 seconds.
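A quick sketch of that parsing idea (the regex and helper are illustrative assumptions, not existing crewAI or LangChain code):

import re
import time


def wait_for_suggested_delay(error_message: str) -> None:
    """Sleep for as long as OpenAI's error message asks, if it says."""
    match = re.search(r"Please try again in ([\d.]+)s", error_message)
    if match:
        time.sleep(float(match.group(1)))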
@paixaop I think that the library https://github.com/alex4321/langchain-openai-limiter and methods like LimitAwaitOpenAIEmbeddings do exactly what you're describing
Is there any workaround for the rate limiter? I am not able to run any of the examples to try this framework out :(
I don't have time to do a deep dive into this, but it looks like these lines in crewAI define the default OpenAI LLM factory:
llm: Optional[Any] = Field(
    default_factory=lambda: ChatOpenAI(
        temperature=0.7,
        model_name="gpt-4",
    ),
    description="Language model that will run the agent.",
)
If you go back to their docs for agent setup, they have this example:
# Define your agents with roles and goals
researcher = Agent(
    role='Senior Research Analyst',
    goal='Uncover cutting-edge developments in AI and data science',
    backstory="""You work at a leading tech think tank.
    Your expertise lies in identifying emerging trends.
    You have a knack for dissecting complex data and presenting
    actionable insights.""",
    verbose=True,
    allow_delegation=False,
    tools=[search_tool]
    # You can pass an optional llm attribute specifying what model you wanna use.
    # It can be a local model through Ollama / LM Studio or a remote
    # model like OpenAI, Mistral, Anthropic or others (https://python.langchain.com/docs/integrations/llms/)
    #
    # Examples:
    # llm=ollama_llm # was defined above in the file
    # llm=ChatOpenAI(model_name="gpt-3.5", temperature=0.7)
)
In this example they show how to manually set llm. To mirror what they are doing internally, we could do:
llm=ChatOpenAI(model_name="gpt-4", temperature=0.7)
So, if I understand the code correctly, you might be able to wrap the LLM you pass to the agent. I haven't had time to test this, but maybe something like this could work...
llm=LimitAwaitChatOpenAI(
    chat_openai=ChatOpenAI(model_name="gpt-4", temperature=0.7),
    limit_await_timeout=60.0,
    limit_await_sleep=0.1,
)
I'll give it a try later to see if it works; if not, maybe this will get some folks closer... -MM
@sartian That would be amazing if limit_await_timeout=60 works. I need to be able to set a wait timeout, but I haven't hit my limit yet, so I can't test whether it works or not. Did you get any results from your tests? Many thanks in advance.
@sartian I just tried, and it seems the limiter package is broken in that it requires an older version of the openai package; I opened an issue there: https://github.com/alex4321/langchain-openai-limiter/issues/4
I can confirm that this is working now
I need to add new docs around this, but it's great to know it's working.
For those with limited OpenAI access, the rate limit is hit in the "stock_analysis" example at: https://github.com/joaomdmoura/crewAI-examples?tab=readme-ov-file
Is there a good place within crewAI to add handling for this? If so, please link to the class/function and I will give it a shot. Thanks!
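Not an official pointer, but for anyone who wants to experiment: one hedged sketch is to wrap whatever invokes the LLM in a retry policy using the tenacity library (the wrapper function is hypothetical, not an existing crewAI hook, and assumes openai>=1.0 plus a LangChain chat model with .invoke):

import openai
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential


@retry(
    retry=retry_if_exception_type(openai.RateLimitError),
    wait=wait_exponential(multiplier=1, min=2, max=60),  # 2s, 4s, ... capped at 60s
    stop=stop_after_attempt(6),
    reraise=True,
)
def invoke_llm(llm, prompt):
    # Hypothetical helper: apply this policy wherever crewAI calls the model.
    return llm.invoke(prompt)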