Closed iplayfast closed 8 months ago
Same here. I'm using the Ollama openhermes2.5-mistral model locally; it loads fine and works for a bit, then crashes with this error...
openai.error.RateLimitError: You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.
I was also seeing this message pop up a lot between the agent conversations...
Error executing tool. Missing exact 3 pipe (|) separated values.
Hey folks, I'll test it out. I believe it might be because the scraping tool also uses an agent within itself; it was kind of a hacky way to get a summary of a webpage, so setting that to use the local model should get rid of it. I'll try as well, and I'll add better docs around that in the examples repo.
> Error executing tool. Missing exact 3 pipe (|) separated values.
Yup, smaller models seem to struggle with that default tool. I'm doing some updates to make open-source small models work better; I should cut a new version with some initial improvements later today.
I am trying to run Ollama locally per the instructions (from my docker container); however, for this example it appears to still be using OpenAI. Not sure, but I can see from the docker logs that it's not running the Ollama model. Likewise, I don't see a call to the OpenAI API in this example. Everything appears to be running, but I'm not sure how or why. What am I missing?
@joaomdmoura - I've been playing (very roughly) with RAG using text files I've loaded into Chroma. I am getting the error messages you refer to above (and also, THANK YOU for creating this; when it was working before these errors started to appear, the results were superb!):

> Agent stopped due to iteration limit or time limit.
> Missing exact 3 pipe (|) separated values.

etc. etc. As you can see, I'm exclusively using Ollama, so no rate limits should apply?
```python
import os
import sys

from crewai import Agent, Task, Crew, Process
from langchain.llms import Ollama
from langchain.vectorstores import Chroma
from langchain.embeddings import OllamaEmbeddings
from langchain.chains import RetrievalQA

ollama_model = Ollama(model="openhermes")

vectorstore_file = 'chromadb/vectorstore.db'
vectorstore = Chroma(persist_directory=vectorstore_file,
                     embedding_function=OllamaEmbeddings())

qachain = RetrievalQA.from_chain_type(ollama_model,
                                      retriever=vectorstore.as_retriever())


class ResearchAgent(Agent):
    def perform_task(self, task):
        answer = qachain({"query": task.description})
        return answer


class WriterAgent(Agent):
    def perform_task(self, task, research_data=None):
        writing_prompt = ("Write a detailed reply about: " + research_data
                          if research_data else "No research data provided.")
        # generate() expects a list of prompts; call the model directly
        # for a single prompt string
        content = ollama_model(writing_prompt)
        return content


researcher = ResearchAgent(
    role='Researcher',
    goal='Discover new insights using RAG',
    backstory='A dedicated researcher focused on utilizing RAG for insightful analysis',
    llm=ollama_model,
)

writer = WriterAgent(
    role='Writer',
    goal='Create engaging content',
    backstory='An experienced writer known for transforming complex topics into engaging responses',
    llm=ollama_model,
)

question = sys.argv[1] if len(sys.argv) > 1 else "Default question"

task1 = Task(description=question, agent=researcher)
task2 = Task(description='Write a useful reply based on the responses', agent=writer)

crew = Crew(agents=[researcher, writer], tasks=[task1, task2],
            process=Process.sequential)
result = crew.kickoff()
print(result)
```
I will try with llama2 this evening and report back. Just thought I'd add some source code for commentary, in case it's useful for identifying the rate-limiting errors when running completely offline.
> Agent stopped due to iteration limit or time limit.
I'll double check, but IIRC this is meant to stop agents from running in a loop, like trying to use a nonexistent tool too many times in a row. If you're not on it yet, I'd recommend updating to crewAI v0.1.14; it has some initial improvements around task execution that should make it easier on smaller models. I'd also recommend using openhermes 2.5, which is supposed to be a better and still small model :)
I had the same issue, but I was using Colab and changing the files did not affect the runtime. Now I'm running it locally and the error is gone. As mentioned, you have to replace the LLMs in every agent, including the browser tools. Also, the embeddings use OpenAI embeddings by default; I changed mine to e.g. Cohere, but you can use any other embedding function supported by e.g. LangChain.
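For example, swapping the embedding function is just a case of instantiating a different LangChain embeddings class wherever the OpenAI one was constructed (a sketch only; class and module names vary across LangChain releases, and the model name here is just an illustration):

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# Any LangChain-compatible embedding function can stand in for
# OpenAIEmbeddings(); this one runs fully locally.
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = Chroma(persist_directory="chromadb/vectorstore.db",
                     embedding_function=embeddings)
```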
It is not very hard to guess that if it asks you for an OpenAI key, something triggered a definition that wasn't configured to use Ollama. By default it is set to use OpenAI, so just identify which function triggered the unconfigured agent and set it. "\/@_@\/" - I don't think that's an issue. Maybe it's a feature they could add: when installing crewai, give you the choice of whether the default LLM should be OpenAI or Ollama, and then somehow easily select the model... IDK
> I am trying to run Ollama locally per the instructions (from my docker container) however for this example, it appears to still be using OpenAI? Not sure, but I can see from the docker logs it's not running the Ollama model. Likewise, I don't see a call in this example for the OpenAI API? Everything appears to be running but not sure how or why? What am I missing?
Just make sure you set llm to ollama FOR EACH AGENT + CREW, because the default option is to use OpenAI if llm is not specified...
I'll add an option to the crew as well so you can set the LLM once and forget if you want the same llm for all your agents.
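In the meantime, the "set it on every agent" advice looks roughly like the snippet below (a minimal sketch based on the code earlier in this thread; the import path moved to `langchain_community.llms` in newer LangChain versions, and the task descriptions are placeholders):

```python
from langchain.llms import Ollama
from crewai import Agent, Task, Crew, Process

# One local model instance, passed explicitly everywhere an LLM is
# needed; any Agent created without llm= falls back to OpenAI.
llm = Ollama(model="openhermes")

researcher = Agent(
    role="Researcher",
    goal="Discover new insights",
    backstory="A dedicated researcher",
    llm=llm,
)
writer = Agent(
    role="Writer",
    goal="Create engaging content",
    backstory="An experienced writer",
    llm=llm,
)

task1 = Task(description="Research the topic", agent=researcher)
task2 = Task(description="Write a reply", agent=writer)

crew = Crew(agents=[researcher, writer], tasks=[task1, task2],
            process=Process.sequential)
```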
> Agent stopped due to iteration limit or time limit.
The version I'm launching today will make sure to force an answer out of the LLM in case it hits the max iteration limit.
You will also now be able to set a different max iteration limit per agent using `max_iter`.
This should now be fixed; I will add instructions around `max_iter` in the new docs.
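For reference, the per-agent limit mentioned above would presumably be set like this (a sketch; the value shown is hypothetical, so check the docs for the actual default and semantics):

```python
from crewai import Agent

# max_iter caps the think/act loop per agent; once the cap is hit the
# agent is forced to produce an answer instead of looping forever.
researcher = Agent(
    role="Researcher",
    goal="Discover new insights",
    backstory="A dedicated researcher",
    max_iter=5,  # hypothetical value; tune per agent
)
```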
Hey @joaomdmoura, did this:

> I'll add an option to the crew as well so you can set the LLM once and forget if you want the same llm for all your agents.

end up getting implemented?
Back to the original question, I too am running into errors like:
openai.RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}
when attempting to run completely locally. I'm ensuring that I'm doing something like this:
```python
from langchain_community.llms import Ollama
# ...
llm = Ollama(model='mistral:7b')  # , base_url='http://192.168.1.124:11434'
```
and then I'm passing `llm` to each agent, and the crew, as `llm=llm`, sort of thing. And I'm still getting this error. Do you or @HyperUpscale spot any mistake I'm making?
I'm running `crewai[tools]==0.28.1`.
Thanks for your help and time!
Edit: Based on your post here, I'm now passing `manager_llm` to the crew. I'm still having issues, but I think I'm getting closer to the problem.
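For anyone else landing here, a combined sketch of what I'm trying (the `manager_llm` keyword is from the post linked above; I'm assuming it pairs with the hierarchical process, and the agents/tasks are defined as shown earlier in the thread):

```python
from langchain_community.llms import Ollama
from crewai import Crew, Process

llm = Ollama(model="mistral:7b")

# researcher, writer, task1, task2 defined elsewhere,
# each agent created with llm=llm
crew = Crew(
    agents=[researcher, writer],
    tasks=[task1, task2],
    process=Process.hierarchical,  # assumption: manager_llm applies here
    manager_llm=llm,               # otherwise the manager defaults to OpenAI
)
```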
Edit Edit: Perhaps it's an issue with langchain-ai? See here.
Edit Edit Edit: I ended up solving my problem and captured it as a new issue here: https://github.com/joaomdmoura/crewAI/issues/447
CrewAI, in its current form, is set to use OpenAI keys by default, no matter how you specify Ollama for local LLM use. The code shown on the CrewAI GitHub site and on YouTube does not work. I have spent hours configuring CrewAI to work with Ollama / Mistral as a local LLM. CrewAI always shows an error pointing to OpenAI, as other users have noted above. CrewAI simply does not work at present, which leads me to doubt CrewAI as a product.
@staimaster it's because the crew uses the OpenAI embeddings by default; to solve it you need to use another embedder.
This comment shows how to solve it. The crew doesn't support Ollama embeddings, but you can use other local embeddings like Hugging Face or GPT4All: https://github.com/joaomdmoura/crewAI/issues/447#issuecomment-2063711293
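Based on the linked comment, the workaround is roughly to hand the crew a non-OpenAI embedder; the exact provider names and config keys depend on your crewai version, so treat this as a sketch only:

```python
from crewai import Crew

# Hypothetical embedder config pointing crew memory at a local
# provider instead of the default OpenAI embeddings.
crew = Crew(
    agents=[researcher, writer],
    tasks=[task1, task2],
    memory=True,
    embedder={
        "provider": "huggingface",  # or "gpt4all", per the linked comment
        "config": {"model": "sentence-transformers/all-MiniLM-L6-v2"},
    },
)
```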
Edit: @staimaster @madhudson, the link works now.
> @staimaster it's because crew use by default the openAI embeddings , tho solve it you need to use another embedder
> this comment show how to solve it , crew don't support ollama embeddings , but you can use other local embeddings like hugginface or gpt4all https://github.com/joaomdmoura/crewAI/issues/447#issuecomment-2063711293
Ok, thank you for the clarification. Please repost the link for where I can use local embeddings like Hugging Face or GPT4All, as there is nothing at the link above.
> @staimaster it's because crew use by default the openAI embeddings , tho solve it you need to use another embedder this comment show how to solve it , crew don't support ollama embeddings , but you can use other local embeddings like hugginface or gpt4all https://github.com/joaomdmoura/crewAI/issues/447#issuecomment-2063711293
>
> Ok thank you for the clarification, please repost the link for where I can use local embeddings like hugginface or gpt4all, as there is nothing on the link above
I'm trying to find the same
I'm having this same problem (CrewAI demands an API key for OpenAI even when configured strictly for local LLMs via Ollama).
I have less than zero interest in paying some amorphous, opaque business entity to handle my private data; it is exactly the thing I'm trying to get away from across my use of the internet.
Please do us all a favor: either fix this, or stop promoting CrewAI as interoperable with local LLMs. I literally bought hardware to run this configuration, after considerable research into the available open-source offerings, watching dozens of instructional videos on YouTube, and reading your documentation.
My hardware is more than capable; your software simply fails to work as (extensively) advertised.
Added this in the examples issue (https://github.com/joaomdmoura/crewAI-examples/issues/2). I tried the stock analysis example, and it looked like it was working until it needed an OpenAI API key.
Also, it expects an OpenAI key to be in the environment; I feel it should only look for a key when it is actually going to use it.
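That "only look for a key when it is actually going to use it" behaviour could be sketched generically like this (plain Python, not crewAI's actual internals; all names here are hypothetical):

```python
import os


def get_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the key lazily, failing only when the key is actually needed."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; it is only required when the OpenAI "
            "backend is actually invoked."
        )
    return key


def call_llm(prompt: str, backend: str = "ollama") -> str:
    # The key is only resolved on the OpenAI path; a local backend
    # never touches the environment variable.
    if backend == "openai":
        _key = get_api_key()
        return f"openai:{prompt}"
    return f"local:{prompt}"
```

With this layout, `call_llm("hi")` succeeds on a machine with no `OPENAI_API_KEY` set, and the error only surfaces if the OpenAI path is actually taken.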