Are you using the correct chat format for the model? https://github.com/abetlen/llama-cpp-python/blob/1e60dba082464b8828dca0a6d05a2fe46fcc4f7c/README.md?plain=1#L285
Hi, thanks for your answer. This might indeed make a difference; however, with my current CrewAI setup I am unsure how to do that. I believe that the 'role', 'goal' and 'backstory' are used by CrewAI to compose a system prompt, which is all the green text in my image above. This means that I have no control over the actual format, or do I?
If you could guide me on how to do that, that would be awesome.
How is the Llama 2 model being run? llama-cpp-python? If it's llama-cpp-python, then you can invoke it using the --chat_format option like so:

```
python3 -m llama_cpp.server --chat_format llama-2 --model ./models/llama-2.gguf
```
If you are using the llama.cpp server, use the --chat-template option:

```
./server -m ./models/llama-2.gguf --chat-template llama2
```
Read more here
llama-cpp-python is not used. The model is run using the llama.cpp server and does not make use of the --chat-template option. The reason is that, apart from Llama, there are several other models running for which chat templates are not supported and cannot be added manually (see the link you provided). Instead, we apply a template manually before passing the final prompt to the /completions endpoint. Would it be possible to do something similar with CrewAI?
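Roughly like this, simplified (the URL, payload fields and template string here are illustrative, not our exact values):

```python
# Simplified sketch of our manual templating; the URL, payload fields and
# template string are illustrative rather than our exact production values.
import requests

LLAMA2_TEMPLATE = "<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

def complete(system: str, user: str) -> str:
    prompt = LLAMA2_TEMPLATE.format(system=system, user=user)
    resp = requests.post(
        "http://localhost:8080/completions",  # in reality, our remote llama.cpp server
        json={"prompt": prompt, "n_predict": 256},
    )
    resp.raise_for_status()
    return resp.json()["content"]
```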
Have you seen this? https://github.com/joaomdmoura/crewAI/blob/main/docs/how-to/LLM-Connections.md
You can pass in any LLM instance from LangChain. You might be able to derive from one of those models or just implement your own custom LangChain LLM class; a rough sketch of that idea follows below. Perhaps someone has done that already with the ChatOpenAI LLM class from LangChain.
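Something along these lines might work (an untested sketch; the endpoint, payload fields and template are assumptions based on your description of the server):

```python
# Untested sketch of a custom LangChain LLM that applies the Llama 2 template
# itself and calls a llama.cpp server. The endpoint, payload fields and
# template are assumptions, not a verified integration.
from typing import Any, List, Optional

import requests
from langchain.llms.base import LLM

class LlamaCppServerLLM(LLM):
    base_url: str = "http://localhost:8080"  # assumption: your server's URL

    @property
    def _llm_type(self) -> str:
        return "llama-cpp-server"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        # Wrap the raw prompt in the Llama 2 instruction format before sending.
        wrapped = f"<s>[INST] {prompt} [/INST]"
        resp = requests.post(
            f"{self.base_url}/completions",
            json={"prompt": wrapped, "n_predict": 512, "stop": stop or []},
        )
        resp.raise_for_status()
        return resp.json()["content"]
```

An instance of such a class could then be passed to the Agent via its llm argument.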
That said, we haven't really determined if the behaviour is related to chat templates being wrong yet. It's just the first thing I look at when getting back no response or weird responses from llama.cpp.
You can try adding this to the top of your own source:

```python
from langchain.globals import set_debug

set_debug(True)
```

See more at https://python.langchain.com/docs/guides/debugging
Thanks! I am looking into how to utilize langchain to get this running better.
Meanwhile, I now get some results from the (unchanged) code above, which means that CrewAI is successfully prompting the model. I suspect that there are timeout issues (the model is slow), because most of the time I just get:

[DEBUG]: == [Influencer] Task output: Agent stopped due to iteration limit or time limit

(it didn't show that before, or I missed it). I set max_iter = 2, but the final result is still empty whenever I get this warning message. Is it possible to increase the timeout as well?
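For reference, this is roughly what I would try, if LangChain's request_timeout is respected end-to-end (an unverified guess on my side; parameter support may vary by version):

```python
# Unverified sketch: raise the HTTP timeout on the underlying LangChain model
# and cap agent iterations. Parameter names may vary across versions.
from langchain.chat_models import ChatOpenAI
from crewai import Agent

llm = ChatOpenAI(
    model_name="llama-2",  # assumption: whatever name the server expects
    request_timeout=600,   # seconds; generous for a slow local model
)

influencer = Agent(
    role="Influencer",
    goal="Write a single sentence about cats",
    backstory="A concise writer.",
    llm=llm,
    max_iter=2,
)
```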
That being said, in debug mode I do see an Exception popping up. While I am unsure whether this could be a result of an incorrect prompt format, it does seem like CrewAI expects a very specific response format, and due to an incorrect prompt template the model fails to provide its answer in that format. This would also align with this post.
Hello!
First of all, thanks for crewAI, this looks like an awesome project!
Context:
I am having issues connecting a CrewAI Agent with an LLM that is installed locally on a remote server through llama.cpp. I only have access to the model (Llama 2) through API requests, for example:
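A representative request (placeholder URL; the payload fields follow the llama.cpp /completions endpoint):

```python
# Representative example; the real server URL is omitted.
import requests

resp = requests.post(
    "https://<remote-server>/completions",
    json={"prompt": "Write a single sentence about cats.", "n_predict": 128},
)
print(resp.json()["content"])
```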
What (almost) seems to work at the moment:
First, I set up the remote URL in the environment as such (URL omitted):
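Something like the following, using the OpenAI-compatible environment variables (all values here are placeholders):

```python
# Placeholder values; the real URL is omitted.
import os

os.environ["OPENAI_API_BASE"] = "https://<remote-server>/v1"
os.environ["OPENAI_MODEL_NAME"] = "llama-2"
os.environ["OPENAI_API_KEY"] = "sk-no-key-required"  # dummy key for a non-OpenAI backend
```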
Second, I set up an Agent, a Task, and a Crew like this:
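(a representative sketch; the role/goal/backstory wording is illustrative, only the task text is verbatim)

```python
# Representative sketch of the Agent/Task/Crew setup described above.
from crewai import Agent, Task, Crew

influencer = Agent(
    role="Influencer",
    goal="Write engaging one-liners",
    backstory="You are a social media influencer who loves cats.",
    verbose=True,
)

task = Task(
    description="Write a single sentence about cats.",
    expected_output="A single sentence about cats.",  # required in recent crewAI versions
    agent=influencer,
)

crew = Crew(agents=[influencer], tasks=[task], verbose=True)
result = crew.kickoff()
print(result)
```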
With this setup, the script no longer tries to connect to the OpenAI models by default, as it did before.
The issue:
While I don't get any errors, it doesn't seem like it is properly connecting to the LLM, as the output does not (appear to) contain any LLM-generated response. I would expect some response related to the task "write a single sentence about cats". Here is the full console output when running the above script:
The most desirable resolution for this issue would be for the Agent to simply call my own Python function, where I can handle the LLM connection myself, including model parameters etc.
Thanks in advance for taking a look!