Open fahdmirza opened 1 month ago
@fahdmirza I will look into this.
Thank you. I am doing a review of this for my channel https://www.youtube.com/@fahdmirza , as it looks very promising.
Hello @fahdmirza, for this did you install llama-cpp-python with CUDA? We have tested it now on an A100 on HF Spaces and it works nicely, so I don't understand your problem.
In the code you shared here you are using a Mistral model with the CHATML message format; I think you have to use MISTRAL for the chat template.
Nevertheless, check this: we have some examples all set up on HF Spaces if you want to make a review for your YouTube channel, hehe https://huggingface.co/poscye
This should work for you; always use the correct chat template for the model:
from llama_cpp import Llama
from llama_cpp_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType
from llama_cpp_agent.providers import LlamaCppPythonProvider
llama_model = Llama(r"mistral-7b-instruct-v0.2.Q5_K_S.gguf", n_batch=1024, n_threads=10, n_gpu_layers=40)
provider = LlamaCppPythonProvider(llama_model)
agent = LlamaCppAgent(provider, system_prompt="You are a helpful assistant.", predefined_messages_formatter_type=MessagesFormatterType.MISTRAL)
agent_output = agent.get_chat_response("Hello, World!")
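For context on why the template matters: ChatML and the Mistral instruct format wrap messages in different control tokens, and a model fine-tuned on one format often never emits the stop sequence the other expects, so generation can appear to hang. A rough illustration of the two formats (the token strings below are the commonly documented ones, not taken from llama-cpp-agent's internals):

```python
def chatml_prompt(system: str, user: str) -> str:
    # ChatML wraps every turn in <|im_start|>role ... <|im_end|> markers
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

def mistral_prompt(system: str, user: str) -> str:
    # Mistral Instruct expects [INST] ... [/INST]; the system prompt is
    # conventionally folded into the first user turn
    return f"<s>[INST] {system}\n{user} [/INST]"

print(chatml_prompt("You are a helpful assistant.", "Hello, World!"))
print(mistral_prompt("You are a helpful assistant.", "Hello, World!"))
```

A Mistral-Instruct model prompted with the ChatML wrapper has no reason to produce `<|im_end|>`, which is one plausible way an agent loop ends up waiting indefinitely.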
Hi, on Ubuntu 22.04 it just gets stuck generating output for hours:
# Import the Llama class of llama-cpp-python and the LlamaCppPythonProvider of llama-cpp-agent
from llama_cpp import Llama
from llama_cpp_agent.providers import LlamaCppPythonProvider

# Create an instance of the Llama class and load the model
llama_model = Llama(r"mistral-7b-instruct-v0.2.Q5_K_S.gguf", n_batch=1024, n_threads=10, n_gpu_layers=40)

# Create the provider by passing the Llama class instance to the LlamaCppPythonProvider class
provider = LlamaCppPythonProvider(llama_model)

from llama_cpp_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType

agent = LlamaCppAgent(provider, system_prompt="You are a helpful assistant.", predefined_messages_formatter_type=MessagesFormatterType.CHATML)
agent_output = agent.get_chat_response("Hello, World!")
It gets stuck here...
I have an NVIDIA A6000 GPU and plenty of memory. I have also tried installing llama.cpp from source, but the issue is the same. Any ideas?
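If switching to the MISTRAL formatter does not resolve the hang, it may also be worth confirming that llama-cpp-python itself was built with CUDA, as suggested above; a CPU-only wheel will still load the model but can be slow enough to look stuck. A typical forced rebuild looks roughly like this (the CMake flag name has varied across llama-cpp-python releases, so treat it as an assumption to check against the project's README):

```shell
# Reinstall llama-cpp-python from source with CUDA enabled
# (flag name assumed; older releases used -DLLAMA_CUBLAS=on instead)
CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
```

After reinstalling, the startup log printed by `Llama(...)` should report layers being offloaded to the GPU.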