The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It allows users to chat with LLMs, execute structured function calls, and get structured output. It also works with models that are not fine-tuned for JSON output and function calling.
Multi-model context management like Ollama #40
With the help of llama-cpp-agent, I can use the function-calling and JSON-schema abilities of a single llama model nearly perfectly. 😊 Now I would like to use a code LLM such as codellama to generate function tools, use hermes-2-pro-mistral-7b to call them as https://github.com/Maximilian-Winter/llama-cpp-agent/blob/master/examples/05_Agents/hermes_2_pro_agent.py does, and possibly run yet another LLM via llama-cpp-python for other tasks. Since I only have limited GPU memory, what blocks me is the lack of model-switching ability in llama-cpp-python, as also discussed in https://github.com/abetlen/llama-cpp-python/issues/223.
Automatic model switching and GPU memory management are already handled by Ollama, but it lacks convenient function tools and JSON-schema output.
Alternatively, model-switching ability could be added to llama-cpp-agent itself, as https://github.com/abetlen/llama-cpp-python/issues/736 and https://github.com/abetlen/llama-cpp-python/issues/302 suggest.
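To make the request concrete, the kind of switching I have in mind could be sketched as a small wrapper that keeps at most one model resident and unloads it before loading the next. This is only a sketch, not part of either library: the `loader` callable is an assumption standing in for however the model is actually constructed (e.g. `lambda path: llama_cpp.Llama(model_path=path, n_gpu_layers=-1)`), and freeing GPU memory here relies on dropping the last reference so the backend releases it.

```python
import gc
from typing import Any, Callable, Optional


class ModelSwitcher:
    """Keep at most one model loaded; swap models on demand.

    A minimal sketch of Ollama-style auto-switching on top of
    llama-cpp-python. `loader` is assumed to build the model from a
    path, e.g.:

        ModelSwitcher(lambda p: Llama(model_path=p, n_gpu_layers=-1))
    """

    def __init__(self, loader: Callable[[str], Any]):
        self._loader = loader
        self._current_path: Optional[str] = None
        self._model: Any = None

    def get(self, path: str) -> Any:
        """Return the model for `path`, unloading any other model first."""
        if path != self._current_path:
            # Drop the old model before allocating the new one, so the
            # GPU memory it held can be reclaimed by the backend.
            self._model = None
            gc.collect()
            self._model = self._loader(path)
            self._current_path = path
        return self._model
```

With something like this, one agent could call `switcher.get("codellama.gguf")` to generate tools and `switcher.get("hermes-2-pro-mistral-7b.gguf")` to execute them, at the cost of a reload on every switch.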
How can I tackle this? Looking forward to your reply. 😊