Maximilian-Winter / llama-cpp-agent

The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It allows users to chat with LLM models, execute structured function calls, and get structured output. It also works with models not fine-tuned for JSON output and function calls.

Multiple models context management like Ollama. #40

Closed svjack closed 2 months ago

svjack commented 2 months ago

With the help of llama-cpp-agent, I can use the function-calling and JSON-schema abilities of a llama model nearly perfectly. 😊 Suppose I want to use a code LLM like codellama to generate function tools, use hermes-2-pro-mistral-7b to call them as https://github.com/Maximilian-Winter/llama-cpp-agent/blob/master/examples/05_Agents/hermes_2_pro_agent.py does, and perhaps run another LLM via llama-cpp-python for other tasks. If I only have limited GPU memory, what troubles me is the lack of model-switching ability in llama-cpp-python, as discussed in https://github.com/abetlen/llama-cpp-python/issues/223.

Automatic model switching and GPU memory management are already handled by Ollama, but it lacks convenient function tools and JSON-schema output.

Alternatively, you could add model-switching ability to llama-cpp-agent, as https://github.com/abetlen/llama-cpp-python/issues/736 and https://github.com/abetlen/llama-cpp-python/issues/302 suggest.
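Until something like this lands in the library, a minimal sketch of the "switch" pattern is to keep at most one model resident and drop it before loading the next. The `ModelSwitcher` class and its `loaders` dict below are hypothetical, not part of llama-cpp-agent or llama-cpp-python; the sketch also assumes that releasing all references to a llama-cpp-python `Llama` object (and running garbage collection) frees its GPU memory.

```python
import gc


class ModelSwitcher:
    """Keep at most one model loaded; swap on demand.

    `loaders` maps a model name to a zero-argument callable that loads
    and returns the model. With llama-cpp-python this might be e.g.
    lambda: Llama(model_path="hermes-2-pro-mistral-7b.gguf", n_gpu_layers=-1)
    (hypothetical usage, not verified against the library).
    """

    def __init__(self, loaders):
        self.loaders = loaders
        self.current_name = None
        self.current_model = None

    def get(self, name):
        if name == self.current_name:
            return self.current_model
        # Drop the old model first so its memory can be reclaimed
        # before the next model is loaded.
        self.current_model = None
        gc.collect()
        self.current_model = self.loaders[name]()
        self.current_name = name
        return self.current_model
```

This trades latency for memory: every switch pays a full model reload, unlike Ollama, which also keeps an idle timeout and LRU-style management. Whether GPU memory is actually returned promptly depends on llama-cpp-python's cleanup behavior, so this is only a starting point, not a drop-in solution.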

How can I tackle this? Looking forward to your reply. 😊