Maximilian-Winter / llama-cpp-agent

The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It lets users chat with LLMs, execute structured function calls, and get structured output. It also works with models that are not fine-tuned for JSON output and function calls.

Where's the RAGColbertReranker usage in llama-cpp-agent? #37

Open svjack opened 7 months ago

svjack commented 7 months ago

Thanks for this great project, which makes it easy to constrain the output format of llama-cpp. Where can I find a usage example for RAGColbertReranker? I would like to try it out to explore agent capabilities with llama-cpp-agent. 😊

Maximilian-Winter commented 7 months ago

I'm gonna add an example today, I have to add some comments first.

Maximilian-Winter commented 7 months ago

@svjack Added example in readme and in the examples folder.

svjack commented 7 months ago

> @svjack Added example in readme and in the examples folder.

I tried the Structured Output demo from the readme, replacing main_model with Qwen1.5 14B:

import llama_cpp
import llama_cpp.llama_tokenizer

main_model = llama_cpp.Llama.from_pretrained(
    repo_id="Qwen/Qwen1.5-14B-Chat-GGUF",
    filename="*q4_0.gguf",
    tokenizer=llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained("Qwen/Qwen1.5-14B"),
    verbose=False,
    n_gpu_layers=-1,
    n_ctx=3060,
)

This yields the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[22], line 4
      1 structured_output_agent = StructuredOutputAgent(llama, debug_output=True)
      3 text = """The Feynman Lectures on Physics is a physics textbook based on some lectures by Richard Feynman, a Nobel laureate who has sometimes been called "The Great Explainer". The lectures were presented before undergraduate students at the California Institute of Technology (Caltech), during 1961–1963. The book's co-authors are Feynman, Robert B. Leighton, and Matthew Sands."""
----> 4 print(structured_output_agent.create_object(Book, text))

File /environment/miniconda3/lib/python3.10/site-packages/llama_cpp_agent/structured_output_agent.py:215, in StructuredOutputAgent.create_object(self, model, data)
    204 """
    205 Creates an object of the given model from the given data.
    206 
   (...)
    212     object: The created object.
    213 """
    214 if model not in self.grammar_cache:
--> 215     grammar, documentation = generate_gbnf_grammar_and_documentation(
    216         [model],
    217         model_prefix="Response Model",
    218         fields_prefix="Response Model Field",
    219     )
    221     self.grammar_cache[model] = grammar, documentation
    222 else:

File /environment/miniconda3/lib/python3.10/site-packages/llama_cpp_agent/gbnf_grammar_generator/gbnf_grammar_from_pydantic_models.py:1451, in generate_gbnf_grammar_and_documentation(pydantic_model_list, outer_object_name, outer_object_content, model_prefix, fields_prefix, list_of_outputs, documentation_with_field_description, add_inner_thoughts, allow_only_inner_thoughts, inner_thoughts_field_name, add_request_heartbeat, request_heartbeat_field_name, request_heartbeat_models)
   1415 def generate_gbnf_grammar_and_documentation(
   1416     pydantic_model_list,
   1417     outer_object_name: str | None = None,
   (...)
   1428     request_heartbeat_models: List[str] = None,
   1429 ):
   1430     """
   1431     Generate GBNF grammar and documentation for a list of Pydantic models.
   1432 
   (...)
   1449         tuple: GBNF grammar string, documentation string.
   1450     """
-> 1451     documentation = generate_text_documentation(
   1452         copy(pydantic_model_list),
   1453         model_prefix,
   1454         fields_prefix,
   1455         documentation_with_field_description=documentation_with_field_description,
   1456     )
   1457     grammar = generate_gbnf_grammar_from_pydantic_models(
   1458         pydantic_model_list,
   1459         outer_object_name,
   (...)
   1467         request_heartbeat_models,
   1468     )
   1469     grammar = remove_empty_lines(grammar + get_primitive_grammar(grammar))

File /environment/miniconda3/lib/python3.10/site-packages/llama_cpp_agent/gbnf_grammar_generator/gbnf_grammar_from_pydantic_models.py:1116, in generate_text_documentation(pydantic_models, model_prefix, fields_prefix, documentation_with_field_description)
   1112             if isclass(element_type) and issubclass(
   1113                 element_type, BaseModel
   1114             ):
   1115                 pyd_models.append((element_type, False))
-> 1116 if isclass(field_type) and issubclass(field_type, BaseModel):
   1117     pyd_models.append((field_type, False))
   1118 documentation += generate_field_text(
   1119     name,
   1120     field_type,
   1121     model,
   1122     documentation_with_field_description=documentation_with_field_description,
   1123 )

File /environment/miniconda3/lib/python3.10/abc.py:123, in ABCMeta.__subclasscheck__(cls, subclass)
    121 def __subclasscheck__(cls, subclass):
    122     """Override for issubclass(subclass, cls)."""
--> 123     return _abc_subclasscheck(cls, subclass)

TypeError: issubclass() arg 1 must be a class

But when I run the example in https://github.com/Maximilian-Winter/llama-cpp-agent/blob/master/examples/02_Structured_Output/book_dataset_creation.py

I get the expected output. Does this mean the readme should be updated?

Maximilian-Winter commented 7 months ago

@svjack I can't reproduce the error. But could you send me your complete code that causes the error?

svjack commented 7 months ago

> @svjack I can't reproduce the error. But could you send me your complete code that causes the error?

I am on Python 3.10.12.

Installed with:

pip install llama-cpp-agent
pip install transformers
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir

Source code:

import llama_cpp
import llama_cpp.llama_tokenizer

llama = llama_cpp.Llama.from_pretrained(
    repo_id="Qwen/Qwen1.5-14B-Chat-GGUF",
    filename="*q4_0.gguf",
    tokenizer=llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained("Qwen/Qwen1.5-14B"),
    verbose=False,
    n_gpu_layers=-1,
    n_ctx=3060,
)

from enum import Enum

from llama_cpp import Llama
from pydantic import BaseModel, Field

from llama_cpp_agent.structured_output_agent import StructuredOutputAgent

# Example enum for our output model
class Category(Enum):
    Fiction = "Fiction"
    NonFiction = "Non-Fiction"

# Example output model
class Book(BaseModel):
    """
    Represents an entry about a book.
    """
    title: str = Field(..., description="Title of the book.")
    author: str = Field(..., description="Author of the book.")
    published_year: int = Field(..., description="Publishing year of the book.")
    keywords: list[str] = Field(..., description="A list of keywords.")
    category: Category = Field(..., description="Category of the book.")
    summary: str = Field(..., description="Summary of the book.")

structured_output_agent = StructuredOutputAgent(llama, debug_output=True)

text = """The Feynman Lectures on Physics is a physics textbook based on some lectures by Richard Feynman, a Nobel laureate who has sometimes been called "The Great Explainer". The lectures were presented before undergraduate students at the California Institute of Technology (Caltech), during 1961–1963. The book's co-authors are Feynman, Robert B. Leighton, and Matthew Sands."""
print(structured_output_agent.create_object(Book, text))

structured_output_agent.create_object yields the above error. 🤔
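
A possible explanation, not confirmed anywhere in this thread: the traceback fails at `isclass(field_type) and issubclass(field_type, BaseModel)`, and the `Book` model above has a `keywords: list[str]` field. As far as I can tell, on Python 3.9 and 3.10 `inspect.isclass(list[str])` returns True while `issubclass(list[str], ...)` raises this exact TypeError, and Python 3.11 changed `isclass` to return False for such types. That would also fit the error appearing on 3.10.12 but not for everyone. The sketch below shows a defensive check that behaves the same across versions. `is_model_class` is a hypothetical helper, and the local `BaseModel` is a stand-in for `pydantic.BaseModel` so the snippet runs without third-party packages; neither is taken from llama-cpp-agent.

```python
from inspect import isclass
from typing import get_origin


class BaseModel:
    """Stand-in for pydantic.BaseModel (assumption, keeps the sketch stdlib-only)."""


class Book(BaseModel):
    """Minimal model class used only to exercise the check."""


def is_model_class(field_type, base=BaseModel):
    """Return True only if field_type is a plain class derived from base."""
    # Parameterized generics such as list[str] have a non-None origin.
    # Rejecting them first avoids calling issubclass() on a non-class.
    if get_origin(field_type) is not None:
        return False
    return isclass(field_type) and issubclass(field_type, base)


print(is_model_class(Book))       # True
print(is_model_class(list[str]))  # False, and no TypeError on any version
print(is_model_class(str))        # False
```

If this is indeed the cause, trying the same script on Python 3.11 or newer, or temporarily removing the `list[str]` field, would be a quick way to check.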

Maximilian-Winter commented 7 months ago

@svjack I tried your code and it worked for me, but it led to an error in llama-cpp-python after creating the object. I have to investigate that.

GTJoey commented 6 months ago

When I run the structured output demo, the same problem happens! My Anaconda env: Python 3.10.12

Package               Version
--------------------- ---------
aiohttp               3.9.5
aiosignal             1.3.1
annotated-types       0.6.0
async-timeout         4.0.3
attrs                 23.2.0
build                 1.2.1
CacheControl          0.14.0
certifi               2024.2.2
charset-normalizer    3.3.2
cleo                  2.1.0
colorama              0.4.6
crashtest             0.4.1
diskcache             5.6.3
distlib               0.3.8
docstring_parser      0.16
dulwich               0.21.7
fastjsonschema        2.19.1
filelock              3.13.4
frozenlist            1.4.1
idna                  3.7
importlib_metadata    7.1.0
installer             0.7.0
jaraco.classes        3.4.0
Jinja2                3.1.4
keyring               24.3.1
llama-cpp-agent       0.1.4
llama_cpp_python      0.2.70
MarkupSafe            2.1.5
more-itertools        10.2.0
msgpack               1.0.8
multidict             6.0.5
numpy                 1.26.4
packaging             24.0
pexpect               4.9.0
pip                   23.3.1
pkginfo               1.10.0
platformdirs          4.2.1
poetry                1.8.2
poetry-core           1.9.0
poetry-plugin-export  1.7.1
ptyprocess            0.7.0
pydantic              2.7.1
pydantic_core         2.18.2
pyproject_hooks       1.0.0
pywin32-ctypes        0.2.2
rapidfuzz             3.8.1
requests              2.31.0
requests-toolbelt     1.0.0
setuptools            68.2.2
shellingham           1.5.4
tomli                 2.0.1
tomlkit               0.12.4
trove-classifiers     2024.4.10
typing_extensions     4.11.0
urllib3               2.2.1
virtualenv            20.26.0
wheel                 0.41.2
yarl                  1.9.4
zipp                  3.18.1