neuml / txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
https://neuml.github.io/txtai
Apache License 2.0

60th example error with litellm LLM #702


yiouyou commented 2 months ago

If llama3 from Ollama is running at http://8.140.18.**:28275, the following code from the 60th example runs fine.

```python
from txtai.pipeline import LLM

# Connect to the Ollama-hosted llama3 model through the litellm backend
llm = LLM("ollama/llama3", method="litellm", api_base="http://8.140.18.**:28275")

def rag(question, text):
    prompt = f"""### system
You are a friendly assistant. You answer questions from users.

### user
Answer the following question using only the context below. Only include information specifically discussed.
question: {question}
context: {text}

### assistant
"""
    return llm(prompt, maxlength=4096)

context = """
England's terrain chiefly consists of low hills and plains, especially in the centre and south.
The Battle of Hastings was fought on 14 October 1066 between the Norman army of William, the Duke of Normandy, and an English army under the Anglo-Saxon King Harold Godwinson
Bounded by the Atlantic Ocean on the east, Brazil has a coastline of 7,491 kilometers (4,655 mi).
Spain pioneered the exploration of the New World and the first circumnavigation of the globe.
Christopher Columbus lands in the Caribbean in 1492.
"""
print(rag("List the countries discussed", context))
```

However, when running the following code:

```python
from typing import List

from outlines.integrations.transformers import JSONPrefixAllowedTokens
from pydantic import BaseModel

# Continues from the cell above: llm is the litellm-backed pipeline
class Response(BaseModel):
    answers: List[str]
    citations: List[str]

# Define method that guides LLM generation
prefix_allowed_tokens_fn=JSONPrefixAllowedTokens(
    schema=Response,
    tokenizer_or_pipe=llm.generator.llm.pipeline.tokenizer,
    whitespace_pattern=r" ?"
)

def rag(question, text):
    prompt = f"""### system
You are a friendly assistant. You answer questions from users.

### user
Answer the following question using only the context below. Only include information specifically discussed.
question: {question}
context: {text}

### assistant
"""
    return llm(prompt, maxlength=4096, prefix_allowed_tokens_fn=prefix_allowed_tokens_fn)
```

it raises the following error, caused by the line `tokenizer_or_pipe=llm.generator.llm.pipeline.tokenizer,`:

```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[11], line 13
      8     citations: List[str]
     10 # Define method that guides LLM generation
     11 prefix_allowed_tokens_fn=JSONPrefixAllowedTokens(
     12     schema=Response,
---> 13     tokenizer_or_pipe=llm.generator.llm.pipeline.tokenizer,
     14     whitespace_pattern=r" ?"
     15 )
     17 def rag(question, text):
     18     prompt = f"""### system
     19 You are a friendly assistant. You answer questions from users.
     20 
   (...)
     26 ### assistant
     27 """

AttributeError: 'LiteLLM' object has no attribute 'llm'
```
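For reference, a quick inspection (a minimal sketch built from the traceback above, not a documented txtai API) confirms which backend the pipeline wrapped:

```python
# Sketch: check which generation backend txtai selected.
# With method="litellm", llm.generator is a LiteLLM API wrapper, so the
# .llm.pipeline.tokenizer chain from the notebook cannot resolve on it.
print(type(llm.generator).__name__)   # 'LiteLLM' in this setup
print(hasattr(llm.generator, "llm"))  # False, hence the AttributeError
```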

How can I fix this issue?

Thanks

davidmezzetti commented 2 months ago

I would have to research this more and review the outlines code. The outlines integration in that notebook only works with Transformers-based models.
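As a workaround sketch, assuming the attribute layout shown in the traceback (the `get_tokenizer` helper, the `hasattr` guard and the error message are illustrative, not txtai APIs): `JSONPrefixAllowedTokens` needs a local tokenizer, which only the Transformers backend exposes, so the integration can be gated on the backend type:

```python
from txtai.pipeline import LLM

# Hypothetical guard: only build the outlines integration when txtai wrapped
# a local Transformers pipeline. API backends such as LiteLLM provide no
# token-level hooks, so guided generation cannot attach to them.
def get_tokenizer(llm):
    generation = llm.generator
    if hasattr(generation, "llm"):  # Transformers backend wraps an HF pipeline
        return generation.llm.pipeline.tokenizer
    raise RuntimeError(
        f"{type(generation).__name__} is an API-based backend; "
        "outlines guided generation requires a local Transformers model"
    )
```

With a local Transformers model, `get_tokenizer` returns the tokenizer to pass as `tokenizer_or_pipe`; with the litellm/Ollama setup above it fails fast with a clear message instead of the `AttributeError`.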