neuml / txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
https://neuml.github.io/txtai
Apache License 2.0

60th example error with litellm LLM #702


yiouyou commented 2 months ago

If llama3 from Ollama is running at http://8.140.18.**:28275, the following code from the 60th example runs fine.

```python
from txtai.pipeline import LLM

# Connect to the Ollama-hosted llama3 model through the litellm backend
llm = LLM("ollama/llama3", method="litellm", api_base="http://8.140.18.**:28275")

def rag(question, text):
    prompt = f"""### system
You are a friendly assistant. You answer questions from users.

### user
Answer the following question using only the context below. Only include information specifically discussed.
question: {question}
context: {text}

### assistant
"""
    return llm(prompt, maxlength=4096)

context = """
England's terrain chiefly consists of low hills and plains, especially in the centre and south.
The Battle of Hastings was fought on 14 October 1066 between the Norman army of William, the Duke of Normandy, and an English army under the Anglo-Saxon King Harold Godwinson
Bounded by the Atlantic Ocean on the east, Brazil has a coastline of 7,491 kilometers (4,655 mi).
Spain pioneered the exploration of the New World and the first circumnavigation of the globe.
Christopher Columbus lands in the Caribbean in 1492.
"""
print(rag("List the countries discussed", context))
```

However, when running the following code:

```python
from typing import List

from outlines.integrations.transformers import JSONPrefixAllowedTokens
from pydantic import BaseModel

# Continues from the cell above: llm is the litellm-backed pipeline
class Response(BaseModel):
    answers: List[str]
    citations: List[str]

# Define method that guides LLM generation
prefix_allowed_tokens_fn=JSONPrefixAllowedTokens(
    schema=Response,
    tokenizer_or_pipe=llm.generator.llm.pipeline.tokenizer,
    whitespace_pattern=r" ?"
)

def rag(question, text):
    prompt = f"""### system
You are a friendly assistant. You answer questions from users.

### user
Answer the following question using only the context below. Only include information specifically discussed.
question: {question}
context: {text}

### assistant
"""
    return llm(prompt, maxlength=4096, prefix_allowed_tokens_fn=prefix_allowed_tokens_fn)
```

it raises the following error, caused by the line `tokenizer_or_pipe=llm.generator.llm.pipeline.tokenizer,`:

```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[11], line 13
      8     citations: List[str]
     10 # Define method that guides LLM generation
     11 prefix_allowed_tokens_fn=JSONPrefixAllowedTokens(
     12     schema=Response,
---> 13     tokenizer_or_pipe=llm.generator.llm.pipeline.tokenizer,
     14     whitespace_pattern=r" ?"
     15 )
     17 def rag(question, text):
     18     prompt = f"""### system
     19 You are a friendly assistant. You answer questions from users.
     20 
   (...)
     26 ### assistant
     27 """

AttributeError: 'LiteLLM' object has no attribute 'llm'
```
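For reference, a quick inspection (a minimal sketch built from the traceback above, not a documented txtai API) confirms which backend the pipeline wrapped:

```python
# Sketch: check which generation backend txtai selected.
# With method="litellm", llm.generator is a LiteLLM API wrapper, so the
# .llm.pipeline.tokenizer chain from the notebook cannot resolve on it.
print(type(llm.generator).__name__)   # 'LiteLLM' in this setup
print(hasattr(llm.generator, "llm"))  # False, hence the AttributeError
```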

How can I fix this issue?

Thanks

davidmezzetti commented 2 months ago

I would have to research this more and review the outlines code. The outlines integration in that notebook only works with Transformers-based models.
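As a workaround sketch, assuming the attribute layout shown in the traceback (the `get_tokenizer` helper, the `hasattr` guard and the error message are illustrative, not txtai APIs): `JSONPrefixAllowedTokens` needs a local tokenizer, which only the Transformers backend exposes, so the integration can be gated on the backend type:

```python
from txtai.pipeline import LLM

# Hypothetical guard: only build the outlines integration when txtai wrapped
# a local Transformers pipeline. API backends such as LiteLLM provide no
# token-level hooks, so guided generation cannot attach to them.
def get_tokenizer(llm):
    generation = llm.generator
    if hasattr(generation, "llm"):  # Transformers backend wraps an HF pipeline
        return generation.llm.pipeline.tokenizer
    raise RuntimeError(
        f"{type(generation).__name__} is an API-based backend; "
        "outlines guided generation requires a local Transformers model"
    )
```

With a local Transformers model, `get_tokenizer` returns the tokenizer to pass as `tokenizer_or_pipe`; with the litellm/Ollama setup above it fails fast with a clear message instead of the `AttributeError`.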