run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Bug]: Troubleshooting Issues in LongLLMLingua RAG Demo after Updating to Version 0.10 #13617

Closed 190679163 closed 5 months ago

190679163 commented 5 months ago

Bug Description

The LlamaIndex RAG demo is no longer functioning properly due to significant changes in library calls after updating llama-index to version 0.10. Could you help me troubleshoot where the problem might be? Thank you.


# Install dependency.
!pip install llmlingua llama-index llama-index-embeddings-huggingface llama-index-embeddings-instructor llama-index-llms-openai llama-index-llms-openai-like llama-index-readers-file pymupdf llama-index-retrievers-bm25 transformers

!wget "https://www.dropbox.com/s/f6bmb19xdg0xedm/paul_graham_essay.txt?dl=1" -O paul_graham_essay.txt

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings
from llama_index.llms.openai_like import OpenAILike
# Setup LLMLingua
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.response_synthesizers import CompactAndRefine
from llama_index.legacy.postprocessor.longllmlingua import *
from llama_index.core import QueryBundle
from llama_index.llms.openai import OpenAI
import os
import openai

# load documents
documents = SimpleDirectoryReader(input_files=["paul_graham_essay.txt"]).load_data()

Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)

index = VectorStoreIndex.from_documents(documents)

# question = "What did the author do growing up?"
# question = "What did the author do during his time in YC?"
question = "Where did the author go for art school?"

retriever = index.as_retriever(similarity_top_k=10)

# Ground-truth Answer
answer = "RISD"

contexts = retriever.retrieve(question)

context_list = [n.get_content() for n in contexts]
len(context_list)
OPENAI_API_BASE = "https://openai like api/v1"
OPENAI_API_KEY = "sk-your key"

openai.api_base = OPENAI_API_BASE
openai.api_key = OPENAI_API_KEY

llm = OpenAILike(model="gpt-3.5-turbo-0125", api_base=OPENAI_API_BASE, api_key=OPENAI_API_KEY, is_chat_model=True)
llm2 = OpenAILike(model="gpt-3.5-turbo-instruct", api_base=OPENAI_API_BASE, api_key=OPENAI_API_KEY)

prompt = "\n\n".join(context_list + [question])
response = llm.complete(prompt)
print(str(response))

node_postprocessor = LongLLMLinguaPostprocessor(
    instruction_str="Given the context, please answer the final question",
    target_token=400,
    rank_method="longllmlingua",
    additional_compress_kwargs={
        "condition_compare": True,
        "condition_in_question": "after",
        "context_budget": "+100",
        "reorder_context": "sort",  # enable document reorder,
        "dynamic_context_compression_ratio": 0.3,
    },
)
Settings.llm = llm2
retrieved_nodes = retriever.retrieve(question)
synthesizer = CompactAndRefine()

# outline steps in RetrieverQueryEngine for clarity:
# postprocess (compress), synthesize
new_retrieved_nodes = node_postprocessor.postprocess_nodes(
    retrieved_nodes, query_bundle=QueryBundle(query_str=question)
)

original_contexts = "\n\n".join([n.get_content() for n in retrieved_nodes])
compressed_contexts = "\n\n".join([n.get_content() for n in new_retrieved_nodes])

original_tokens = node_postprocessor._llm_lingua.get_token_length(original_contexts)
compressed_tokens = node_postprocessor._llm_lingua.get_token_length(compressed_contexts)

print(compressed_contexts)
print()
print("Original Tokens:", original_tokens)
print("Compressed Tokens:", compressed_tokens)
print("Compressed Ratio:", f"{original_tokens/(compressed_tokens + 1e-5):.2f}x")

The code above runs fine; the errors start below.


response = synthesizer.synthesize(question, new_retrieved_nodes)
print(str(response))

retriever_query_engine = RetrieverQueryEngine.from_args(
    retriever, node_postprocessors=[node_postprocessor]
)

response = retriever_query_engine.query(question)

Version

stable

Steps to Reproduce

It runs fine until the last part, where it throws an error. I'm running it on Colab.

Relevant Logs/Tracebacks

---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
<ipython-input-9-beee427193a1> in <cell line: 5>()
      3 )
      4 
----> 5 response = retriever_query_engine.query(question)

6 frames
/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py in __init__(__pydantic_self__, **data)
    339         values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
    340         if validation_error:
--> 341             raise validation_error
    342         try:
    343             object_setattr(__pydantic_self__, '__dict__', values)

ValidationError: 6 validation errors for SynthesizeEndEvent
response -> source_nodes -> 0 -> node
  Can't instantiate abstract class BaseNode with abstract methods get_content, get_metadata_str, get_type, hash, set_content (type=type_error)
response -> source_nodes -> 1 -> node
  Can't instantiate abstract class BaseNode with abstract methods get_content, get_metadata_str, get_type, hash, set_content (type=type_error)
response -> source_nodes -> 2 -> node
  Can't instantiate abstract class BaseNode with abstract methods get_content, get_metadata_str, get_type, hash, set_content (type=type_error)
response
  instance of StreamingResponse, tuple or dict expected (type=type_error.dataclass; class_name=StreamingResponse)
response
  instance of AsyncStreamingResponse, tuple or dict expected (type=type_error.dataclass; class_name=AsyncStreamingResponse)
response
  instance of PydanticResponse, tuple or dict expected (type=type_error.dataclass; class_name=PydanticResponse)
dosubot[bot] commented 5 months ago

The ValidationError you're encountering suggests that the BaseNode class is being instantiated directly, which is not allowed because it contains abstract methods that need to be implemented by a subclass. Additionally, the synthesize function and the query method in the RetrieverQueryEngine class now expect specific response types.

Here are the steps to resolve the issue:

  1. Ensure Concrete Implementation of BaseNode: Make sure you are using a concrete subclass of BaseNode that implements all the required methods.

  2. Update the synthesize Function: Ensure that the synthesize function returns the correct response types, such as NodeWithScore.

  3. Update the RetrieverQueryEngine Usage: Ensure that the RetrieverQueryEngine is correctly instantiated and used.

Here is the updated code snippet with the necessary changes:

# (Setup unchanged from the original script above: dependency installs,
# imports, document loading, retriever, OpenAI-like LLM configuration,
# LongLLMLingua postprocessor, and the token-count comparison.)

# Ensure the synthesizer returns the correct type
from llama_index.core.schema import NodeWithScore, TextNode

# Example of creating a NodeWithScore from a concrete node subclass
# (BaseNode itself is abstract and cannot be instantiated)
node = TextNode(text="example text")
node_with_score = NodeWithScore(node=node, score=1.0)

response = synthesizer.synthesize(question, [node_with_score])
print(str(response))

retriever_query_engine = RetrieverQueryEngine.from_args(
    retriever, node_postprocessors=[node_postprocessor]
)

query_bundle = QueryBundle(query_str=question)
response = retriever_query_engine.query(query_bundle)
print(str(response))

This should resolve the ValidationError by ensuring that the correct types are used and that the BaseNode class is not instantiated directly.

#### Details

**Sources**

- [llama-index-core/llama_index/core/schema.py](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/schema.py)
- [CHANGELOG.md](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [docs/docs/module_guides/querying/response_synthesizers/index.md](https://github.com/run-llama/llama_index/blob/main/docs/docs/module_guides/querying/response_synthesizers/index.md)
- [llama-index-core/llama_index/core/query_engine/retriever_query_engine.py](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/query_engine/retriever_query_engine.py)
- [llama-index-legacy/llama_index/legacy/query_engine/retriever_query_engine.py](https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/query_engine/retriever_query_engine.py)
- [llama-index-core/llama_index/core/query_engine/router_query_engine.py](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/query_engine/router_query_engine.py)

logan-markewich commented 5 months ago

You can't mix legacy and non-legacy imports

Either they are all legacy, or none are legacy.

To remove the one legacy import:

pip install llama-index-postprocessor-longllmlingua

from llama_index.postprocessor.longllmlingua import ...
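
For reference, here is a minimal sketch of the non-legacy wiring, assuming the dedicated package exposes the same LongLLMLinguaPostprocessor class and constructor arguments used in the demo above:

# Install the dedicated postprocessor package (replaces the legacy import).
!pip install llama-index-postprocessor-longllmlingua

from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.postprocessor.longllmlingua import LongLLMLinguaPostprocessor

# Same constructor arguments as in the original report (assumed unchanged
# in the non-legacy package).
node_postprocessor = LongLLMLinguaPostprocessor(
    instruction_str="Given the context, please answer the final question",
    target_token=400,
    rank_method="longllmlingua",
)

# Wire the compressor into the query engine; `retriever` is the vector
# index retriever built earlier in the script.
retriever_query_engine = RetrieverQueryEngine.from_args(
    retriever, node_postprocessors=[node_postprocessor]
)
response = retriever_query_engine.query(question)
print(str(response))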