run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: Can chat engine be enabled for graph index? #8047

Closed · sumitsahoo closed this issue 1 year ago

sumitsahoo commented 1 year ago

Question

I have gone through the documentation, but my use case requires a graph index. In short, I have multiple sources of information, i.e. a database, a file, and a directory. Hence I created a ListIndex for each and combined them into a graph index. The issue is that the graph index does not have a chat_engine option, i.e. index.as_chat_engine(...); it only has a query_engine.

I need the chat_engine option because I want to retain history and query against it, which acts like memory; the same is not available for query_engine. Example below:

stream_response = chat_engine.stream_chat(message, chat_history=message_history)

Here message_history is an array of ChatMessage.

Any recommendations, please? What is the best way to build an index from different sources and use a chat_engine that has memory as well?

dosubot[bot] commented 1 year ago

🤖

Hello,

Thank you for your detailed question. Based on your use case, it seems like you want to use a chat_engine with a Graph Index in LlamaIndex, and you have multiple sources of information. You also want to retain history and query from history.

To achieve this, you can use the ContextChatEngine class provided in the llama_index.chat_engine.context module. This class uses a retriever to retrieve a context, sets the context in the system prompt, and then uses an LLM to generate a response, for a fluid chat experience.

Here is an example of how you can use the ContextChatEngine:

from llama_index.chat_engine.context import ContextChatEngine
from llama_index.indices.knowledge_graph.retrievers import KnowledgeGraphRAGRetriever
from llama_index.indices.knowledge_graph.base import KnowledgeGraphIndex
from llama_index.llms.base import LLM
from llama_index.memory import BaseMemory
from llama_index.llm_predictor.base import LLMPredictor
from llama_index.indices.service_context import ServiceContext

# Initialize your retriever, LLM, and memory
index = KnowledgeGraphIndex()
retriever = KnowledgeGraphRAGRetriever(index)
llm = LLM()
memory = BaseMemory()

# Initialize your service context
service_context = ServiceContext.from_defaults()

# Create an instance of ContextChatEngine
context_chat_engine = ContextChatEngine.from_defaults(
    retriever=retriever,
    service_context=service_context,
    memory=memory
)

# Use the chat method to interact with the chat engine
response = context_chat_engine.chat("Hello, how are you?")

In this example, we create an instance of ContextChatEngine with the desired retriever, LLM, and memory. We then use the chat method to interact with the chat engine. The chat method takes a message and an optional chat history as arguments, and returns an AgentChatResponse.
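
For the chat_history part specifically, here is a minimal sketch (it assumes a working chat engine instance such as context_chat_engine above, and the prior turns are placeholders):

from llama_index.llms import ChatMessage, MessageRole

# Hypothetical earlier turns, mirroring the message_history from the question
message_history = [
    ChatMessage(role=MessageRole.USER, content="Which sources are indexed?"),
    ChatMessage(role=MessageRole.ASSISTANT, content="A database, a file, and a directory."),
]

# Passing chat_history explicitly seeds the engine's memory for this call
response = context_chat_engine.chat("Summarize the database source.", chat_history=message_history)
print(response.response)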

Please note that this is a simplified example; KnowledgeGraphIndex, KnowledgeGraphRAGRetriever, LLM, and BaseMemory are placeholders here, and you would need to supply concrete implementations based on your specific requirements. The KnowledgeGraphRAGRetriever is used to retrieve information from a knowledge graph index, which is useful when you have multiple sources of information and need to retain and query chat history.

You can find more information about the ContextChatEngine class in the llama_index/chat_engine/context.py file in the LlamaIndex repository.

I hope this helps! If you have any further questions, please feel free to ask.

Best, Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

logan-markewich commented 1 year ago

The bot is almost correct

You can create an agent (the default chat engine) with your graph query engine https://gpt-index.readthedocs.io/en/stable/examples/agent/openai_agent_with_query_engine.html
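
For reference, a minimal sketch of that approach (assumptions: graph is the ComposableGraph from the question, an OpenAI API key is configured, and the tool name and description are placeholders):

from llama_index.agent import OpenAIAgent
from llama_index.tools import QueryEngineTool, ToolMetadata

# Wrap the graph's query engine as a tool the agent can call
graph_tool = QueryEngineTool(
    query_engine=graph.as_query_engine(),
    metadata=ToolMetadata(
        name="graph_index",
        description="Answers questions over the combined database, file, and directory indexes.",
    ),
)

# The agent acts as the chat engine: it keeps its own chat history and supports streaming
agent = OpenAIAgent.from_tools([graph_tool], verbose=True)

streaming_response = agent.stream_chat("What does the database index contain?")
for token in streaming_response.response_gen:
    print(token, end="")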

sumitsahoo commented 1 year ago

@logan-markewich Thanks a bunch. I will try this out. I just want to make sure that I am on the right path.

sumitsahoo commented 1 year ago

@dosu-bot May I know if I should combine my multiple ListIndex instances into a ComposableGraph and store it to disk, or is there a better way? The goal is to combine multiple indexes, save them to disk, and then retrieve them for use with a chat_engine.

Below is the snippet I used to save the index.

graph = ComposableGraph.from_indices(GPTTreeIndex, index_list, index_summaries=summary_list)
graph.root_index.set_index_id("my_id")
graph.root_index.storage_context.persist(persist_dir=os.environ["INDEX_DIR"])

dosubot[bot] commented 1 year ago

🤖

Hello,

Yes, your approach of combining multiple ListIndex instances into a ComposableGraph, saving it to disk, and then retrieving it for use with a chat_engine in the LlamaIndex application is correct.

Your code does the following:

  1. It creates a ComposableGraph from the indices using the ComposableGraph.from_indices method.
  2. It sets an index id for the root index of the graph.
  3. It persists the graph to disk using the persist method of the storage_context of the root index.

This approach ensures that the graph is correctly built and saved to disk.

However, there are a few things you might want to consider:

  1. Make sure that index_list and summary_list have the same length. The from_indices method requires that each index in index_list has a corresponding summary in summary_list.

  2. Make sure that os.environ["INDEX_DIR"] points to a valid directory where you have write permissions. The persist method will try to save the graph to this directory.

  3. To load the graph from disk, you can use the load method of the ComposableGraph class. Here's a code snippet on how to do it:

from llama_index.indices.composable_graph import ComposableGraph

graph = ComposableGraph.load(os.environ["INDEX_DIR"])

This will load the graph from the directory specified by os.environ["INDEX_DIR"].

I hope this helps! If you have any other questions, feel free to ask.

Best, Dosu


sumitsahoo commented 1 year ago

@dosu-bot ComposableGraph has no attribute load

dosubot[bot] commented 1 year ago

🤖

Hello,

The error you're encountering is because ComposableGraph does not have a load attribute. Instead, you should use the load_graph_from_storage function from the llama_index.indices.loading module. This function takes a StorageContext, the root_id of the graph, and a ServiceContext as arguments. The StorageContext should contain the docstore, index store, and vector store. The root_id is the ID of the root index of the graph. The ServiceContext supplies shared components such as the LLM and embedding model.

Here is an example of how you can use this function:

from llama_index.storage.storage_context import StorageContext
from llama_index.indices.loading import load_graph_from_storage
from llama_index.indices.service_context import ServiceContext

import os

# Rebuild the storage context from the directory the graph was persisted to
storage_context = StorageContext.from_defaults(persist_dir=os.environ["INDEX_DIR"])

# The ID of the root index of the graph (e.g. the one set earlier via set_index_id)
root_id = "my_id"

# A default service context (LLM, embedding model, etc.)
service_context = ServiceContext.from_defaults()

# Load the ComposableGraph
graph = load_graph_from_storage(storage_context, root_id, service_context)

You can find more details in the source code here: llama_index/indices/loading.py and in the tests here: tests/indices/test_loading_graph.py

If the file does not exist at the specified persist_path, the logger will output a warning message:

"No existing {__name__} found at {persist_path}. Initializing a new graph_store from scratch."

This message is found in the from_persist_path method of the SimpleGraphStore class. This method is used to create a SimpleGraphStore from a persist directory. If the file does not exist at the specified path, it logs the warning and initializes a new SimpleGraphStore from scratch.

You can find more details in the source code here: llama_index/graph_stores/simple.py
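
For completeness, a minimal sketch of that method (the path below is a placeholder):

from llama_index.graph_stores.simple import SimpleGraphStore

# Loads an existing store, or logs the warning above and starts a fresh one
graph_store = SimpleGraphStore.from_persist_path("./storage/graph_store.json")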

I hope this helps! If you have any other questions, feel free to ask.

Best, Dosu


sumitsahoo commented 1 year ago

@dosu-bot Is it a good idea to store individual indexes in storage, then retrieve them and build a graph, instead of saving the graph in storage? I was having issues querying after retrieving the graph from storage, but when the graph is built from the indexes it works just fine.

Sample code to store indexes:

index1.storage_context.persist(persist_dir=os.environ["INDEX_DIR"] + "/index1")
index2.storage_context.persist(persist_dir=os.environ["INDEX_DIR"] + "/index2")
index3.storage_context.persist(persist_dir=os.environ["INDEX_DIR"] + "/index3")
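
For reference, the loading side of that approach might look roughly like this (a sketch assuming the same directory layout and a default service context; the summaries are placeholders):

import os

from llama_index import GPTTreeIndex, StorageContext, load_index_from_storage
from llama_index.indices.composability.graph import ComposableGraph

# Reload each index from its own persist directory
indices = [
    load_index_from_storage(
        StorageContext.from_defaults(persist_dir=os.environ["INDEX_DIR"] + f"/index{i}")
    )
    for i in (1, 2, 3)
]

# Rebuild the graph from the reloaded indices
graph = ComposableGraph.from_indices(
    GPTTreeIndex,
    indices,
    index_summaries=["Summary of index1", "Summary of index2", "Summary of index3"],
)
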
dosubot[bot] commented 1 year ago

🤖

Hello,

Based on the issues you've described and the context provided, it seems like you're on the right track with your approach. Storing individual indexes and then retrieving them to build a ComposableGraph is indeed a valid approach. However, there's a more efficient way to handle this in the LlamaIndex framework.

Instead of storing individual indexes and then retrieving them to build a graph, you can directly load the ComposableGraph from the storage using the load_graph_from_storage function. This function takes a storage_context and a root_id as arguments. The storage_context contains the docstore, index store, and vector store. The root_id is the ID of the root index of the graph. The function loads all indices from the storage using the load_indices_from_storage function and then creates a ComposableGraph using these indices.

Here's the function for reference:

def load_graph_from_storage(
    storage_context: StorageContext,
    root_id: str,
    **kwargs: Any,
) -> ComposableGraph:
    """Load composable graph from storage context.

    Args:
        storage_context (StorageContext): storage context containing
            docstore, index store and vector store.
        root_id (str): ID of the root index of the graph.
        **kwargs: Additional keyword args to pass to the index constructors.
    """
    indices = load_indices_from_storage(storage_context, index_ids=None, **kwargs)
    all_indices = {index.index_id: index for index in indices}
    return ComposableGraph(all_indices=all_indices, root_id=root_id)

You can find this in the llama_index/indices/loading.py file.

However, without the exact error message or issue you're encountering when querying after retrieving the graph from storage, it's hard to provide a more specific solution. Could you please provide more details about the error you're encountering? This will help in diagnosing the issue more accurately.

I hope this helps! Let me know if you have any other questions.


sumitsahoo commented 1 year ago

@dosu-bot How do we configure KnowledgeGraphRAGRetriever from a ComposableGraph index?

dosubot[bot] commented 1 year ago

🤖

Hello,

To configure a KnowledgeGraphRAGRetriever from a ComposableGraph index in the LlamaIndex framework for a chat_engine, you would need to initialize an instance of the KnowledgeGraphRAGRetriever class with the ComposableGraph index as an argument.

Here is an example of how you might initialize it:

from llama_index.indices.knowledge_graph.retrievers import KnowledgeGraphRAGRetriever
from llama_index.service_context import ServiceContext
from llama_index.storage_context import StorageContext
from llama_index.indices.composability.graph import ComposableGraph

# Assuming `all_indices` is your dictionary of BaseIndex instances and `root_id` is the ID of the root index
composable_graph_index = ComposableGraph(all_indices=all_indices, root_id=root_id)

retriever = KnowledgeGraphRAGRetriever(
    service_context=ServiceContext.from_defaults(),
    storage_context=StorageContext(graph_store=composable_graph_index),
    # Other necessary arguments here...
)

The necessary arguments for initializing the KnowledgeGraphRAGRetriever class in the LlamaIndex framework are:

  1. service_context (Optional[ServiceContext]): A service context to use.
  2. storage_context (Optional[StorageContext]): A storage context to use.
  3. entity_extract_fn (Optional[Callable]): A function to extract entities.
  4. entity_extract_template (Optional[BasePromptTemplate]): A Query Key Entity Extraction Prompt.
  5. entity_extract_policy (Optional[str]): The entity extraction policy to use. Default is "union". Possible values are "union", "intersection".
  6. synonym_expand_fn (Optional[Callable]): A function to expand synonyms.
  7. synonym_expand_template (Optional[QueryKeywordExpandPrompt]): A Query Key Entity Expansion Prompt.
  8. synonym_expand_policy (Optional[str]): The synonym expansion policy to use. Default is "union". Possible values are "union", "intersection".
  9. max_entities (int): The maximum number of entities to extract. Default is 5.
  10. max_synonyms (int): The maximum number of synonyms to expand per entity. Default is 5.
  11. retriever_mode (Optional[str]): The retriever mode to use. Default is "keyword". Possible values are "keyword", "embedding", "keyword_embedding".
  12. with_nl2graphquery (bool): Whether to combine NL2GraphQuery in context. Default is False.
  13. graph_traversal_depth (int): The depth of graph traversal. Default is 2.
  14. max_knowledge_sequence (int): The maximum number of knowledge sequences to include in the response. Default is 30.
  15. verbose (bool): Whether to print out debug info.

Please note that you need to provide the appropriate arguments for the KnowledgeGraphRAGRetriever class based on your specific use case.

You can find more details in the source code of the KnowledgeGraphRAGRetriever class in the llama_index/indices/knowledge_graph/retrievers.py file and the ComposableGraph class in the llama_index/indices/composability/graph.py file.
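
As a follow-on sketch (assuming retriever was constructed as above), the retriever can then be plugged into either a plain query engine or a chat engine with memory:

from llama_index.query_engine import RetrieverQueryEngine
from llama_index.chat_engine.context import ContextChatEngine
from llama_index.memory import ChatMemoryBuffer

# Query-only interface over the retriever
query_engine = RetrieverQueryEngine.from_args(retriever)

# Chat interface with memory over the same retriever
chat_engine = ContextChatEngine.from_defaults(
    retriever=retriever,
    memory=ChatMemoryBuffer.from_defaults(token_limit=1500),
)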

I hope this helps! Let me know if you have any other questions.


sumitsahoo commented 1 year ago

@dosu-bot I am getting the below error and am unable to instantiate memory:

memory = BaseMemory()
         ^^^^^^^^^^^^
TypeError: Can't instantiate abstract class BaseMemory with abstract methods from_defaults, get, get_all, put, reset, set

dosubot[bot] commented 1 year ago

🤖

Hello,

The BaseMemory class in the LlamaIndex framework is an abstract base class that defines the interface for all memory types. It's not meant to be instantiated directly, hence the TypeError you're encountering.

Instead, you should use a concrete class that implements the BaseMemory interface. In the LlamaIndex framework, one such class is ChatMemoryBuffer. Here's how you can instantiate a ChatMemoryBuffer object:

from llama_index.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults()

The ChatMemoryBuffer class is a concrete implementation of BaseMemory and provides methods for managing chat history in LLM applications. It implements the from_defaults, get, get_all, put, set, and reset methods required by the BaseMemory interface.

You can find more details about the ChatMemoryBuffer class and its methods in the llama_index/memory/chat_memory_buffer.py file in the LlamaIndex repository.
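
A small usage sketch of those methods (the seeded history is a placeholder):

from llama_index.llms import ChatMessage
from llama_index.memory import ChatMemoryBuffer

# Cap the buffer at roughly 1500 tokens and seed it with existing history
memory = ChatMemoryBuffer.from_defaults(
    token_limit=1500,
    chat_history=[ChatMessage(role="user", content="Hello")],
)

memory.put(ChatMessage(role="assistant", content="Hi! How can I help?"))
print(memory.get())  # history, trimmed to the token limit
memory.reset()       # clear the buffer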

I hope this helps! Let me know if you have any other questions.

Best, Dosu


sumitsahoo commented 1 year ago

@dosu-bot Sorry, I am still a bit confused. Say I have a ComposableGraph; how exactly can I create a chat_engine from this ComposableGraph object?

graph = ComposableGraph.from_indices(GPTTreeIndex, index_list, index_summaries=summary_list)

Also, what is the difference between ComposableGraph and KnowledgeGraphIndex?

dosubot[bot] commented 1 year ago

🤖

Hello,

To create a chat_engine from a ComposableGraph object in the LlamaIndex framework, you can use the as_query_engine method of the ComposableGraph class. Here is how you can do it:

from llama_index.indices.composability.graph import ComposableGraph

# Assuming you have a ComposableGraph object named `graph`
chat_engine = graph.as_query_engine()

In the context shared, the as_query_engine method of the ComposableGraph class is used to create a ComposableGraphQueryEngine instance. This method also accepts additional arguments to be passed to the ComposableGraphQueryEngine.

Regarding your question about the difference between ComposableGraph and KnowledgeGraphIndex, the ComposableGraph is a class that represents a graph of indices that can be composed together. It provides methods to get an index from an index struct id and to create a query engine from the graph.

On the other hand, KnowledgeGraphIndex is a specific type of index that is used to store and retrieve information in a knowledge graph format. It is built from the given documents with the help of LlamaIndex and the configured LLM, and it can optionally include embeddings. Once created, it can be queried using the KnowledgeGraphQueryEngine, which translates natural language questions into graph queries.

Here is an example of how to create a KnowledgeGraphIndex:

kg_index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=10,
    service_context=service_context,
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
    include_embeddings=True,
)

And here is how to query it using the KnowledgeGraphQueryEngine:

query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    service_context=service_context,
    llm=llm,
    verbose=True,
)

response = query_engine.query(
    "Tell me about Peter Quill?",
)

Please note that the exact differences would depend on the specific implementation of these classes in the LlamaIndex framework.

I hope this helps! If you have any other questions, feel free to ask.

Best, Dosu


sumitsahoo commented 1 year ago

@dosu-bot But in the above example you have created a query_engine, which does not support chat_history. How can I create a chat_engine instead of a query_engine from a ComposableGraph? chat_engine also has a nice stream_chat method with which we can stream the tokens.

dosubot[bot] commented 1 year ago

🤖

Hello,

To create a chat_engine from a ComposableGraph object in the LlamaIndex framework, you can use the ComposableGraphQueryEngine class. Here's a simple example:

from llama_index.indices.composability.graph import ComposableGraph
from llama_index.query_engine.graph_query_engine import ComposableGraphQueryEngine

# Assuming you have a ComposableGraph object named 'graph'
chat_engine = ComposableGraphQueryEngine(graph)

In this example, graph is a ComposableGraph object that you want to use to create the chat engine. The ComposableGraphQueryEngine class takes this ComposableGraph object as an argument and creates a query engine that can operate over it.

The ComposableGraph and KnowledgeGraphIndex are both used in the LlamaIndex framework but serve different purposes. The ComposableGraph is a type of graph that can be used with the ComposableGraphQueryEngine to create a chat engine. This chat engine can take in custom query engines for its sub-indices and can optionally recursively query the graph.

On the other hand, KnowledgeGraphIndex is used in the context of knowledge graph retrievers. It is used to initialize a KGTableRetriever object, which is a retriever for knowledge graphs. The KnowledgeGraphIndex is used to search for nodes by keywords, retrieve nodes for responses, and extract keywords from queries. It also supports different retriever modes such as keyword, embedding, and hybrid modes. The KnowledgeGraphIndex is also used to create a KnowledgeGraphQueryEngine if the with_nl2graphquery parameter is set to True. This query engine can be used to perform queries on the knowledge graph.
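
A tiny sketch of the retriever modes mentioned above (assuming kg_index is a KnowledgeGraphIndex built as in the earlier example, with embeddings included for hybrid mode):

from llama_index.indices.knowledge_graph.retrievers import KGTableRetriever

retriever = KGTableRetriever(index=kg_index, retriever_mode="hybrid")
nodes = retriever.retrieve("Tell me about Peter Quill?")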

The as_query_engine method in the ComposableGraph class in the LlamaIndex framework is used to convert the ComposableGraph instance into a ComposableGraphQueryEngine instance. This method is useful when you want to perform queries on the ComposableGraph instance. It does this by importing the ComposableGraphQueryEngine from llama_index.query_engine.graph_query_engine and returning an instance of it, passing the current ComposableGraph instance (self) and any additional keyword arguments (**kwargs) to the ComposableGraphQueryEngine constructor.

Here is the code for the as_query_engine method:

def as_query_engine(self, **kwargs: Any) -> BaseQueryEngine:
    # NOTE: lazy import
    from llama_index.query_engine.graph_query_engine import (
        ComposableGraphQueryEngine,
    )

    return ComposableGraphQueryEngine(self, **kwargs)

This method is a good example of the Factory Method design pattern, where a method in a class is used to create objects.

I hope this helps! Let me know if you have any other questions.

Best, Dosu


sumitsahoo commented 1 year ago

@dosu-bot I was actually trying to create a ContextChatEngine from a ComposableGraph, i.e. a chat_engine that supports stream_chat for streaming responses as well as chat_history.

dosubot[bot] commented 1 year ago

🤖

Hello,

To create a ContextChatEngine from a ComposableGraph object in the LlamaIndex framework that supports chat_history and streaming responses, you can follow these steps:

  1. Load your data and build your index:
from llama_index import VectorStoreIndex, SimpleDirectoryReader

data = SimpleDirectoryReader(input_dir="your_data_directory").load_data()
index = VectorStoreIndex.from_documents(data)
  2. Configure your chat engine with a memory buffer to limit the chat history:
from llama_index.memory import ChatMemoryBuffer
from llama_index.chat_engine.context import ContextChatEngine

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

chat_engine = ContextChatEngine.from_defaults(
    service_context=service_context,
    memory=memory,
    system_prompt="Your system prompt here.",
)
  3. For streaming support, set up a service context and use the stream_chat method:
from llama_index import (
    ServiceContext,
    set_global_service_context,
)
from llama_index.llms import OpenAI

service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0)
)
set_global_service_context(service_context)

response = chat_engine.stream_chat("Your question here")
for token in response.response_gen:
    print(token, end="")

Please replace "your_data_directory" with the path to your data, "Your system prompt here." with your system prompt, and "Your question here" with your question.

However, based on the provided context, it's not clear how the ComposableGraph object interacts with the ContextChatEngine. The ComposableGraph object is used as a data structure in the ComposableGraphQueryEngine class, which operates over a ComposableGraph and can take in custom query engines for its sub-indices. But there's no mention of any interaction between the ComposableGraph object and a ContextChatEngine.

If you need more help or if there's something I misunderstood, please provide more details.

Best, Dosu


sumitsahoo commented 1 year ago

@dosu-bot Sorry, I do not see ComposableGraph used in the example. The question was: how do I get a ContextChatEngine using a ComposableGraph?

dosubot[bot] commented 1 year ago

🤖

Hello,

A ComposableGraph does not expose as_chat_engine directly. To get a ContextChatEngine that supports chat_history and streaming responses, you can use the as_chat_engine method of an index class such as VectorStoreIndex.

Here's how you can do it:

First, load your data and build an index:

from llama_index import VectorStoreIndex, SimpleDirectoryReader

data = SimpleDirectoryReader(input_dir="../data/paul_graham/").load_data()
index = VectorStoreIndex.from_documents(data)

Then, configure the chat engine:

from llama_index.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

chat_engine = index.as_chat_engine(
    chat_mode="context",
    memory=memory,
    system_prompt="You are a chatbot, able to have normal interactions, as well as talk about an essay discussing Paul Grahams life.",
)

The as_chat_engine method creates a ContextChatEngine instance with the specified parameters. The chat_mode parameter should be set to "context" to enable context mode. The memory parameter is an instance of ChatMemoryBuffer that limits the token count of the chat history to 1500. The system_prompt parameter is a string that sets the system prompt.

The ContextChatEngine instance supports chat_history through its chat_history property and streaming responses through its stream_chat method. For example, you can stream a chat as follows:

response = chat_engine.stream_chat("What did Paul Graham do after YC?")
for token in response.response_gen:
    print(token, end="")

Please note that this example uses a VectorStoreIndex rather than the ComposableGraph object, and the ContextChatEngine instance is created using the as_chat_engine method.

You can find more details in the llama_index/chat_engine/context.py file and the chat_engine_context.ipynb example notebook.

Let me know if you have any other questions or need further clarification.
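
If you need a chat engine over the ComposableGraph itself rather than a single index, another option is to wrap the graph's query engine in a CondenseQuestionChatEngine, which accepts any query engine and supports chat_history and stream_chat. A rough sketch, assuming graph is the ComposableGraph built earlier:

from llama_index.chat_engine.condense_question import CondenseQuestionChatEngine

chat_engine = CondenseQuestionChatEngine.from_defaults(
    query_engine=graph.as_query_engine(),
    verbose=True,
)

response = chat_engine.stream_chat("What did we learn from the database source?")
for token in response.response_gen:
    print(token, end="")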


sumitsahoo commented 1 year ago

@dosu-bot I am curious if there is a way to add memory to BaseQueryEngine?

dosubot[bot] commented 1 year ago

🤖

Hello,

To add memory to the BaseQueryEngine, you can follow the solution suggested in the issue "Possible to add memory to index.query?". This involves using chat engines and data agents. You can find more information about these in the chat engine and data agent sections of the documentation.

As for the current structure of the BaseQueryEngine, it is extended by the CitationQueryEngine class. This class includes methods and properties such as __init__, from_args, _create_citation_nodes, retrieve, retriever, synthesize, asynthesize, _query, and _aquery. You can find the detailed structure in the source code of the LlamaIndex repository.

Regarding the purpose of adding memory to the BaseQueryEngine, I couldn't find a specific answer within the repository. Generally, though, adding memory lets the engine retain conversation history so that follow-up queries can build on earlier ones.

Please let me know if you need further assistance or clarification.

Best regards, Dosu
