langchain-ai / langchain

πŸ¦œπŸ”— Build context-aware reasoning applications
https://python.langchain.com
MIT License

Integrating `create_citation_fuzzy_match_chain` with `ConversationalRetrievalChain` #11673

Closed: timxieICN closed this issue 10 months ago

timxieICN commented 11 months ago

Feature request

Similar to the discussion in this thread https://github.com/langchain-ai/langchain/issues/7239, I would like to have a way to show inline citations in QA tasks.

Specifically, I haven't had much success integrating create_citation_fuzzy_match_chain with ConversationalRetrievalChain.

Motivation

The point of this change is to enable QA-based chains (e.g. ConversationalRetrievalChain) to easily produce inline citations and to surface those citations as part of the metadata fields.

Your contribution

TBD

dosubot[bot] commented 11 months ago

πŸ€–

Hello,

Thank you for your interest in integrating create_citation_fuzzy_match_chain with ConversationalRetrievalChain in the LangChain framework. Here's a step-by-step guide on how you can achieve this:

  1. Import the necessary modules and classes.
from langchain.chains import create_citation_fuzzy_match_chain, ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
  2. Initialize the language model and the chains.
llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")
citation_chain = create_citation_fuzzy_match_chain(llm)
# Note: ConversationalRetrievalChain normally needs a retriever (e.g. via
# ConversationalRetrievalChain.from_llm(llm, retriever=retriever)); passing
# only the LLM as shown here will not work as-is.
conversational_chain = ConversationalRetrievalChain(llm)
  3. Run the chains with the question and context (both assumed to be strings you have already defined).
citation_result = citation_chain.run(question=question, context=context)
conversational_result = conversational_chain.run(question=question, context=context)
  4. Print the results.
print(citation_result)
print(conversational_result)
  5. Highlight the citations.
def highlight(text, span):
    return (
        "..."
        + text[span[0] - 20 : span[0]]
        + "*"
        + "\033[91m"
        + text[span[0] : span[1]]
        + "\033[0m"
        + "*"
        + text[span[1] : span[1] + 20]
        + "..."
    )

for fact in citation_result.answer:
    print("Statement:", fact.fact)
    for span in fact.get_spans(context):
        print("Citation:", highlight(context, span))
    print()

Please note that the ConversationalRetrievalChain class is not mentioned in the provided context. You may need to refer to the LangChain documentation or source code to understand how to use it properly.

As for the create_citation_fuzzy_match_chain method, it is used to create a citation fuzzy match chain. This chain is used to answer questions with citations. The method takes a language model (llm) as an argument and returns a chain (LLMChain) that can be used to answer questions with citations. You can find more details about this method in the LangChain source code at libs/langchain/langchain/chains/openai_functions/citation_fuzzy_match.py.
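For reference, here is a minimal, untested sketch (assuming `question` and `context` are strings you have already defined) of what the parsed result of that chain looks like:

from langchain.chains import create_citation_fuzzy_match_chain
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")
chain = create_citation_fuzzy_match_chain(llm)

# The chain's output parser returns a QuestionAnswer pydantic object whose
# `answer` field is a list of facts, each carrying the quoted source spans.
result = chain.run(question=question, context=context)
for fact in result.answer:
    print(fact.fact)             # the generated statement
    print(fact.substring_quote)  # exact quotes from the context backing it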

I hope this helps! If you have any further questions or run into any issues, please don't hesitate to ask.

Best, Dosu


timxieICN commented 11 months ago

@dosu-bot

I would say that's a poor response. In my opinion, the proper way is to somehow replace the load_qa_chain of ConversationalRetrievalChain with create_citation_fuzzy_match_chain.

dosubot[bot] commented 11 months ago

πŸ€–

Hello,

To integrate create_citation_fuzzy_match_chain with ConversationalRetrievalChain, you would need to replace the load_qa_chain method with create_citation_fuzzy_match_chain. Here's a general guide on how you might do this:

from abc import ABC

from langchain.chains.base import Chain
from langchain.chains.openai_functions.citation_fuzzy_match import create_citation_fuzzy_match_chain

class BaseQAWithSourcesChain(Chain, ABC):
    def __init__(self, llm):
        self.llm = llm
        # Use the citation fuzzy match chain where load_qa_chain was used before.
        self.qa_chain = create_citation_fuzzy_match_chain(self.llm)

This assumes that BaseQAWithSourcesChain has an attribute qa_chain that was previously set by load_qa_chain. This code replaces that with a chain created by create_citation_fuzzy_match_chain.

Please note that this is a general guide and the actual implementation may vary depending on the structure of BaseQAWithSourcesChain and load_qa_chain.
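Another way to sketch the same idea, without subclassing, is to build the citation chain yourself and pass it in as the combine_docs_chain when constructing ConversationalRetrievalChain directly. This is a rough, untested sketch; `retriever` is assumed to be an existing vector-store retriever you have built elsewhere:

from langchain.chains import ConversationalRetrievalChain, LLMChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT
from langchain.chains.openai_functions.citation_fuzzy_match import create_citation_fuzzy_match_chain
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")

# Chain that answers with citations; its prompt expects "question" and "context".
citation_chain = create_citation_fuzzy_match_chain(llm)

# Wrap it so the retrieved documents are stuffed into the "context" variable.
combine_docs_chain = StuffDocumentsChain(
    llm_chain=citation_chain,
    document_variable_name="context",
)

# Standalone-question generator used by ConversationalRetrievalChain.
question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)

qa = ConversationalRetrievalChain(
    retriever=retriever,  # assumed: e.g. vectorstore.as_retriever()
    combine_docs_chain=combine_docs_chain,
    question_generator=question_generator,
)
# Note: the answer produced this way is a QuestionAnswer object, not a plain string.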

As for the load_qa_chain helper used with the ConversationalRetrievalChain class, it is called in the from_llm class method of the ConversationalRetrievalChain class and the ChatVectorDBChain class to create the combine_docs_chain. However, without the actual implementation of load_qa_chain at hand, it's hard to provide more details about its structure and functionality.

I hope this helps! If you have any more questions or need further clarification, feel free to ask.

Best, Dosu


timxieICN commented 11 months ago

@dosu-bot

OK, let's start simple - do you have any suggestions on how to create a StuffDocumentsCitationChain by combining the custom logic of create_citation_fuzzy_match_chain with the existing StuffDocumentsChain?

dosubot[bot] commented 11 months ago

πŸ€–

To create a StuffDocumentsCitationChain by combining the custom logic of create_citation_fuzzy_match_chain with the existing StuffDocumentsChain, you would need to create a new class that inherits from StuffDocumentsChain and overrides the necessary methods to incorporate the logic from create_citation_fuzzy_match_chain.

Here's a rough idea of how you might do this:

from langchain.chains import LLMChain, StuffDocumentsChain
from langchain.chains.openai_functions.citation_fuzzy_match import QuestionAnswer
from langchain.chains.openai_functions.utils import get_llm_kwargs
from langchain.output_parsers.openai_functions import PydanticOutputFunctionsParser
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, PromptTemplate
from langchain.schema import HumanMessage, SystemMessage
from langchain.schema.language_model import BaseLanguageModel

class StuffDocumentsCitationChain(StuffDocumentsChain):
    def __init__(self, llm: BaseLanguageModel, document_prompt: PromptTemplate, document_variable_name: str, document_separator: str = "\n"):
        output_parser = PydanticOutputFunctionsParser(pydantic_schema=QuestionAnswer)
        schema = QuestionAnswer.schema()
        function = {
            "name": schema["title"],
            "description": schema["description"],
            "parameters": schema,
        }
        llm_kwargs = get_llm_kwargs(function)
        messages = [
            SystemMessage(
                content=(
                    "You are a world class algorithm to answer "
                    "questions with correct and exact citations."
                )
            ),
            HumanMessage(content="Answer question using the following context"),
            HumanMessagePromptTemplate.from_template("{context}"),
            HumanMessagePromptTemplate.from_template("Question: {question}"),
            HumanMessage(
                content=(
                    "Tips: Make sure to cite your sources, "
                    "and use the exact words from the context."
                )
            ),
        ]
        prompt = ChatPromptTemplate(messages=messages)

        llm_chain = LLMChain(
            llm=llm,
            prompt=prompt,
            llm_kwargs=llm_kwargs,
            output_parser=output_parser,
        )

        super().__init__(
            llm_chain=llm_chain,
            document_prompt=document_prompt,
            document_variable_name=document_variable_name,
            document_separator=document_separator
        )

In this code, we're creating a new class StuffDocumentsCitationChain that inherits from StuffDocumentsChain. In the constructor, we're setting up the llm_chain to use the logic from create_citation_fuzzy_match_chain, and then passing that to the super().__init__ method along with the other necessary arguments.

Please note that this is a rough sketch and you might need to adjust it to fit your specific needs. Also, remember to import any necessary modules that might be missing from the code snippet above.
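As a rough usage sketch (untested; `docs` is assumed to be a list of Documents returned by your retriever), the class above might be instantiated like this:

from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate

llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")

# How each retrieved Document is rendered before being stuffed into {context}.
document_prompt = PromptTemplate(
    input_variables=["page_content"],
    template="{page_content}",
)

citation_stuff_chain = StuffDocumentsCitationChain(
    llm=llm,
    document_prompt=document_prompt,
    document_variable_name="context",
)

# The result is a QuestionAnswer object (facts plus their quoted spans), not a plain string.
result = citation_stuff_chain.run(input_documents=docs, question="What did the author say about X?")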


timxieICN commented 11 months ago

@dosu-bot

I tried something similar, but now I'm seeing this error:

llm_chain_variables = values["llm_chain"].prompt.input_variables
AttributeError: 'function' object has no attribute 'prompt'

dosubot[bot] commented 11 months ago

πŸ€–

Hello,

The error you're encountering, AttributeError: 'function' object has no attribute 'prompt', is likely due to trying to access the 'prompt' attribute on a function object, which doesn't exist.

In your code, the StuffDocumentsChain class has an attribute llm_chain which is an instance of LLMChain. The LLMChain instance should have a 'prompt' attribute.

The traceback line llm_chain_variables = values["llm_chain"].prompt.input_variables comes from a validator that inspects the llm_chain you passed in. If it reports a 'function' object, then llm_chain was set to a function rather than to a chain instance, for example by passing create_citation_fuzzy_match_chain itself (uncalled) instead of the chain it returns.

Make sure you assign the result of calling the factory, e.g. llm_chain=create_citation_fuzzy_match_chain(llm), and then access the prompt as stuff_documents_chain.llm_chain.prompt rather than stuff_documents_chain.llm_chain().prompt.

If you're still encountering the error, please ensure that the LLMChain instance has been correctly initialized with a 'prompt' attribute.

The LLMChain class in the LangChain codebase is a subclass of the Chain class. It is designed to run queries against Language Models (LLMs). The structure of the LLMChain class includes several attributes and methods. One of the attributes is indeed 'prompt'. The 'prompt' attribute is of the type BasePromptTemplate and is used to store the prompt object that the class uses.

Here is the line of code that defines the 'prompt' attribute:

prompt: BasePromptTemplate

This attribute is used in several methods within the LLMChain class, such as prep_prompts, aprep_prompts, and others, where it is used to format and prepare the prompts for the language model.
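For example, a quick sanity check along these lines (hypothetical variable name) should print the prompt's input variables instead of raising:

from langchain.chains import LLMChain

# citation_stuff_chain is assumed to be your StuffDocumentsChain subclass instance.
assert isinstance(citation_stuff_chain.llm_chain, LLMChain), "llm_chain must be a chain, not a function"
print(citation_stuff_chain.llm_chain.prompt.input_variables)  # e.g. ['context', 'question']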

Please check these points and let me know if you're still encountering the issue.


timxieICN commented 11 months ago

@dosu-bot

A related question: if I use ConversationalRetrievalChain.from_llm(memory=ConversationTokenBufferMemory()), is there a way for me to save extra information into the additional_kwargs of a message's data field, rather than only into the content field?

dosubot[bot] commented 11 months ago

πŸ€–

Hello,

To save extra information into the additional_kwargs of the data field, you can create a subclass of the ConversationBufferMemory class and override the save_context method. Here's an example:

from typing import Any, Dict

from langchain.memory import ConversationBufferMemory

class CustomConversationBufferMemory(ConversationBufferMemory):
    def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
        # Save the turn as usual, then attach extra metadata to the stored AI message.
        super().save_context(inputs, outputs)
        self.chat_memory.messages[-1].additional_kwargs["extra_info"] = "your additional information"

memory = CustomConversationBufferMemory()

In this example, CustomConversationBufferMemory is a subclass of ConversationBufferMemory that overrides the save_context method. After calling the parent class's save_context method, it adds extra information to the additional_kwargs of the most recently stored message.
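As a quick, hypothetical check of what gets stored (assuming return_messages=True so the buffer holds message objects):

memory = CustomConversationBufferMemory(return_messages=True)
memory.save_context({"question": "What is LangChain?"}, {"answer": "A framework for LLM apps."})

ai_message = memory.chat_memory.messages[-1]
print(ai_message.content)            # "A framework for LLM apps."
print(ai_message.additional_kwargs)  # {"extra_info": "your additional information"}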

You can then use this custom memory class with ConversationalRetrievalChain:

from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

llm = ChatOpenAI()
# `vectorstore` is assumed to be a vector store you have already populated,
# for example a Chroma instance built with OpenAIEmbeddings.
retriever = vectorstore.as_retriever()
memory = CustomConversationBufferMemory(memory_key="chat_history", return_messages=True)
qa = ConversationalRetrievalChain.from_llm(llm, retriever=retriever, memory=memory)

Please replace "your additional information" with the actual information you want to save. In the context shared, it's important to note that the memory_key parameter should align with the variable name in your prompt template, and if you're using a ChatModel, you should set return_messages=True to return a list of messages.

You can find more information about this in the chatbots use case and the memory module documentation.

As for the structure of the data field when using ConversationalRetrievalChain.from_llm(memory=ConversationTokenBufferMemory()), it's not explicitly defined in the provided context. However, it can be inferred that the data field might be related to the input data that the ConversationalRetrievalChain class uses to generate responses. The ConversationalRetrievalChain class uses an instance of the InputType class as its input data. The InputType class has two fields: question and chat_history.

I hope this helps! Let me know if you have any other questions.


timxieICN commented 11 months ago

@dosu-bot

Is there a way to avoid creating a subclass such as CustomConversationBufferMemory, and instead adjust the output_key to distinguish between content and additional_kwargs before calling save_context?

dosubot[bot] commented 11 months ago

πŸ€–

Hello,

From your question, I understand that you want to save extra information into the additional_kwargs of the data field in the ConversationBufferMemory class without creating a subclass like CustomConversationBufferMemory. You also want to adjust the output_key to distinguish between content and additional_kwargs before calling the save_context method.

Currently, the ConversationBufferMemory class does not support saving additional information directly into the additional_kwargs of the data field. The save_context method, as you can see in the BaseChatMemory class, only accepts inputs and outputs parameters and does not provide a way to distinguish between content and additional_kwargs.

However, you can achieve a rough version of this by modifying the inputs or outputs dictionary before calling the save_context method: add a new key-value pair whose value is the additional information you want to keep. Here is an example:

from langchain.memory import ConversationBufferMemory

# With more than one key in `inputs`, the memory needs to know which one is the
# actual user input, so set input_key explicitly.
memory = ConversationBufferMemory(input_key="content")

inputs = {
    "content": "Your content here",
    "additional_kwargs": "Your additional information here",
}
outputs = {"output": "The model's answer here"}
memory.save_context(inputs, outputs)

In this case, the save_context method will store the value under input_key ("content") as the human message and the value under "output" as the AI message; the extra "additional_kwargs" entry is not written into the message history itself, but it remains available in your own inputs dictionary for later retrieval.

Please note that this is a workaround and might not be the best solution depending on your specific use case. If you need more advanced functionality, you might need to consider creating a subclass or modifying the ConversationBufferMemory class.
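If the goal is specifically to get data into a message's additional_kwargs without subclassing, one untested sketch is to bypass save_context and append the messages yourself:

from langchain.memory import ConversationBufferMemory
from langchain.schema import AIMessage, HumanMessage

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Append the turn manually so you control additional_kwargs on each message.
memory.chat_memory.add_message(HumanMessage(content="Your question here"))
memory.chat_memory.add_message(
    AIMessage(
        content="The model's answer here",
        additional_kwargs={"extra_info": "your additional information"},
    )
)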

I hope this helps! If you have any other questions, feel free to ask.


timxieICN commented 11 months ago

@dosu-bot:

Switching gears - if I use ConversationalRetrievalChain.from_llm(memory=ConversationTokenBufferMemory()), how does the ConversationalRetrievalChain leverage chat_history? Based on my analysis, the chat_history only makes an impact on rephrasing the question, given the argument rephrase_question.

dosubot[bot] commented 11 months ago

πŸ€–

Hello,

You're correct in your understanding that ConversationalRetrievalChain leverages chat_history to provide context for the current question. This is done in the _call and _acall methods of the BaseConversationalRetrievalChain class. The chat_history is converted into a string using the get_chat_history method (or the default _get_chat_history function if no custom method is provided). This string is then passed to the question_generator chain, which generates a new standalone question based on the current question and the chat history.

The rephrase_question argument determines how the new standalone question is used. If rephrase_question is set to True, the new standalone question is passed to the combine_docs_chain along with the chat history. If rephrase_question is set to False, the original question is passed to the combine_docs_chain instead, and the new standalone question is only used for document retrieval.

Here is the relevant code:

get_chat_history = self.get_chat_history or _get_chat_history
chat_history_str = get_chat_history(inputs["chat_history"])
if chat_history_str:
    callbacks = _run_manager.get_child()
    new_question = self.question_generator.run(
        question=question, chat_history=chat_history_str, callbacks=callbacks
    )
else:
    new_question = question
...
if self.rephrase_question:
    new_inputs["question"] = new_question
new_inputs["chat_history"] = chat_history_str
answer = self.combine_docs_chain.run(
    input_documents=docs, callbacks=_run_manager.get_child(), **new_inputs
)
output[self.output_key] = answer

This process allows the ConversationalRetrievalChain to generate responses that are contextually relevant to the ongoing conversation, and to adapt the question used for document retrieval and response generation based on the rephrase_question setting.

You can find more details in the source code of the BaseConversationalRetrievalChain class in the LangChain repository: libs/langchain/langchain/chains/conversational_retrieval/base.py
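For example (a sketch, assuming you already have llm, retriever, and memory set up), the flag can be toggled when building the chain:

qa = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=retriever,
    memory=memory,
    # False: retrieve with the condensed standalone question, but answer the
    # user's original question; True (the default): answer the condensed question.
    rephrase_question=False,
)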

I hope this helps! If you have any more questions, feel free to ask.


timxieICN commented 11 months ago

@dosu-bot Thanks, yes, I'm on the same page. I was asking whether the memory or chat_history is used anywhere else, other than in rephrase_question, within ConversationalRetrievalChain.from_llm(memory=ConversationTokenBufferMemory()).

dosubot[bot] commented 11 months ago

πŸ€–

Yes, the chat_history or memory in the ConversationalRetrievalChain.from_llm(memory=ConversationTokenBufferMemory()) implementation is indeed used in other functionality besides rephrase_question.

In the _call and _acall methods of the BaseConversationalRetrievalChain class, the chat_history is used to generate a new standalone question, and it is also always passed along to the combine_docs_chain as new_inputs["chat_history"], regardless of the value of rephrase_question.

Here is the relevant code:

def _call(
    self,
    inputs: Dict[str, Any],
    run_manager: Optional[CallbackManagerForChainRun] = None,
) -> Dict[str, Any]:
    ...
    get_chat_history = self.get_chat_history or _get_chat_history
    chat_history_str = get_chat_history(inputs["chat_history"])
    if chat_history_str:
        ...
        new_question = self.question_generator.run(
            question=question, chat_history=chat_history_str, callbacks=callbacks
        )
    else:
        new_question = question
    ...
    if self.rephrase_question:
        new_inputs["question"] = new_question
    new_inputs["chat_history"] = chat_history_str
    answer = self.combine_docs_chain.run(
        input_documents=docs, callbacks=_run_manager.get_child(), **new_inputs
    )
    ...
    return output

In this code, chat_history_str is derived from chat_history and is used to generate a new question (new_question). If rephrase_question is True, new_question is used as the new input question. Regardless of the value of rephrase_question, chat_history_str is always passed along to the combine_docs_chain.

You can find this code in the libs/langchain/langchain/chains/conversational_retrieval/base.py file in the LangChain repository.
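One practical consequence (a hedged sketch, assuming the llm, retriever, and memory objects from earlier): the combine_docs_chain only makes use of that chat_history value if its prompt actually contains a {chat_history} variable, which you can arrange through combine_docs_chain_kwargs:

from langchain.prompts import PromptTemplate

qa_prompt = PromptTemplate(
    input_variables=["context", "question", "chat_history"],
    template=(
        "Use the context and the conversation so far to answer.\n\n"
        "Chat history:\n{chat_history}\n\n"
        "Context:\n{context}\n\n"
        "Question: {question}\n"
        "Answer:"
    ),
)

qa = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=retriever,
    memory=memory,
    combine_docs_chain_kwargs={"prompt": qa_prompt},  # QA prompt that consumes {chat_history}
)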


timxieICN commented 11 months ago

@dosu-bot: Then how does self.combine_docs_chain use the memory or chat_history?

dosubot[bot] commented 11 months ago

πŸ€–

The chat_history (memory) in ConversationalRetrievalChain.from_llm(memory=ConversationTokenBufferMemory()) is used both to generate a new standalone question (via the question_generator) and as part of the input passed to self.combine_docs_chain when generating the final response.

In the _call and _acall methods of the BaseConversationalRetrievalChain class, the chat_history is first converted into a string format using the get_chat_history method. If the chat_history is not empty, it is used along with the current question to generate a new standalone question using the question_generator chain. This new question is then used to retrieve relevant documents.

After the documents are retrieved, the chat_history is again used as part of the input to the combine_docs_chain function to generate the final response. The combine_docs_chain function takes in the retrieved documents, the chat_history, and either the new generated question or the original question (depending on the value of rephrase_question), and generates a final response.

Here is the relevant code:

def _call(
    self,
    inputs: Dict[str, Any],
    run_manager: Optional[CallbackManagerForChainRun] = None,
) -> Dict[str, Any]:
    ...
    get_chat_history = self.get_chat_history or _get_chat_history
    chat_history_str = get_chat_history(inputs["chat_history"])
    if chat_history_str:
        ...
        new_question = self.question_generator.run(
            question=question, chat_history=chat_history_str, callbacks=callbacks
        )
    else:
        new_question = question
    ...
    new_inputs = inputs.copy()
    if self.rephrase_question:
        new_inputs["question"] = new_question
    new_inputs["chat_history"] = chat_history_str
    answer = self.combine_docs_chain.run(
        input_documents=docs, callbacks=_run_manager.get_child(), **new_inputs
    )
    output[self.output_key] = answer
    ...
    return output

The same logic is applied in the asynchronous version of the method (_acall). You can find more details in the source code.

I hope this answers your question. If you have any more questions or need further clarification, feel free to ask.


SunXianyong commented 9 months ago

@dosubot: I'm developing based on your discussions, but I've run into a new problem: after I integrated create_citation_fuzzy_match_chain with StuffDocumentsChain, the chunk parameter of the on_llm_new_token function in my callbacks is always ''. How can I ensure that my streaming output only contains the chat answer's content?