
LangChain #51


YunchaoYang commented 8 months ago

What is LangChain?

LangChain is a framework for developing applications powered by language models. This framework consists of LangChain Libraries, LangChain Templates, LangServe, and LangSmith.

  1. Does LangChain implement RAG algorithms? Yes.
  2. LangChain provides standard, extendable interfaces and integrations for the following modules: Model I/O, Retrieval, and Agents. A minimal sketch of the Model I/O module is shown below.
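As a minimal sketch of the Model I/O module (assuming `langchain-openai` is installed and `OPENAI_API_KEY` is set; the prompt text is made up for illustration):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Prompt -> chat model -> output parser, composed with the LCEL pipe operator
prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
model = ChatOpenAI(model="gpt-3.5-turbo-1106")
chain = prompt | model | StrOutputParser()

print(chain.invoke({"text": "LangChain composes prompts, models, and parsers into chains."}))
```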
YunchaoYang commented 8 months ago

SentenceTransformers

What is SentenceTransformers?

SentenceTransformers is a Python library that provides pre-trained models for generating dense vector representations of sentences or text. These dense vector representations, also known as embeddings, capture the semantic meaning of the input text. These embeddings can then be used for various natural language processing (NLP) tasks such as semantic similarity calculation, clustering, classification, and information retrieval.

Why do we need SentenceTransformers?

SentenceTransformers is needed because it simplifies the process of generating high-quality sentence embeddings. Before the advent of SentenceTransformers and similar libraries, generating effective embeddings for sentences required significant expertise in NLP, deep learning, and access to large-scale compute resources for training models from scratch. SentenceTransformers provides pre-trained models that have been trained on large text corpora, which saves time and resources for practitioners and researchers. By using pre-trained models, users can obtain meaningful sentence representations without the need for extensive training data or computational resources.

Furthermore, SentenceTransformers offers a user-friendly interface that allows developers to easily integrate sentence embedding functionality into their NLP applications, making it accessible to a wider range of users. Overall, SentenceTransformers simplifies the process of leveraging advanced NLP techniques for tasks that require understanding the semantic similarity or meaning of sentences.

What are the models used in SentenceTransformers?

SentenceTransformers primarily uses transformer-based architectures, such as BERT (Bidirectional Encoder Representations from Transformers) and RoBERTa (Robustly optimized BERT approach), for generating embeddings of sentences or text. These transformer models are pre-trained on large corpora of text data using unsupervised learning techniques, such as masked language modeling and next sentence prediction.
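To make this concrete, here is a rough sketch of how a sentence embedding can be computed from a plain transformer by mean-pooling its token embeddings, which is approximately what many SentenceTransformers models do internally (assumes the `transformers` and `torch` packages; the checkpoint is the Hugging Face copy of the MiniLM model used below):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

inputs = tokenizer(["The quick brown fox"], padding=True, return_tensors="pt")
with torch.no_grad():
    token_embeddings = model(**inputs).last_hidden_state  # (batch, tokens, dim)

# Mean-pool over real tokens only, ignoring padding positions
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embedding.shape)  # torch.Size([1, 384])
```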

Usage

Install

```bash
pip install -U sentence-transformers
```

Use it

```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")

# Our sentences to encode
sentences = [
    "This framework generates embeddings for each input sentence",
    "Sentences are passed as a list of string.",
    "The quick brown fox jumps over the lazy dog."
]

# Sentences are encoded by calling model.encode()
embeddings = model.encode(sentences)

# Print the embeddings
for sentence, embedding in zip(sentences, embeddings):
    print("Sentence:", sentence)
    print("Embedding:", embedding)
    print("")

Pretrained Models

https://www.sbert.net/docs/pretrained_models.html
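Other checkpoints from that page can be swapped in by name. For example (a hedged sketch; multi-qa-MiniLM-L6-cos-v1 is a model tuned for semantic search listed on that page):

```python
from sentence_transformers import SentenceTransformer

# A checkpoint tuned for question-answering / semantic-search retrieval
qa_model = SentenceTransformer("multi-qa-MiniLM-L6-cos-v1")
query_embedding = qa_model.encode("How do I install sentence-transformers?")
```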

YunchaoYang commented 7 months ago

Example: an LLM-powered Chatbot

It consists of four high-level components:

1 Chat models

Because a chatbot is built on messages rather than raw text, chat models suit a conversational tone and message-based interface better than text-completion LLMs, although a raw LLM can power a chatbot as well.

Example


```python
from langchain_openai import ChatOpenAI
chat = ChatOpenAI(model="gpt-3.5-turbo-1106", temperature=0.2)

# invoke the chatmodel with output as `AIMessage`
from langchain_core.messages import HumanMessage
chat.invoke(
    [
        HumanMessage(
            content="Translate this sentence from English to French: I love programming."
        )
    ]
)

# pass the entire conversation history into the model

from langchain_core.messages import AIMessage

chat.invoke(
    [
        HumanMessage(
            content="Translate this sentence from English to French: I love programming."
        ),
        AIMessage(content="J'adore la programmation."),
        HumanMessage(content="What did you just say?"),
    ]
)
# The output is:
# AIMessage(content='I said "J\'adore la programmation," which means "I love programming" in French.')
```

2 Prompt Templates

To structure the prompt messages so the LLM can understand them better, a prompt template defines a set of user-defined message templates whose key components are input variables, while the reusable boilerplate text stays fixed in the template.

Example

```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | chat  # pipe the prompt into the previously defined chat model

# then we can invoke the chain again 
chain.invoke(
    {
        "messages": [
            HumanMessage(
                content="Translate this sentence from English to French: I love programming."
            ),
            AIMessage(content="J'adore la programmation."),
            HumanMessage(content="What did you just say?"),
        ],
    }
)
```

3 Chat history

Chat history allows the chatbot to remember past interactions and take them into account when responding to follow-up questions.

```python
from langchain.memory import ChatMessageHistory

demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message("hi!")

demo_ephemeral_chat_history.add_ai_message("whats up?")

demo_ephemeral_chat_history.messages  # inspect all stored messages

demo_ephemeral_chat_history.add_user_message(
    "Translate this sentence from English to French: I love programming."
)

response = chain.invoke({"messages": demo_ephemeral_chat_history.messages})
# AIMessage(content='The translation of "I love programming" in French is "J\'adore la programmation."')

demo_ephemeral_chat_history.add_ai_message(response)  # add the response to the chat history

demo_ephemeral_chat_history.add_user_message("What did you just say?") # ask another question

chain.invoke({"messages": demo_ephemeral_chat_history.messages}) # get AI response with chat history
# AIMessage(content='I said "J\'adore la programmation," which is the French translation for "I love programming."')
```

4 Retrievers

Retrievers supply a domain-specific knowledge dataset as context to augment the LLM's responses and provide sources, as used in RAG. A retriever is an interface that returns documents given an unstructured query; it is more general than a vector store.

Use the LangSmith documentation as source material and store it in a vector store for later retrieval:

```python
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://docs.smith.langchain.com/overview")
data = loader.load()
```

Next, we split it into smaller chunks that the LLM’s context window can handle:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
```

Then we embed and store those chunks in a vector database:

```python
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
```

And finally, let’s create a retriever from our initialized vectorstore:

```python
# k is the number of chunks to retrieve
retriever = vectorstore.as_retriever(k=4)
docs = retriever.invoke("how can langsmith help with testing?")
docs
```
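The retriever returns a list of LangChain Document objects. A minimal sketch for inspecting what came back:

```python
# Each retrieved item is a Document with page_content and metadata
for i, doc in enumerate(docs):
    print(f"--- chunk {i}, source: {doc.metadata.get('source', 'n/a')} ---")
    print(doc.page_content[:200])  # first 200 characters of the chunk
```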

Handling documents

Combine the above-mentioned components to create a RAG chain:

```python
from langchain.chains.combine_documents import create_stuff_documents_chain

chat = ChatOpenAI(model="gpt-3.5-turbo-1106")

question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user's questions based on the below context:\n\n{context}",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

document_chain = create_stuff_documents_chain(chat, question_answering_prompt)  # combine the prompt with the model
```

Invoke the document chain with the raw documents we retrieved above:

```python
from langchain.memory import ChatMessageHistory

demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message("how can langsmith help with testing?")

### Combine chat messages with document chain
document_chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
        "context": docs, 
    }
)
# 'LangSmith can assist with testing by providing the capability to quickly edit examples and add them to datasets. This allows for the expansion of evaluation sets or fine-tuning of a model for improved quality or reduced costs. Additionally, LangSmith simplifies the construction of small datasets by hand, providing a convenient way to rigorously test changes in the application.'
```

Creating a retrieval chain

Integrate the retriever so it fetches information relevant to the last message, then pass the retrieved context plus the previous messages to generate a final answer.

```python
from typing import Dict

from langchain_core.runnables import RunnablePassthrough

# get last message
def parse_retriever_input(params: Dict):
    return params["messages"][-1].content

## use the RunnablePassthrough.assign() method to pass intermediate steps through at each invocation.
retrieval_chain = RunnablePassthrough.assign(
    context=parse_retriever_input | retriever,
).assign(
    answer=document_chain,
)
```

Get a response to the last message, with context:

```python
response = retrieval_chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)
```

Continue the conversation by adding the response back to the chat history:

```python
demo_ephemeral_chat_history.add_ai_message(response["answer"])

demo_ephemeral_chat_history.add_user_message("tell me more about that!")

retrieval_chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    },
)
```

If you don’t want to return all the intermediate steps, you can define your retrieval chain to pipe directly into the document chain instead of ending with the final .assign() call:

```python
retrieval_chain_with_only_answer = (
    RunnablePassthrough.assign(
        context=parse_retriever_input | retriever,
    )
    | document_chain
)

retrieval_chain_with_only_answer.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    },
)
```

Query transformation

When we ask "tell me more about that!", the answer doesn't directly include information about testing because we are passing "tell me more about that!" verbatim as the query to the retriever. The output of the retrieval chain is still okay because the document chain can generate an answer from the chat history, but we could retrieve more relevant documents by taking the chat history into account.

To get around this problem, let’s add a query transformation step that resolves such indirect references in the input. What we do is wrap our old retriever with a chain that generates a transformed search query:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableBranch

# We need a prompt that we can pass into an LLM to generate a transformed search query

chat = ChatOpenAI(model="gpt-3.5-turbo-1106", temperature=0.2)

# this is the new template for query and transform
query_transform_prompt = ChatPromptTemplate.from_messages(
    [
        MessagesPlaceholder(variable_name="messages"),
        (
            "user",
            "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation. Only respond with the query, nothing else.", 
        ),
    ]
)

query_transforming_retriever_chain = RunnableBranch(
    (
        lambda x: len(x.get("messages", [])) == 1,
        # If only one message, then we just pass that message's content to retriever
        (lambda x: x["messages"][-1].content) | retriever,
    ),
    # If multiple messages, we pass the inputs to an LLM chain to transform the query, then pass the result to the retriever
    query_transform_prompt | chat | StrOutputParser() | retriever,
).with_config(run_name="chat_retriever_chain")
```

With this query_transforming_retriever_chain, the inputs are first passed to an LLM chain that transforms the query before it is sent to the retriever.

```python
document_chain = create_stuff_documents_chain(chat, question_answering_prompt)

conversational_retrieval_chain = RunnablePassthrough.assign(
    context=query_transforming_retriever_chain,
).assign(
    answer=document_chain,
)

demo_ephemeral_chat_history = ChatMessageHistory()
```

And finally, let’s invoke it:

```python
demo_ephemeral_chat_history.add_user_message("how can langsmith help with testing?")

response = conversational_retrieval_chain.invoke(
    {"messages": demo_ephemeral_chat_history.messages},
)

demo_ephemeral_chat_history.add_ai_message(response["answer"])

response
```


Ask a follow-up question:
```python
demo_ephemeral_chat_history.add_user_message("tell me more about that!")

conversational_retrieval_chain.invoke(
    {"messages": demo_ephemeral_chat_history.messages}
)
```

References:

https://python.langchain.com/docs/use_cases/chatbots/quickstart