Indexes
Try out all the code in this Google Colab.
Installation
pip install langchain
LLMs
LangChain provides a generic interface for many different LLMs. Most of them are accessed through a hosted API, but you can also run local models.
pip install openai
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_TOKEN"
from langchain.llms import OpenAI
llm = OpenAI(temperature=0.9) # model_name="text-davinci-003"
text = "What would be a good company name for a company that makes colorful socks?"
print(llm(text))
pip install huggingface_hub
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "YOUR_HF_TOKEN"
from langchain import HuggingFaceHub
# https://huggingface.co/google/flan-t5-xl
llm = HuggingFaceHub(repo_id="google/flan-t5-xl", model_kwargs={"temperature":0, "max_length":64})
llm("translate English to German: How old are you?")
Prompt Templates
LangChain facilitates prompt management and optimization.
Normally, when you use an LLM in an application, you are not sending user input directly to the LLM. Instead, you need to take the user input and construct a prompt, and only then send that to the LLM.
llm("Can Barack Obama have a conversation with George Washington?")
A better prompt is this:
prompt = """Question: Can Barack Obama have a conversation with George Washington?
Let's think step by step.
Answer: """
llm(prompt)
This can be achieved with PromptTemplates:
from langchain import PromptTemplate
template = """Question: {question}
Let's think step by step.
Answer: """
prompt = PromptTemplate(template=template, input_variables=["question"])
prompt.format(question="Can Barack Obama have a conversation with George Washington?")
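To make the idea concrete without any dependencies, the following is a minimal plain-Python sketch of what filling a template amounts to. It is an illustration only, not LangChain's implementation; `TEMPLATE` and `format_prompt` are hypothetical names introduced here.

```python
# Plain-Python stand-in for a prompt template (illustrative only, not LangChain code).
TEMPLATE = """Question: {question}
Let's think step by step.
Answer: """


def format_prompt(template: str, **variables: str) -> str:
    """Fill the named placeholders in `template` with the given values."""
    return template.format(**variables)


filled = format_prompt(
    TEMPLATE,
    question="Can Barack Obama have a conversation with George Washington?",
)
print(filled)
```

The template is written once and reused for every user question, so the "think step by step" scaffolding stays consistent across calls.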
Chains
Combine LLMs and Prompts in multi-step workflows.
from langchain import LLMChain
llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "Can Barack Obama have a conversation with George Washington?"
print(llm_chain.run(question))
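Conceptually, the chain above is "format the prompt, then call the model". The sketch below shows that composition with a fake model so it runs offline. `fake_llm` and `make_chain` are hypothetical helpers, and a real LLMChain does more than this.

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model: reports the prompt length."""
    return f"(model reply to a {len(prompt)}-character prompt)"


def make_chain(template: str, llm: Callable[[str], str]) -> Callable[[str], str]:
    """Return a function that formats the template and passes the result to `llm`."""
    def run(question: str) -> str:
        prompt = template.format(question=question)
        return llm(prompt)
    return run


chain = make_chain("Question: {question}\nLet's think step by step.\nAnswer: ", fake_llm)
reply = chain("Can Barack Obama have a conversation with George Washington?")
print(reply)
```

Because the model is just a callable, swapping `fake_llm` for a real client would not change the chain's structure.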
Agents and Tools
Agents involve an LLM deciding which action to take, taking that action, seeing an observation, and repeating until done.
When used correctly, agents can be extremely powerful. In order to load agents, you should understand the following concepts:
Tool: A function that performs a specific duty, for example Google Search, a database lookup, a Python REPL, or another chain. See available Tools.
from langchain.agents import load_tools
from langchain.agents import initialize_agent
pip install wikipedia
from langchain.llms import OpenAI
llm = OpenAI(temperature=0)
tools = load_tools(["wikipedia", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("In what year was the film The Departed with Leonardo DiCaprio released? What is this year raised to the 0.43 power?")
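The decide, act, observe loop can be sketched in plain Python. Everything below is a toy under stated assumptions: `scripted_policy` replaces the LLM with hard-coded decisions, and the `lookup` and `power` tools are hypothetical stand-ins for the Wikipedia and math tools.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools: each maps an input string to an observation string.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup": lambda query: {"release year of The Departed": "2006"}.get(query, "unknown"),
    "power": lambda expr: str(float(expr.split("^")[0]) ** float(expr.split("^")[1])),
}


def scripted_policy(history: List[Tuple[str, str, str]]) -> Tuple[str, str]:
    """Stand-in for the LLM: pick the next (action, input) from past observations."""
    if not history:
        return "lookup", "release year of The Departed"
    if len(history) == 1:
        year = history[0][2]  # observation returned by the lookup step
        return "power", f"{year}^0.43"
    return "finish", history[-1][2]


def run_agent(max_steps: int = 5) -> str:
    """Repeat decide -> act -> observe until the policy says it is finished."""
    history: List[Tuple[str, str, str]] = []
    for _ in range(max_steps):
        action, action_input = scripted_policy(history)
        if action == "finish":
            return action_input
        observation = TOOLS[action](action_input)
        history.append((action, action_input, observation))
    raise RuntimeError("agent did not finish within max_steps")


answer = run_agent()
print(answer)
```

In a real agent the policy is the LLM itself, prompted with the tool descriptions and the history so far. The `max_steps` cap guards against a loop that never terminates.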
Memory
Add state to Chains and Agents.
Memory is the concept of persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.
from langchain import OpenAI, ConversationChain
llm = OpenAI(temperature=0)
conversation = ConversationChain(llm=llm, verbose=True)
conversation.predict(input="Hi there!")
conversation.predict(input="Can we talk about AI?")
conversation.predict(input="I'm interested in Reinforcement Learning.")
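One simple way to persist state between calls is to replay the whole transcript in every prompt. The sketch below assumes that strategy. `BufferedConversation` and `echo_llm` are hypothetical names, and LangChain's own memory classes offer more options than this.

```python
from typing import Callable, List


class BufferedConversation:
    """Toy conversation wrapper that replays the full history on every call."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm
        self.history: List[str] = []

    def predict(self, user_input: str) -> str:
        self.history.append(f"Human: {user_input}")
        prompt = "\n".join(self.history) + "\nAI:"
        reply = self.llm(prompt)
        self.history.append(f"AI: {reply}")
        return reply


def echo_llm(prompt: str) -> str:
    """Hypothetical model: reports how many human turns it can see."""
    turns = sum(line.startswith("Human:") for line in prompt.splitlines())
    return f"I can see {turns} human turn(s)."


conversation = BufferedConversation(echo_llm)
first = conversation.predict("Hi there!")
second = conversation.predict("Can we talk about AI?")
print(first)
print(second)
```

The second reply sees both human turns, which is the whole point: the model itself is stateless, and the wrapper supplies the state.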
Document Loaders
Combining language models with your own text data is a powerful way to differentiate them. The first step in doing this is to load the data into documents (i.e., some pieces of text). This module is aimed at making this easy.
from langchain.document_loaders import NotionDirectoryLoader
loader = NotionDirectoryLoader("Notion_DB")
docs = loader.load()
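A loader boils down to "read files, wrap each one as a document with some metadata". The sketch below assumes a directory of Markdown files. The `Document` dataclass and `load_directory` are hypothetical simplifications, not the library's classes.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class Document:
    """Toy document: a piece of text plus metadata about where it came from."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def load_directory(path: str, pattern: str = "*.md") -> List[Document]:
    """Read every file matching `pattern` under `path` into a Document."""
    docs = []
    for file in sorted(Path(path).rglob(pattern)):
        docs.append(Document(file.read_text(encoding="utf-8"), {"source": str(file)}))
    return docs


# Demonstrate on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "notes.md").write_text("# Notes\nHello from Notion.", encoding="utf-8")
    loaded = load_directory(tmp)

print(len(loaded), loaded[0].page_content)
```

Keeping the source path in the metadata makes it possible to cite where an answer came from later on.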
Indexes
Indexes refer to ways to structure documents so that LLMs can best interact with them. This module contains utility functions for working with documents.
Embeddings: An embedding is a numerical representation of a piece of information, for example, text, documents, images, audio, etc.
Text Splitters: When you want to deal with long pieces of text, it is necessary to split up that text into chunks.
Vectorstores: Vector databases store and index vector embeddings from NLP models to understand the meaning and context of strings of text, sentences, and whole documents for more accurate and relevant search results. See [available vectorstores](https://python.langchain.com/en/latest/modules/indexes/vectorstores.html).
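To illustrate the text-splitting idea, here is a minimal sketch that cuts a string into fixed-size character windows with an optional overlap. `split_text` is a hypothetical helper, and the library's splitters are more careful about where they cut (for example, at separators).

```python
from typing import List


def split_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> List[str]:
    """Cut `text` into windows of `chunk_size` characters sharing `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop early enough that the last window is not just a repeat of the overlap.
    stop = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, stop, step)]


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Overlap repeats a little context at each boundary, so a sentence cut in half still appears whole in at least one chunk.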
import requests

# Download the sample text used in the rest of this section.
url = "https://raw.githubusercontent.com/hwchase17/langchain/master/docs/modules/state_of_the_union.txt"
res = requests.get(url)
with open("state_of_the_union.txt", "w") as f:
    f.write(res.text)
Document Loader
from langchain.document_loaders import TextLoader
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()
Text Splitter
from langchain.text_splitter import CharacterTextSplitter
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
pip install sentence_transformers
Embeddings
from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings()
text = "This is a test document."
query_result = embeddings.embed_query(text)
doc_result = embeddings.embed_documents([text])
pip install faiss-cpu
from langchain.vectorstores import FAISS
db = FAISS.from_documents(docs, embeddings)
query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)
print(docs[0].page_content)
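Conceptually, a similarity search embeds the query and returns the stored texts whose vectors lie closest to it. The sketch below is a brute-force version using Euclidean distance, which is an assumption made for illustration. The 2-D vectors are hand-made toy values, and FAISS relies on optimized index structures rather than a full sort.

```python
import math
from typing import Dict, List, Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_search(
    query_vec: Sequence[float], store: Dict[str, Sequence[float]], k: int = 1
) -> List[str]:
    """Return the `k` stored texts whose vectors lie closest to `query_vec`."""
    ranked = sorted(store, key=lambda text: l2_distance(query_vec, store[text]))
    return ranked[:k]


# Hand-made 2-D vectors standing in for real embeddings (hypothetical values).
toy_store = {
    "The court nomination": [0.9, 0.1],
    "The economy": [0.1, 0.9],
    "Infrastructure spending": [0.2, 0.7],
}
top = similarity_search([1.0, 0.0], toy_store, k=1)
print(top)  # ['The court nomination']
```

Real embeddings have hundreds of dimensions, but the ranking logic is the same: smaller distance means more similar.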
Save and load:
db.save_local("faiss_index")
new_db = FAISS.load_local("faiss_index", embeddings)
docs = new_db.similarity_search(query)
print(docs[0].page_content)