LangChain Tutorial in Python #430

LangChain Architecture

Overview:

Installation

```bash
pip install langchain
```

LLMs

LangChain provides a generic interface for many different LLMs. Most of them are accessed through an API, but you can also run local models.

```bash
pip install openai
```

```python
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_TOKEN"

from langchain.llms import OpenAI

llm = OpenAI(temperature=0.9)  # model_name="text-davinci-003"
text = "What would be a good company name for a company that makes colorful socks?"
print(llm(text))
```
```bash
pip install huggingface_hub
```

```python
import os
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "YOUR_HF_TOKEN"

from langchain import HuggingFaceHub

# https://huggingface.co/google/flan-t5-xl
llm = HuggingFaceHub(repo_id="google/flan-t5-xl", model_kwargs={"temperature": 0, "max_length": 64})

llm("translate English to German: How old are you?")
```

Prompt Templates

LangChain facilitates prompt management and optimization.

Normally, when you use an LLM in an application, you are not sending user input directly to the LLM. Instead, you need to take the user input and construct a prompt, and only then send that to the LLM.

llm("Can Barack Obama have a conversation with George Washington?")

A better prompt is this:

prompt = """Question: Can Barack Obama have a conversation with George Washington?

Let's think step by step.

Answer: """
llm(prompt)

This can be achieved with PromptTemplates:

```python
from langchain import PromptTemplate

template = """Question: {question}

Let's think step by step.

Answer: """

prompt = PromptTemplate(template=template, input_variables=["question"])
prompt.format(question="Can Barack Obama have a conversation with George Washington?")
```

Chains

Combine LLMs and Prompts in multi-step workflows.

```python
from langchain import LLMChain

llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "Can Barack Obama have a conversation with George Washington?"

print(llm_chain.run(question))
```
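The same idea extends to multiple steps by chaining chains together. Below is a minimal sketch, assuming the legacy SimpleSequentialChain API from the same LangChain version; the two prompt templates and the "colorful socks" input are only illustrative:

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain

llm = OpenAI(temperature=0.9)

# first chain: propose a company name for a product
name_prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
name_chain = LLMChain(llm=llm, prompt=name_prompt)

# second chain: write a slogan for that company name
slogan_prompt = PromptTemplate(
    input_variables=["company_name"],
    template="Write a catchy slogan for the company {company_name}.",
)
slogan_chain = LLMChain(llm=llm, prompt=slogan_prompt)

# the output of name_chain is passed as the input of slogan_chain
overall_chain = SimpleSequentialChain(chains=[name_chain, slogan_chain], verbose=True)
print(overall_chain.run("colorful socks"))
```

SimpleSequentialChain feeds the single output of each chain in as the single input of the next, which is why both prompts take exactly one variable.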

Agents and Tools

Agents involve an LLM making decisions about which actions to take, taking that action, seeing an observation, and repeating that until done.

When used correctly, agents can be extremely powerful. In order to load agents, you should understand the following concepts:

- Tool: a function that performs a specific duty, such as a Google search, a database lookup, or a Python REPL.
- LLM: the language model powering the agent.
- Agent: the agent type to use, referenced by name.
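A minimal sketch of loading and running an agent, assuming the legacy load_tools / initialize_agent API from the same LangChain version; the tool names and the question are illustrative:

```python
# pip install wikipedia  (required by the "wikipedia" tool)
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent

llm = OpenAI(temperature=0)

# two built-in tools: Wikipedia lookup and an LLM-backed calculator
tools = load_tools(["wikipedia", "llm-math"], llm=llm)

# "zero-shot-react-description" chooses a tool based purely on its description
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

agent.run(
    "In what year was the film Departed with Leonardo DiCaprio released? "
    "What is this year raised to the 0.43 power?"
)
```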

Memory

Memory is the concept of persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.

```python
from langchain import OpenAI, ConversationChain

llm = OpenAI(temperature=0)
conversation = ConversationChain(llm=llm, verbose=True)

conversation.predict(input="Hi there!")
conversation.predict(input="Can we talk about AI?")
conversation.predict(input="I'm interested in Reinforcement Learning.")
```

Document Loaders

Combining language models with your own text data is a powerful way to differentiate them. The first step in doing this is to load the data into documents (i.e., some pieces of text). This module is aimed at making this easy.

See all available Document Loaders.

```python
from langchain.document_loaders import NotionDirectoryLoader

loader = NotionDirectoryLoader("Notion_DB")

docs = loader.load()
```

Indexes

Indexes refer to ways to structure documents so that LLMs can best interact with them. This module contains utility functions for working with documents. As an example, download a sample text file to index:

url = "https://raw.githubusercontent.com/hwchase17/langchain/master/docs/modules/state_of_the_union.txt" res = requests.get(url) with open("state_of_the_union.txt", "w") as f: f.write(res.text)

Document Loader

```python
from langchain.document_loaders import TextLoader

loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()
```

Text Splitter

```python
from langchain.text_splitter import CharacterTextSplitter

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
```

Embeddings

```bash
pip install sentence_transformers
```

```python
from langchain.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings()

text = "This is a test document."

query_result = embeddings.embed_query(text)
doc_result = embeddings.embed_documents([text])
```

Vector Store

```bash
pip install faiss-cpu
```

```python
from langchain.vectorstores import FAISS

db = FAISS.from_documents(docs, embeddings)
```

query = "What did the president say about Ketanji Brown Jackson" docs = db.similarity_search(query) print(docs[0].page_content)

Save and load:

```python
db.save_local("faiss_index")

new_db = FAISS.load_local("faiss_index", embeddings)
docs = new_db.similarity_search(query)
print(docs[0].page_content)
```
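To go beyond raw similarity search and have an LLM answer questions over the indexed documents, the vector store can be plugged into a chain. A minimal sketch, assuming the legacy RetrievalQA API and an OPENAI_API_KEY in the environment, reusing the FAISS db created above:

```python
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA

llm = OpenAI(temperature=0)

# "stuff" simply stuffs the retrieved chunks into a single prompt
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(),
)

print(qa.run("What did the president say about Ketanji Brown Jackson"))
```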


## Reference
- [https://www.python-engineer.com/posts/langchain-crash-course/](https://www.python-engineer.com/posts/langchain-crash-course/)
- [https://medium.com/@gil.fernandes/langchain-chat-with-custom-tools-functions-and-memory-e34daa331aa7](https://medium.com/@gil.fernandes/langchain-chat-with-custom-tools-functions-and-memory-e34daa331aa7)