manisnesan / til

collection of today i learned scripts

Practical LLMs #26

Open manisnesan opened 1 year ago

manisnesan commented 1 year ago

Landing Section

https://github.com/Aggregate-Intellect/practical-llms/blob/main/README.md

Update: Twitter thread, Slides and Recording available as of Mar 14, 2023

LLM Interfaces Workshop and Hackathon

https://lu.ma/llm-interfaces - Apr 28, 2023

Excellent talks

Considerations

Source : Pratik Pakodas - Substack

Courses

Check them out: deeplearning.ai/short-courses/

manisnesan commented 1 year ago

https://gist.github.com/joeddav/a11e5cc0850f0e540324177a53b547ae

Python wrapper around ChatGPT API

manisnesan commented 1 year ago

https://til.simonwillison.net/gpt3/chatgpt-api

manisnesan commented 1 year ago

Prompt Engineering Guide

Instruction: Tell the model what you want

Extract the name of the author from the quotation below.

“Some humans theorize that intelligent species go extinct before they can expand into outer space. If they're correct, then the hush of the night sky is the silence of the graveyard.” ― Ted Chiang, Exhalation

Output:

Ted Chiang

Completion: Induce the model to complete the beginning of what you want

“Some humans theorize that intelligent species go extinct before they can expand into outer space. If they're correct, then the hush of the night sky is the silence of the graveyard.” ― Ted Chiang, Exhalation

The author of this quote is

Output:

Ted Chiang

Demonstration: Show the model what you want, with either a few examples in the prompt (few-shot prompts) or many hundreds or thousands of examples in a fine-tuning training dataset.

Quote: “When the reasoning mind is forced to confront the impossible again and again, it has no choice but to adapt.” ― N.K. Jemisin, The Fifth Season Author: N.K. Jemisin

Quote: “Some humans theorize that intelligent species go extinct before they can expand into outer space. If they're correct, then the hush of the night sky is the silence of the graveyard.” ― Ted Chiang, Exhalation
Author:

Output:

Ted Chiang

Source: OpenAI Cookbook
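The three prompt styles above (instruction, completion, demonstration) can be sketched as plain prompt-building helpers. This is a minimal sketch: the function names and wording are this note's own, not from the cookbook.

```python
def instruction_prompt(task: str, text: str) -> str:
    """Instruction style: state the task, then supply the input."""
    return f"{task}\n\n{text}\nOutput:"

def completion_prompt(text: str, lead: str) -> str:
    """Completion style: end the prompt where the model should continue."""
    return f"{text}\n\n{lead}"

def few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Demonstration style: labeled examples, then the unlabeled query."""
    shots = "\n\n".join(f"Quote: {q}\nAuthor: {a}" for q, a in examples)
    return f"{shots}\n\nQuote: {query}\nAuthor:"

prompt = few_shot_prompt(
    [("“When the reasoning mind is forced to confront the impossible "
      "again and again, it has no choice but to adapt.” ― N.K. Jemisin",
      "N.K. Jemisin")],
    "“Some humans theorize that intelligent species go extinct before "
    "they can expand into outer space.” ― Ted Chiang",
)
```

Whichever style is used, the prompt ends exactly where the model should pick up: after "Output:", after the unfinished sentence, or after the bare "Author:" label.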

Tricks

Source: cohere.ai - Prompt Engineering

• Give clearer instructions – the more explicitly you articulate the desired task, its inputs, and its outputs, the better the results will be.

• Ask the model to answer as if it were an expert.

• Supply better examples. If you're demonstrating examples in your prompt, make sure that your examples are diverse and high quality.

• Prompt the model to explain its reasoning using a prefix like "Let's think step by step". This is known as chain-of-thought reasoning.

• Generate many outputs, and then use the model to pick the best one (an example of iterative refinement).

• If you're still having trouble, try splitting complex tasks into simpler subtasks.
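Two of these tricks (the chain-of-thought prefix and generate-many-then-pick-the-best) can be sketched in a few lines. Here `generate` is a stand-in for a sampled LLM call, and the scorer that parses a "quality" number out of the canned output is a toy stand-in for a real ranking model.

```python
import random

def generate(prompt: str, seed: int) -> str:
    """Stand-in for a sampled LLM call; output quality varies with the seed."""
    return f"answer (quality={random.Random(seed).random():.2f})"

def with_cot(prompt: str) -> str:
    """Trick: prompt the model to explain its reasoning first."""
    return prompt + "\n\nLet's think step by step."

def best_of_n(prompt: str, n: int, score) -> str:
    """Trick: sample n outputs and keep the one the scorer prefers."""
    return max((generate(prompt, seed=i) for i in range(n)), key=score)

best = best_of_n("Extract the author.", n=5,
                 score=lambda s: float(s.split("=")[1].rstrip(")")))
```

In a real system the scorer would itself be a model call ("pick the best of these five answers"), which is the iterative-refinement idea.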

Takeaways

Blogs

The two that immediately came to mind:

https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/
https://www.oneusefulthing.org/p/a-guide-to-prompting-ai-for-what

And this repo also gave me lots of ideas: https://github.com/f/awesome-chatgpt-prompts

Additional Resources for Prompt Engineering 1️⃣ Microsoft - Introduction to prompt engineering https://lnkd.in/dAfVftGG

2️⃣ Chip Huyen - Building LLM applications for production https://lnkd.in/dEiZqzpT

3️⃣ DAIR.AI - Prompt Engineering Guide https://lnkd.in/dXZdgM7e

4️⃣ PromptsLab - Awesome Prompt Engineering https://lnkd.in/dF9naw2a

5️⃣ Lilian Weng (OpenAI) - Prompt Engineering https://lnkd.in/dBTJRpZd

6️⃣ Microsoft - Prompt engineering techniques https://lnkd.in/dDzsJXhb

7️⃣ Cobus Greyling - Generative AI Prompt Pipelines https://lnkd.in/d5U5XkFd

8️⃣ Xavier (Xavi) Amatriain (LinkedIn) - Prompt Engineering 101 - Introduction and resources https://lnkd.in/drT8Z_Rq

9️⃣ Xavier (Xavi) Amatriain - Prompt Engineering: How to Talk to the AIs (course) https://lnkd.in/dGKqcWFj

manisnesan commented 1 year ago

Paper - Language models and cognitive automation for economic research

manisnesan commented 1 year ago

Retrieval

Problem: ChatGPT - it doesn't know about YOUR data

Solution:

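A minimal sketch of the retrieval idea: find the documents most similar to the question and hand them to the model as context. Word overlap stands in for real embedding similarity here, and the documents are made up for illustration.

```python
def similarity(query: str, doc: str) -> float:
    """Word-overlap score; a real system would compare embeddings."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q) or 1)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    return sorted(docs, key=lambda d: similarity(query, d), reverse=True)[:k]

docs = [
    "Our refund policy allows returns within 30 days.",
    "The office is closed on public holidays.",
    "Shipping takes 3 to 5 business days.",
]
context = retrieve("refund policy for returns", docs)
```

The retrieved context is then stuffed into the prompt, which is how the model gets to answer questions about YOUR data.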


manisnesan commented 1 year ago

Quoting Simon Willison analogy for language models

Think of language models like ChatGPT as a “calculator for words”

Source

manisnesan commented 1 year ago

https://arxiv.org/pdf/2304.03153v1.pdf Twitter

manisnesan commented 1 year ago

cookbook

manisnesan commented 1 year ago

Twitter thread - how do we overcome the 4096-token limit in OpenAI GPT requests?
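One common workaround is chunking: split the long input into pieces that each fit the context window, process each piece, then combine the results (e.g. summarize each chunk, then summarize the summaries). A rough sketch, using word count as a crude proxy for tokens; a real implementation would measure the budget with a tokenizer such as tiktoken.

```python
def chunk(text: str, max_words: int = 300) -> list[str]:
    """Split text into pieces under a rough word budget (token proxy)."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

pieces = chunk("lorem " * 700, max_words=300)  # 700 words -> 3 chunks
```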

manisnesan commented 1 year ago

Prompt Injection example to leak system prompts

manisnesan commented 1 year ago

Idea

To enhance an existing Large Language Model with custom knowledge, there are two main methods:

Pros & Cons

Prompt Engineering and Retrieval Augmented Generation

instead of simply sending the user's question, it'd be: "answer the following question given <context>: <question>"

Source : Integrating ChatGPT with internal KB and Q&A
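The reshaped prompt from the note above can be sketched as a template; the exact wording is illustrative, not from the linked source.

```python
def grounded_prompt(context: str, question: str) -> str:
    """Instead of sending the bare question, wrap it with retrieved context."""
    return (
        "Answer the following question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

p = grounded_prompt("Refunds are allowed within 30 days.",
                    "What is the refund window?")
```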

manisnesan commented 1 year ago

Demonstrate-Search-Predict

dsp - GitHub

manisnesan commented 1 year ago

Dolly - Open source Instruction tuned LLM

manisnesan commented 1 year ago

Related to https://github.com/manisnesan/til/issues/33

manisnesan commented 1 year ago

Simple PoC

  1. Take an @arxiv AI research paper
  2. With the help of @LangChainAI, embed the data and store the data in @trychroma
  3. Use @streamlit to create a chatbot to chat with the data

Similar idea with Pinecone embeddings: YouTube

manisnesan commented 1 year ago

Jeremy posed a challenge in this tweet: can this be written without LangChain, for comparison purposes?

manisnesan commented 1 year ago

Q&A over fsdl corpus

manisnesan commented 1 year ago

mood board:

https://www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators
https://arxiv.org/abs/2302.07842
https://www.pinecone.io/learn/langchain/
https://arxiv.org/abs/2211.04325

manisnesan commented 1 year ago

Why in-house LLMs: fine-tune an open-source LLM on one's own data

Source : Chip Huyen Tweet

manisnesan commented 1 year ago

"Help me build my intuition about ... "

A magic spell to learn anything

Some examples, how to:
- get a fat ass
- build a company
- build an iOS app using the share sheet api
- do advanced meditation techniques

Also add "with a memorable metaphor".

manisnesan commented 1 year ago

Expanding the Capabilities of Language Models with External Tools

Use case 1: Can an LLM answer the question (using APIChain): "Who was the CTO of Apple when its share price was at its lowest point in the last 10 years?"

Use case 2: "Find me flights from Toronto to Bangalore flying out on June 27 & returning on Aug 12, without transiting via the U.S. and without involving a self-transfer"

Lean Approach: User Query -> [sequence of external API/data store calls] -> LLM synthesizes the answer

Holy Grail: User Query -> LLM decomposes query -> calls appropriate external services where needed -> combines responses into a coherent answer in the requested format
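The lean approach can be sketched with mocks. Everything here is a canned stand-in: the market-data API, the people lookup, the names and dates (all fake), and the final format string that stands in for the LLM's synthesis step. Note the call sequence is hard-wired for this one query type, which is exactly what separates the lean approach from the holy grail.

```python
def share_price_low_date(ticker: str) -> str:
    """Mock market-data API (canned, illustrative data)."""
    return {"AAPL": "2013-04-19"}.get(ticker, "unknown")

def cto_in_year(company: str, year: str) -> str:
    """Mock people-data lookup (canned, illustrative data - not real facts)."""
    return {("Apple", "2013"): "Jane Doe"}.get((company, year), "unknown")

def answer_cto_query() -> str:
    """Lean approach: a fixed sequence of external calls, then synthesis."""
    date = share_price_low_date("AAPL")
    cto = cto_in_year("Apple", date[:4])
    return f"Apple's CTO at the 10-year share-price low ({date}) was {cto}."
```

In the holy grail version, the LLM itself would decide which of these services to call and in what order.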

OpenChatKit

LlamaIndex Features

LangChain Utilities

Task decomposition

Next

Resources

manisnesan commented 1 year ago

Using LangChain Workshop Solutions

Google Colab

Questions:

Ideas

Resources

manisnesan commented 1 year ago

LLMs in your own environment

Why

Challenges

Resources

manisnesan commented 1 year ago

This is based on the idea from KnowledgeOps talk by Amir Feizpour

Generate Tasks to achieve Objective -> Prioritize -> Execute -> Reflect on Performance -> Ask for user feedback/input
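The loop above can be sketched with stubs; each stub stands in for an LLM call, and the reflect/feedback step is left as a comment.

```python
def generate_tasks(objective: str) -> list[tuple[int, str]]:
    """Stub for an LLM call that decomposes the objective into
    (priority, task) pairs; lower number = higher priority."""
    return [(2, f"research {objective}"), (1, f"define scope of {objective}")]

def run_loop(objective: str) -> list[str]:
    log = []
    for _, task in sorted(generate_tasks(objective)):   # prioritize
        log.append(f"done: {task}")                     # execute (stub)
        # reflect: a real agent would score the result against the
        # objective, update the task list, and ask the user for feedback.
    return log

log = run_loop("lung cancer detection")
```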

Creating a lung cancer detection model using a vision transformer, with help from ChatGPT: https://chat.openai.com/c/c990f5d0-39bc-4a74-b845-585a20bdf29f

Example https://github.com/Significant-Gravitas/Auto-GPT/blob/master/autogpt/config/ai_config.py

manisnesan commented 1 year ago

Exploring the limits of today's LLMs by Suhas Pai

Related: https://github.com/manisnesan/til/issues/26#issuecomment-1527493329 and https://github.com/Aggregate-Intellect/practical-llms/blob/main/README.md#llmops-expanding-the-capabilities-of-language-models-with-external-tools

LLM Evaluation

Cutting edge

Tool Integration & Workflows

Resources

https://instructor-embedding.github.io/

manisnesan commented 1 year ago

Challenges

Solution

What

manisnesan commented 1 year ago

Peter Bull's (from DrivenData) notes from the Full Stack Deep Learning LLM Bootcamp:

"Our new baseline for all NLP tasks will be asking an LLM to do the task."

Harnessing LLMs: Part I

https://www.linkedin.com/pulse/harnessing-llms-part-i-peter-bull/

manisnesan commented 1 year ago

LLM - practical guide

manisnesan commented 1 year ago

Intro to Language safety - MLOpsLearners

manisnesan commented 1 year ago

LLM Observability

https://arize.com/blog/building-chatgpt-plugin/

manisnesan commented 1 year ago

Private GPT

manisnesan commented 1 year ago

Fine-tuning LLMs on a custom domain

https://armandolivares.tech/2023/04/22/how-to-fine-tune-a-model-like-chatgpt-in-spanish-using-alpaca-lora/

https://twitter.com/eugeneyan/status/1657412697678577671?s=46&t=aOEVGBVv9ICQLUYL4fQHlQ

manisnesan commented 1 year ago

LlamaIndex - local-only models: Google Colab

manisnesan commented 1 year ago

LLM University - Cohere

manisnesan commented 1 year ago

https://docs.gpt4all.io/gpt4all_chat.html

manisnesan commented 1 year ago

Fine-tuning the RedPajama OSS LLM

@johnrobinsn shows how to take a base model and instruction-tune it using the Alpaca dataset, including the steps required to prepare the data - but we can adapt this easily to our own data.

manisnesan commented 1 year ago

brexhq prompt engineering

manisnesan commented 1 year ago

John Berryman quote from Inside GitHub: working with the LLMs behind GitHub Copilot

“The secret is that we don’t just have to provide the model with the original file that the GitHub Copilot user is currently editing; instead we look for additional pieces of context inside the IDE that can hint the model towards better completions.”

He continues, “There have been several changes that helped get GitHub Copilot where it is today, but one of my favorite tricks was when we pulled similar texts in from the user’s neighboring editor tabs. That was a huge lift in our acceptance rate and characters retained.”
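The neighboring-tabs trick can be sketched with a simple word-overlap (Jaccard) ranking. Copilot's actual implementation is not public at this level of detail, so this is only an illustration of the idea: rank open tabs by similarity to the current file and prepend snippets from the best matches.

```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two text snippets."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / (len(sa | sb) or 1)

def build_context(current: str, open_tabs: dict[str, str], k: int = 1) -> str:
    """Prepend snippets from the k most similar open editor tabs."""
    ranked = sorted(open_tabs.items(),
                    key=lambda kv: jaccard(current, kv[1]), reverse=True)
    snippets = [f"# from {name}:\n{text}" for name, text in ranked[:k]]
    return "\n\n".join(snippets + [current])

ctx = build_context(
    "def parse_time(s): return s",
    {"utils.py": "def parse_date(s): return s", "readme.md": "hello world"},
)
```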

manisnesan commented 1 year ago

Generative vs Extractive https://haystack.deepset.ai/blog/generative-vs-extractive-models/

manisnesan commented 1 year ago

State of GPT - Karpathy talk notes (YouTube)

manisnesan commented 1 year ago

Build a ChatGPT-style app using LlamaIndex and MongoDB

manisnesan commented 1 year ago

LlamaIndex with local-only models

manisnesan commented 1 year ago

Wayde Gilliam's tips on retrieval LLMs

Example: You may ONLY use the provided documents to provide your answer, and you should always include a citation that is the document's Id where it is used in the answer.

manisnesan commented 1 year ago

MiniChain

https://srush-minichain.hf.space

https://srush.github.io/MiniChain/examples/qa/

https://github.com/explosion/spacy-llm

manisnesan commented 1 year ago

From @nirantk

How can you always keep your answers fresh and automatically updated?

Tutorial Colab Notebook here w/ Detailed Diagrams: bit.ly/updatedQAColab

Built with (@CohereAI + @OpenAI + @qdrant_engine) using @llama_index

manisnesan commented 1 year ago

Production

All the Hard Stuff Nobody Talks About when Building Products with LLMs

Phillip Carter shares lessons learned building LLM features for Honeycomb: hard-won knowledge from building a query assistant that turns human questions into Honeycomb query filters.

manisnesan commented 1 year ago

https://weaviate.io/blog/llms-and-search

manisnesan commented 12 months ago

Building LLM Applications for Production - Chip Huyen

Discusses MLOps challenges in building LLM applications in production, such as:

Consistency | Hallucination | Privacy | Data Drift | Context Length | Model Updates & Compatibility | LM On the Edge

https://home.mlops.community/home/videos/building-llm-applications-for-production

Yet to review the other talks that I missed

Part 1 can be found at this link (Apr 2023): https://home.mlops.community/home/collections/llms-in-production-conference-2023-04-13

manisnesan commented 11 months ago

h2ogpt

https://arxiv.org/pdf/2306.08161.pdf