
Awesome-LLM: a curated list of Large Language Model resources

Awesome-LLM

🔥 Large Language Models (LLMs) have taken the NLP community, the AI community, and the whole world by storm. Here is a curated list of papers about large language models, especially those relating to ChatGPT. It also covers frameworks for LLM training, tools for LLM deployment, courses and tutorials about LLMs, and publicly available LLM checkpoints and APIs.

Trending LLM Projects

Table of Contents

Milestone Papers

| Date    | Keywords            | Institute            | Paper |
| :------ | :------------------ | :------------------- | :---- |
| 2017-06 | Transformers        | Google               | Attention Is All You Need |
| 2018-06 | GPT 1.0             | OpenAI               | Improving Language Understanding by Generative Pre-Training |
| 2018-10 | BERT                | Google               | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding |
| 2019-02 | GPT 2.0             | OpenAI               | Language Models are Unsupervised Multitask Learners |
| 2019-09 | Megatron-LM         | NVIDIA               | Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism |
| 2019-10 | T5                  | Google               | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer |
| 2019-10 | ZeRO                | Microsoft            | ZeRO: Memory Optimizations Toward Training Trillion Parameter Models |
| 2020-01 | Scaling Law         | OpenAI               | Scaling Laws for Neural Language Models |
| 2020-05 | GPT 3.0             | OpenAI               | Language Models are Few-Shot Learners |
| 2021-01 | Switch Transformers | Google               | Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity |
| 2021-08 | Codex               | OpenAI               | Evaluating Large Language Models Trained on Code |
| 2021-08 | Foundation Models   | Stanford             | On the Opportunities and Risks of Foundation Models |
| 2021-09 | FLAN                | Google               | Finetuned Language Models are Zero-Shot Learners |
| 2021-10 | T0                  | HuggingFace et al.   | Multitask Prompted Training Enables Zero-Shot Task Generalization |
| 2021-12 | GLaM                | Google               | GLaM: Efficient Scaling of Language Models with Mixture-of-Experts |
| 2021-12 | WebGPT              | OpenAI               | WebGPT: Browser-assisted question-answering with human feedback |
| 2021-12 | Retro               | DeepMind             | Improving language models by retrieving from trillions of tokens |
| 2021-12 | Gopher              | DeepMind             | Scaling Language Models: Methods, Analysis & Insights from Training Gopher |
| 2022-01 | COT                 | Google               | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models |
| 2022-01 | LaMDA               | Google               | LaMDA: Language Models for Dialog Applications |
| 2022-01 | Megatron-Turing NLG | Microsoft&NVIDIA     | Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model |
| 2022-03 | InstructGPT         | OpenAI               | Training language models to follow instructions with human feedback |
| 2022-04 | PaLM                | Google               | PaLM: Scaling Language Modeling with Pathways |
| 2022-04 | Chinchilla          | DeepMind             | An empirical analysis of compute-optimal large language model training |
| 2022-05 | OPT                 | Meta                 | OPT: Open Pre-trained Transformer Language Models |
| 2022-05 | UL2                 | Google               | Unifying Language Learning Paradigms |
| 2022-06 | Minerva             | Google               | Solving Quantitative Reasoning Problems with Language Models |
| 2022-06 | Emergent Abilities  | Google               | Emergent Abilities of Large Language Models |
| 2022-06 | BIG-bench           | Google               | Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models |
| 2022-06 | METALM              | Microsoft            | Language Models are General-Purpose Interfaces |
| 2022-09 | Sparrow             | DeepMind             | Improving alignment of dialogue agents via targeted human judgements |
| 2022-10 | Flan-T5/PaLM        | Google               | Scaling Instruction-Finetuned Language Models |
| 2022-10 | GLM-130B            | Tsinghua             | GLM-130B: An Open Bilingual Pre-trained Model |
| 2022-11 | HELM                | Stanford             | Holistic Evaluation of Language Models |
| 2022-11 | BLOOM               | BigScience           | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model |
| 2022-11 | Galactica           | Meta                 | Galactica: A Large Language Model for Science |
| 2022-12 | OPT-IML             | Meta                 | OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization |
| 2023-01 | Flan 2022 Collection| Google               | The Flan Collection: Designing Data and Methods for Effective Instruction Tuning |
| 2023-02 | LLaMA               | Meta                 | LLaMA: Open and Efficient Foundation Language Models |
| 2023-02 | Kosmos-1            | Microsoft            | Language Is Not All You Need: Aligning Perception with Language Models |
| 2023-03 | LRU                 | DeepMind             | Resurrecting Recurrent Neural Networks for Long Sequences |
| 2023-03 | PaLM-E              | Google               | PaLM-E: An Embodied Multimodal Language Model |
| 2023-03 | GPT 4               | OpenAI               | GPT-4 Technical Report |
| 2023-04 | LLaVA               | UW–Madison&Microsoft | Visual Instruction Tuning |
| 2023-04 | Pythia              | EleutherAI et al.    | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling |
| 2023-05 | Dromedary           | CMU et al.           | Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision |
| 2023-05 | PaLM 2              | Google               | PaLM 2 Technical Report |
| 2023-05 | RWKV                | Bo Peng              | RWKV: Reinventing RNNs for the Transformer Era |
| 2023-05 | DPO                 | Stanford             | Direct Preference Optimization: Your Language Model is Secretly a Reward Model |
| 2023-05 | ToT                 | Google&Princeton     | Tree of Thoughts: Deliberate Problem Solving with Large Language Models |
| 2023-07 | LLaMA2              | Meta                 | Llama 2: Open Foundation and Fine-Tuned Chat Models |
| 2023-10 | Mistral 7B          | Mistral              | Mistral 7B |
| 2023-12 | Mamba               | CMU&Princeton        | Mamba: Linear-Time Sequence Modeling with Selective State Spaces |
| 2024-03 | Jamba               | AI21 Labs            | Jamba: A Hybrid Transformer-Mamba Language Model |
| 2024-05 | DeepSeek-V2         | DeepSeek             | DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model |
| 2024-05 | Mamba2              | CMU&Princeton        | Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality |
| 2024-07 | Llama3              | Meta                 | The Llama 3 Herd of Models |
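
As one illustration of how compact some of the milestone ideas above are, the DPO entry (2023-05) replaces RLHF's separate reward model and RL loop with a single classification-style loss over preference pairs. Roughly, for a policy $\pi_\theta$, a frozen reference model $\pi_{\mathrm{ref}}$, preferred/dispreferred completions $y_w, y_l$, and a strength parameter $\beta$, the objective sketched in that paper is:

```math
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}})
= -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
\left[
\log \sigma\!\left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
- \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\right)
\right]
```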

Other Papers

If you're interested in the field of LLMs, you may find the above list of milestone papers helpful for exploring its history and the state of the art. However, each research direction within LLMs offers its own set of insights and contributions, which are essential to understanding the field as a whole. For detailed lists of papers in the various subfields, please refer to the following links:

LLM Leaderboard

Open LLM

LLM Data

LLM Evaluation

LLM Training Frameworks

LLM Deployment

Reference: llm-inference-solutions

  • SGLang - SGLang is a fast serving framework for large language models and vision language models.
  • vLLM - A high-throughput and memory-efficient inference and serving engine for LLMs (see the client sketch after this list).
  • TGI - a toolkit for deploying and serving Large Language Models (LLMs).
  • exllama - A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
  • llama.cpp - LLM inference in C/C++.
  • ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models.
  • Langfuse - Open-source LLM engineering platform 🪢 with tracing, evaluations, prompt management, and a playground.
  • FastChat - A distributed multi-model LLM serving system with web UI and OpenAI-compatible RESTful APIs.
  • mistral.rs - Blazingly fast LLM inference.
  • MindSQL - A Python package for text-to-SQL with self-hosting capabilities and RESTful APIs, compatible with proprietary as well as open-source LLMs.
  • SkyPilot - Run LLMs and batch jobs on any cloud. Get maximum cost savings, highest GPU availability, and managed execution -- all with a simple interface.
  • Haystack - an open-source NLP framework that allows you to use LLMs and transformer-based models from Hugging Face, OpenAI and Cohere to interact with your own data.
  • Sidekick - Data integration platform for LLMs.
  • QA-Pilot - An interactive chat project that leverages Ollama/OpenAI/MistralAI LLMs for rapid understanding and navigation of GitHub code repositories or compressed file resources.
  • Shell-Pilot - Interact with LLMs using Ollama models (or OpenAI, Mistral AI) via pure shell scripts on your Linux (or macOS) system, enhancing intelligent system management without any dependencies.
  • LangChain - Building applications with LLMs through composability
  • Floom - AI gateway and marketplace for developers; enables streamlined integration of AI features into products.
  • Swiss Army Llama - Comprehensive set of tools for working with local LLMs for various tasks.
  • LiteChain - Lightweight alternative to LangChain for composing LLMs
  • magentic - Seamlessly integrate LLMs as Python functions
  • wechat-chatgpt - Use ChatGPT on WeChat via wechaty.
  • promptfoo - Test your prompts. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality.
  • Agenta - Easily build, version, evaluate and deploy your LLM-powered apps.
  • Serge - a chat interface crafted with llama.cpp for running Alpaca models. No API keys, entirely self-hosted!
  • Langroid - Harness LLMs with Multi-Agent Programming
  • Embedchain - Framework to create ChatGPT-like bots over your dataset.
  • Opik - Confidently evaluate, test, and ship LLM applications with a suite of observability tools to calibrate language model outputs across your dev and production lifecycle.
  • IntelliServer - simplifies the evaluation of LLMs by providing a unified microservice to access and test multiple AI models.
  • OpenLLM - Fine-tune, serve, deploy, and monitor any open-source LLMs in production. Used in production at BentoML for LLM-based applications.
  • DeepSpeed-MII - MII enables low-latency and high-throughput inference, similar to vLLM, powered by DeepSpeed.
  • Text-Embeddings-Inference - Inference for text embeddings in Rust (HFOIL license).
  • Infinity - Inference for text embeddings in Python.
  • TensorRT-LLM - NVIDIA framework for LLM inference.
  • FasterTransformer - NVIDIA framework for LLM inference (transitioned to TensorRT-LLM).
  • Flash-Attention - A method designed to enhance the efficiency of Transformer models
  • Langchain-Chatchat - Formerly langchain-ChatGLM, local knowledge based LLM (like ChatGLM) QA app with langchain.
  • Search with Lepton - Build your own conversational search engine using less than 500 lines of code by LeptonAI.
  • Robocorp - Create, deploy and operate Actions using Python anywhere to enhance your AI agents and assistants. Batteries included with an extensive set of libraries, helpers and logging.
  • LMDeploy - A high-throughput and low-latency inference and serving framework for LLMs and VLMs.
  • Tune Studio - Playground for devs to finetune & deploy LLMs
  • LLocalSearch - Locally running websearch using LLM chains
  • AI Gateway - Gateway streamlines requests to 100+ open and closed-source models with a unified API. It is production-ready, with support for caching, fallbacks, retries, timeouts, and load balancing, and can be edge-deployed for minimum latency.
  • talkd.ai dialog - Simple API for deploying any RAG or LLM that you want, with plugin support.
  • Wllama - WebAssembly binding for llama.cpp - Enabling in-browser LLM inference
  • GPUStack - An open-source GPU cluster manager for running LLMs
  • MNN-LLM - An on-device inference framework, including LLM inference on device (mobile phone/PC/IoT).
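
Several of the servers above (for example vLLM, FastChat, LMDeploy, and ollama) expose OpenAI-compatible HTTP endpoints, so one client can talk to whichever backend you deploy. Below is a minimal sketch using the official `openai` Python client against a locally running server; the base URL, port, and model id are assumptions, so substitute whatever your chosen framework actually serves.

```python
# Minimal sketch, not tied to any single project above: most OpenAI-compatible
# local servers accept the standard chat-completions request shown here.
# Assumptions: a server is already running at http://localhost:8000/v1 and has
# loaded a model whose id you pass as `model`; both values are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # e.g. vLLM's default port; ollama uses http://localhost:11434/v1
    api_key="EMPTY",                      # local servers typically accept any placeholder key
)

response = client.chat.completions.create(
    model="your-model-id",  # placeholder: use the model name your server reports
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Give a one-sentence summary of what an LLM serving engine does."},
    ],
    temperature=0.2,
    max_tokens=128,
)

print(response.choices[0].message.content)
```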

LLM Applications

LLM Tutorials and Courses

LLM Books

Great thoughts about LLM

Miscellaneous

Contributing

This is an active repository and your contributions are always welcome!

I will keep some pull requests open if I'm not sure whether they are awesome for LLM; you can vote for them by adding 👍 to them.


If you have any questions about this opinionated list, do not hesitate to contact me at chengxin1998@stu.pku.edu.cn.
