CS 25 - Notes

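Stanford CS 25 | Transformers United course notes: "the first lecture series devoted to Transformers, spanning NLP, CV, and RL"

CS 25: Transformers United is a frontier-topics course in Stanford University's computer science curriculum.

The course focuses on introducing Transformers and bringing together their use across ML, CV, NLP, biology, and other communities. It also discusses the latest breakthroughs and ideas around Transformers in order to spur cross-disciplinary collaborative research.

CS 25 invites leading Transformer researchers from different fields to give guest lectures. Speakers include Geoff Hinton, often called the "godfather of AI"; Mark Chen, a research scientist at OpenAI, who covers the Transformer-based GPT-3 and Codex; Lucas Beyer, a scientist at Google Brain, who covers applications of Transformers in vision; and Aditya Grover, a scientist at Meta FAIR, who covers Transformers in RL and universal compute engines.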

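Rationale

We hope this project will help more people discover this course and learn from it more effectively. Good knowledge deserves to be spread more widely, so the contributors share the following goals for the project:

1) Make more people aware of CS25, a highly diverse course with high-quality content.

2) Lower the barrier to entry so that more people can study the course, which requires clear annotations.

3) Offer more reflection in the annotated notes, since for some material the ideas behind the knowledge matter as much as the knowledge itself.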

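Target Audience

The course focuses on introducing Transformers and bringing together their use across ML, CV, NLP, biology, and other communities, and it also discusses the latest breakthroughs and ideas around Transformers in order to spur cross-disciplinary collaborative research.

Anyone who wants to take this course should first have a grasp of deep learning fundamentals (an understanding of attention mechanisms is required) or have already completed CS224N, CS231N, or CS230.

The audience is therefore readers who have a basic understanding of deep learning and want to learn more about Transformer-related research.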

Project Highlights

A frontier lecture series on Transformers and their cross-disciplinary applications, taught by many leading researchers.

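Project Plan

1. Table of contents (to at least the second level if there are multiple levels)

The overall plan covers all content from CS25 V1 through V4 (V4 is still in progress).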

Title (V1)

Introduction to Transformers (same as the first lecture of V2)

Transformers in Language: GPT-3, Codex. Speaker: Mark Chen (OpenAI)

Applications in Vision. Speaker: Lucas Beyer (Google Brain)

Transformers in RL & Universal Compute Engines. Speaker: Aditya Grover (FAIR)

Scaling transformers. Speaker: Barret Zoph (Google Brain) with Irwan Bello and Liam Fedus

Perceiver: Arbitrary IO with transformers. Speaker: Andrew Jaegle (DeepMind)

Self Attention & Non-Parametric Transformers. Speaker: Aidan Gomez (University of Oxford)

GLOM: Representing part-whole hierarchies in a neural network. Speaker: Geoffrey Hinton (UoT)

Interpretability with transformers. Speaker: Chris Olah (AnthropicAI)

Transformers for Applications in Audio, Speech and Music: From Language Modeling to Understanding to Synthesis. Speaker: Prateek Verma (Stanford)

Title (V2)

Introduction to Transformers (same as the first lecture of V1). Speaker: Andrej Karpathy

Language and Human Alignment. Speaker: Jan Leike (OpenAI)

Emergent Abilities and Scaling in LLMs. Speaker: Jason Wei (Google Brain)

Strategic Games. Speaker: Noam Brown (FAIR)

Robotics and Imitation Learning. Speaker: Ted Xiao (Google Brain)

Common Sense Reasoning. Speaker: Yejin Choi (U. Washington / Allen Institute for AI)

Biomedical Transformers. Speaker: Vivek Natarajan (Google Health AI)

In-Context Learning & Faithful Reasoning. Speakers: Stephanie Chan (DeepMind) & Antonia Creswell (DeepMind)

Neuroscience-Inspired Artificial Intelligence. Speakers: Trenton Bricken (Harvard/Redwood Center for Theoretical Neuroscience/Anthropic) & Will Dorrell (UCL Gatsby Computational Neuroscience Unit/Stanford)

Title (V3)

Llama 2: Open Foundation and Fine-Tuned Chat Models. Speaker: Sharan Narang, Meta AI

Low-level Embodied Intelligence with Foundation Models. Speaker: Fei Xia, Google Deepmind

Generalist Agents in Open-Ended Worlds. Speaker: Jim Fan, NVIDIA AI

Recipe for Training Helpful Chatbots. Speaker: Nazneen Rajani, HuggingFace

How I Learned to Stop Worrying and Love the Transformer. Speaker: Ashish Vaswani

No Language Left Behind: Scaling Human-Centered Machine Translation. Speaker: Angela Fan, Meta AI

Going Beyond LLMs: Agents, Emergent Abilities, Intermediate-Guided Reasoning, BabyLM. Speaker: Instructors

Retrieval Augmented Language Models. Speaker: Douwe Kiela, Contextual AI

Title (V4)

Instructor Lecture: Overview of Transformers [In-Person]. Speakers: Steven Feng, Div Garg, Emily Bunnapradist, Seonghee Lee

Intuitions on Language Models (Jason) [In-Person]; How did we end up here? Early history and evolution of Transformer (Hyung Won) [In-Person]. Speakers: Jason Wei & Hyung Won Chung, OpenAI

TBD. Speaker: Nathan Lambert, Allen Institute for AI (AI2)

Demystifying Mixtral of Experts [Virtual/Zoom]. Speaker: Albert Jiang, Mistral AI / University of Cambridge

Developing precision language models from self-attentive feed-forward units, and applying them in edge computing scenarios as untrained language models prompted to predict symbolic switches (U-LaMPS). Speaker: Jake Williams, Drexel University

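2. Owners of each chapter

Not yet fully determined.

3. Estimated completion date for each chapter

All content is to be completed by the end of June, with the chapters progressing in parallel.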

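4. Foreseeable difficulties

1) The material is difficult overall, and some lectures go deep enough that summarizing them well requires a solid background in the relevant area. Mitigation: iterate on the notes step by step after the first version is complete.

2) Progress may fall behind schedule. Mitigation: keep tight control of milestones.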

Project Lead

GitHub: https://github.com/mlw67

WeChat: mltheory

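Note: After a project proposal is submitted, DOPMC members will give their review feedback within 7 days. If no objection is raised within 7 days, the project is approved by default.

[X] I have read and understood the note above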


cs25-notes #236

Have you read and agreed to the "Datawhale Open Source Project Guide"?

Have you read and agreed to the "Datawhale Open Source Project Code of Conduct"?

Project Introduction