Open xihajun opened 1 year ago
Date | Description | Course Materials | Events | Deadlines |
---|---|---|---|---|
03/29 | Lecture 1: Introduction. Computer vision overview; Historical context; Course logistics. [slides 1] [slides 2] | | | |
——— | Deep Learning Basics | |||
03/31 | Lecture 2: Image Classification with Linear Classifiers. The data-driven approach; K-nearest neighbor; Linear Classifiers; Algebraic / Visual / Geometric viewpoints; SVM and Softmax loss. [slides] | Image Classification Problem; Linear Classification | | |
04/01 | Python / Numpy Review Session [Colab] [Tutorial] | | 1:30-2:30pm PT | Assignment 1 out [handout] [colab] |
04/05 | Lecture 3: Regularization and Optimization. Regularization; Stochastic Gradient Descent; Momentum, AdaGrad, Adam; Learning rate schedules. [slides] | Optimization | | |
04/07 | Lecture 4: Neural Networks and Backpropagation. Multi-layer Perceptron; Backpropagation. [slides] | Backprop; Linear backprop example. Suggested Readings: Why Momentum Really Works; Derivatives notes; Efficient backprop. More backprop references: [1], [2], [3] | | |
04/08 | Backprop Review Session [slides] | | 1:30-2:30pm PT | |
——— | Perceiving and Understanding the Visual World | |||
04/12 | Lecture 5: Image Classification with CNNs. History; Higher-level representations, image features; Convolution and pooling. [slides] | Convolutional Networks | | |
04/13 | Final Project Overview and Guidelines [slides] | | 3:00-4:00pm PT | |
04/14 | Lecture 6: CNN Architectures. Batch Normalization; Transfer learning; AlexNet, VGG, GoogLeNet, ResNet. [slides] | AlexNet, VGGNet, GoogLeNet, ResNet | | |
04/15 | | | | Assignment 1 due |
04/18 | | | | Project proposal due |
04/19 | Lecture 7: Training Neural Networks. Activation functions; Data processing; Weight initialization; Hyperparameter tuning; Data augmentation. [slides] | Neural Networks, Parts 1, 2, 3. Suggested Readings: Stochastic Gradient Descent Tricks; Efficient Backprop; Practical Recommendations for Gradient-based Training; Deep Learning, Nature 2015; An Overview of Gradient Descent Algorithms; A Disciplined Approach to Neural Network Hyper-Parameters | | |
04/21 | Lecture 8: Visualizing and Understanding. Feature visualization and inversion; Adversarial examples; DeepDream and style transfer. [slides] | | | |
04/22 | PyTorch Review Session [slides] | | 1:30-2:30pm PT | |
04/26 | Lecture 9: Object Detection and Image Segmentation. Single-stage detectors; Two-stage detectors; Semantic/Instance/Panoptic segmentation. [slides] | FCN, R-CNN, Fast R-CNN, Faster R-CNN, YOLO | | |
04/28 | Lecture 10: Recurrent Neural Networks. RNN, LSTM, GRU; Language modeling; Image captioning; Sequence-to-sequence. [slides] | Suggested Readings: DL book RNN chapter; Understanding LSTM Networks | | |
04/29 | Object Detection & RNNs Review Session [slides] | | 2:30-3:30pm PT | |
05/02 | | | | Assignment 2 due |
05/03 | Lecture 11: Attention and Transformers. Self-Attention; Transformers. [slides] | Suggested Readings: Attention is All You Need [Original Transformers Paper]; Attention? Attention [Blog by Lilian Weng]; The Illustrated Transformer [Blog by Jay Alammar]; ViT: Transformers for Image Recognition [Paper] [Blog] [Video]; DETR: End-to-End Object Detection with Transformers [Paper] [Blog] [Video] | | |
05/05 | Lecture 12: Video Understanding. Video classification; 3D CNNs; Two-stream networks; Multimodal video understanding. [slides] | | | |
05/06 | Midterm Review Session | | 2:30-3:30pm PT | |
05/07 | | | | Project milestone due |
05/10 | In-Class Midterm | | 1:30-3:00pm | Assignment 3 out [handout] [colab] |
——— | Reconstructing and Interacting with the Visual World | |||
05/12 | Lecture 13: Generative Models. Supervised vs. Unsupervised learning; Pixel RNN, Pixel CNN; Variational Autoencoders; Generative Adversarial Networks. [slides] | Suggested Readings: Image GPT: Generative Pretraining From Pixels [Paper] [Blog] | | |
05/17 | Lecture 14: Self-supervised Learning. Pretext tasks; Contrastive learning; Multisensory supervision. [slides] | Suggested Readings: Lilian Weng Blog Post; DINO: Emerging Properties in Self-Supervised Vision Transformers [Paper] [Blog] [Video] | | |
05/19 | Lecture 15: Low-Level Vision (Guest Lecture by Prof. Jia Deng from Princeton University). Optical flow; Depth estimation; Stereo vision. [slides] | | | Assignment 3 due |
——— | Human-Centered Applications and Implications | |||
05/26 | Lecture 17: Human-Centered Artificial Intelligence. AI & healthcare | | | |
05/31 | Lecture 18: Fairness in Visual Recognition (Guest Lecture by Prof. Olga Russakovsky from Princeton University) | |||
06/02 | | | | Project final report due |
06/04 | Final Project Poster Session. Note: Only open to the Stanford community and invited guests. 3:30-6:30pm. Location: Alumni Center McCaw Hall/Ford Gardens. Click here for the logistics and expectations. | | | |
06/05 | | | | Project poster PDF due |