Introduction to machine learning

This module will occur on the third day and will serve as the students' crash-course introduction to machine learning (ML) concepts.
Learning objective
Students will learn what ML actually is (its definition), its key concepts, and its history. They will learn the difference between AI, ML, and deep learning (deep learning is a subset of ML, which is in turn a subset of AI). Furthermore, they will come away with a basic working understanding of supervised ML, particularly via basic algorithms such as linear and logistic regression, and of the metrics used to evaluate different supervised ML algorithms.
Content to cover
What is ML?
Definition, key concepts.
History.
Difference between ML, AI and deep learning.
Contemporary ML: large language and foundation models.
Types of machine learning: supervised, unsupervised and reinforcement.
Supervised ML basics
Features (inputs) and labels (targets). The goal of supervised ML is to learn a mapping from features to labels.
Two types: regression and classification. What's the difference, and why does it matter?
How do we evaluate the performance of supervised ML algorithms? For regression, we generally use continuous loss functions such as the mean squared error (students may already be familiar with the root mean squared error). For classification, common metrics include accuracy, precision, recall, the F1 score, ROC curve analysis, or direct inspection of the confusion matrix (a small sketch follows this list). This is only a primer; details will come in a later module.
How do we fit ML models (i.e., what is the optimizer)? Only a conceptual overview for now.
Training, testing and cross-validation.
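A minimal sketch of the kind of metrics primer this could end with, assuming scikit-learn and hand-made toy arrays (the values and variable names are illustrative, not part of the module materials):

```python
import numpy as np
from sklearn.metrics import (mean_squared_error, accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

# Regression: compare continuous predictions to continuous targets.
y_true = np.array([2.0, 3.5, 4.0, 5.5])
y_pred = np.array([2.1, 3.0, 4.4, 5.0])
mse = mean_squared_error(y_true, y_pred)
print("MSE:", mse, "RMSE:", np.sqrt(mse))

# Classification: compare predicted class labels to true class labels.
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1])
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred))
print("confusion matrix:\n", confusion_matrix(y_true, y_pred))
```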
Practical gradient-based supervised ML regression
Go over an example of linear regression using scikit-learn. Demonstrate how it works, except for the fitting procedure itself: show plots of how the algorithm fits the data, how we split the data, how to analyze the residuals, and so on (sketched below).
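A minimal sketch of what the scikit-learn walkthrough could look like, assuming a synthetic one-feature dataset (the data and variable names are illustrative, not prescribed by the module):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: one feature with a noisy linear relationship to the target.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.5 * X.ravel() + 4.0 + rng.normal(0, 2.0, size=200)

# Hold out a test set so we can check generalization, not just the training fit.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit the model; the optimizer stays hidden inside .fit(), which is the point here.
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)
residuals = y_test - y_pred

# Plot the fitted line against the data, and the residuals on the test set.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(X_train, y_train, s=10, label="train")
ax1.scatter(X_test, y_test, s=10, label="test")
xs = np.linspace(0, 10, 100).reshape(-1, 1)
ax1.plot(xs, model.predict(xs), color="black", label="fit")
ax1.set_xlabel("feature"); ax1.set_ylabel("target"); ax1.legend()
ax2.scatter(y_pred, residuals, s=10)
ax2.axhline(0.0, color="black")
ax2.set_xlabel("predicted"); ax2.set_ylabel("residual")
plt.tight_layout()
plt.show()
```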
Dissect gradient-based supervised ML regression
Introduction to gradient descent. Redo the first example from scratch using custom code, and show how the model fit changes as gradient descent proceeds (sketched below).
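A minimal from-scratch sketch of gradient descent on the same kind of one-feature linear regression; the learning rate, iteration count, and synthetic data are illustrative choices:

```python
import numpy as np

# Synthetic data: y is roughly 2.5 * x + 4, plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.5 * x + 4.0 + rng.normal(0, 2.0, size=200)

# Model: y_hat = w * x + b, loss: mean squared error.
w, b = 0.0, 0.0
lr = 0.01  # learning rate

for step in range(2000):
    y_hat = w * x + b
    error = y_hat - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    w -= lr * grad_w
    b -= lr * grad_b
    if step % 500 == 0:
        print(f"step {step:4d}  MSE={np.mean(error**2):.3f}  w={w:.3f}  b={b:.3f}")

print(f"final: w={w:.3f}, b={b:.3f}")
```

Printing the parameters every few hundred steps is one simple way to let students see the fit improving; plotting the fitted line at those same checkpoints works equally well.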
Logistic regression, classification problems
Introduction to classification, use of the sigmoid activation function and cross-entropy loss.
Discussion of how to fit the models using gradient descent.
Examples applying classification models (logistic regression only) to model problems (see the sketch after this list).
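A minimal from-scratch sketch of logistic regression fit by gradient descent on a toy one-feature problem; the data, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

# Toy 1D classification problem: class 1 tends to have larger feature values.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(2.0, 1.0, 100), rng.normal(5.0, 1.0, 100)])
y = np.concatenate([np.zeros(100), np.ones(100)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Model: p = sigmoid(w * x + b), loss: binary cross-entropy.
w, b = 0.0, 0.0
lr = 0.1

for step in range(3000):
    p = sigmoid(w * x + b)
    # Gradient of the mean cross-entropy; (p - y) falls out of the chain rule.
    grad_w = np.mean((p - y) * x)
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

p = sigmoid(w * x + b)
loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
acc = np.mean((p > 0.5) == y)
print(f"w={w:.2f}, b={b:.2f}, cross-entropy={loss:.3f}, accuracy={acc:.2f}")
```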
Capstone
Similar to the last module, students will pretend they are data scientists at a large corporation preparing a presentation for management. Using all of the skills they have developed over the past few days, they will perform a basic analysis of the California Housing dataset and train a simple ML model (some form of linear or logistic regression) to predict the median house value target from the provided features (a minimal end-to-end sketch follows). Students should note that these models will likely not perform well, and should analyze why this is the case.
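A minimal end-to-end sketch of the capstone baseline, assuming scikit-learn's built-in California Housing loader (downloaded on first use); the split and metric choices are illustrative, not prescribed:

```python
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Load the California Housing data; the target is the median house value per district.
data = fetch_california_housing(as_frame=True)
X, y = data.data, data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

print("test MSE:", mean_squared_error(y_test, y_pred))
print("test R^2:", r2_score(y_test, y_pred))
# A plain linear model leaves a lot of structure unexplained here; analyzing why
# (skewed features, nonlinearity, interactions) is part of the exercise.
```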