UofT-DSI / applying_statistical_concepts

MIT License

Applying Statistical Concepts: Linear regression, classification, and resampling

Content

Description

This module introduces the skills required to design, implement, and test basic statistical learning methods, including regression, classification, and clustering, and to validate models with resampling techniques. It contrasts modeling for prediction with modeling for inference, and explores both the trade-off between prediction accuracy and model interpretability and the bias-variance trade-off. Participants also gain exposure to key tools such as Pandas, NumPy, and scikit-learn.
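As a rough illustration of the prediction versus inference distinction, and of validating a model by resampling, the sketch below fits a simple linear regression and estimates its out-of-sample error with k-fold cross-validation. It is not taken from the module materials: the synthetic data, the helper names `fit_simple_linear` and `k_fold_mse`, and the choice of five folds are all assumptions made for illustration. It uses only the Python standard library so that it runs without any setup.

```python
import random
from statistics import mean


def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def k_fold_mse(xs, ys, k=5, seed=0):
    """Estimate out-of-sample mean squared error with k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k roughly equal folds.
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the training folds only, then score on the held-out fold.
        b0, b1 = fit_simple_linear([xs[i] for i in train], [ys[i] for i in train])
        scores.append(mean((ys[i] - (b0 + b1 * xs[i])) ** 2 for i in held_out))
    return mean(scores)


# Synthetic data (an assumption for illustration): y = 2 + 3x + Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

b0, b1 = fit_simple_linear(xs, ys)  # inference: inspect the fitted coefficients
cv_mse = k_fold_mse(xs, ys, k=5)    # prediction: estimate error on unseen data
print(f"intercept={b0:.2f}, slope={b1:.2f}, 5-fold CV MSE={cv_mse:.2f}")
```

With scikit-learn, roughly the same workflow is available through `sklearn.linear_model.LinearRegression` and `sklearn.model_selection.cross_val_score`.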

Learning Outcomes

By the end of the module, participants will be able to:

Assignments

Participants should review the Assignment Submission Guide for instructions on how to complete assignments in this module.

Assignment 1

Assignment 2

Assignment 3

Assignment Due-dates

Assessment | Content | Due Date
Assignment 1 | Classification (Sessions 1, 2) | Sep 29
Assignment 2 | Regression (Sessions 3, 4) | Oct 6
Assignment 3 | Clustering & Resampling (Sessions 5, 6) | Oct 13

Contacts

Questions can be submitted to the #cohort-4-help channel on Slack.

Delivery of the Learning Module

This module will include live learning sessions and optional, asynchronous work periods. During live learning sessions, the Technical Facilitator will introduce and explain key concepts and demonstrate core skills. Learning is facilitated during this time. Before and after each live learning session, the instructional team will be available for questions related to the core concepts of the module.

Optional work periods are for seeking help from peers and the Learning Support team, and for working through the homework and assignments in the learning module with access to live help. Content is not facilitated during these periods; the time should be driven by participants. We encourage participants to come to these work periods with questions and problems to work through.

Participants are encouraged to engage actively during the learning module. The key to developing the core skills in each learning module is practice. The more participants code along with the instructional team and apply the skills in each module, the more likely it is that these skills will solidify.

The Technical Facilitator will introduce the concepts through a collaborative live coding session using the Python notebooks found under /01_materials/notebooks/. Slides can be found under /01_materials/slides/.

Schedule

Requirements

Resources

Feel free to use the following as resources:

Documents

Videos

Simple Linear Regression

Multiple linear regression, interactions, qualitative predictors

Classification (logistic regression, generative models)