Tyler Ransom | |
---|---|
Email | ransom@ou.edu |
Office | 322 CCD1 |
Office Hours | M 9:30-10:30am, Th 12-1pm |
GitHub | tyleransom |
Data science is a rapidly developing field that combines the recent Big Data revolution with ever-developing statistical algorithms to inform business and policy decisions. Nearly every company you've heard of uses data science to optimize its services: Netflix uses it to recommend new programs to its viewers, and Amazon uses it to determine how much to charge for its Prime services. This class will give students an overview of the data science workflow, from collecting raw data to drawing insights that a decision maker can act on. Along the way, we will broadly cover advances in data collection, data storage, visualization, machine learning, and econometrics, while teaching and reinforcing good programming practices. The primary goal of this course is to provide you, the student, with a set of skills that will allow you to compete for a data science job.
By the end of the course, students should be able to do the following:
In this course students, through lecture and application, will learn about:
Grades will be based on the categories listed below with the corresponding weights.
Component | Percent |
---|---|
Class Participation | 10% |
Problem Sets | 35% |
Exam & Quizzes | 20% |
Final project | 35% |
Total points | 100% |
Final grades will be assigned according to the standard cutoffs (90%+ for an A, 80%-89.99% for a B, etc.).
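As a rough illustration of how the weights and cutoffs above combine, here is a minimal Python sketch. The weights come from the table; the cutoffs for C and D (70% and 60%) are an assumption standing in for the "etc." above, and the function names and example scores are hypothetical.

```python
# Category weights taken from the grading table; they sum to 1.0.
WEIGHTS = {
    "participation": 0.10,
    "problem_sets": 0.35,
    "exam_quizzes": 0.20,
    "final_project": 0.35,
}

# Only the A and B cutoffs are stated in the syllabus.
# The C and D cutoffs below are ASSUMED to follow the same 10-point pattern.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def course_percent(scores):
    """Weighted average of category scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percent):
    """Map a course percentage to a letter using the cutoffs above."""
    for cutoff, letter in CUTOFFS:
        if percent >= cutoff:
            return letter
    return "F"


# Hypothetical student scores, one per category.
example = {
    "participation": 100,
    "problem_sets": 90,
    "exam_quizzes": 80,
    "final_project": 85,
}
total = course_percent(example)
print(f"{total:.2f}% -> {letter_grade(total)}")
```

Because problem sets and the final project each carry 35% of the grade, a 10-point change in either moves the course percentage by 3.5 points.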
Participation:
Problem sets: Assigned approximately weekly throughout the semester.
Exam & Quizzes:
Final Project:
(Will be continuously updated throughout the semester)
Date | Day | Topic | Due |
---|---|---|---|
Jan 14 | T | What is data science / big data / why is it important? (Slides) | |
Jan 16 | Th | Git, GitHub, computing environment, and coding best practices (Notes; Slides by Grant McDermott) | Read Gentzkow & Shapiro's handbook; Ch. 1 of The Master Algorithm; register for GitHub account |
Jan 21 | T | Linux command line (Grant McDermott's slides), SSH, accessing OSCER (Notes); Git Tutorial (p. 19 here; adding upstream repositories here) | PS 1 |
Jan 23 | Th | Overview of Data Scientists' tools (Notes) | |
Jan 28 | T | Using data: data types, storage (Notes) | PS 2 |
Jan 30 | Th | Big Data: SQL (Notes) & RDDs (link); running jobs on the OSCER cluster | |
Feb 4 | T | Sampling & storing Big Data (Notes) | PS 3 |
Feb 6 | Th | Web scraping/APIs to gather data (Notes; Grant McDermott's Lecture Notes; Ethics in Web Scraping; rvest demonstration slides at 2018 useR conference; tidyverse cheat sheet; Grant McDermott's Lecture Notes on R language basics) | |
Feb 11 | T | Web scraping/APIs to gather data (Notes); Grant McDermott's Lecture Notes | PS 4 |
Feb 13 | Th | Intro to Julia (Julia notes; Ivan Rudik's programming notes; Julia's "Learning Julia" page) | |
Feb 18 | T | ggplot2 (Basics; Kieran Healy's book) | |
Feb 20 | Th | Getting to know your data: descriptive statistics, cleaning, tips, tricks, transformations, visualization (Notes; HTML slides) | PS 5 |
Feb 25 | T | Modeling continuous and discrete variables (Notes; HTML slides; Simple R script) | |
Feb 27 | Th | Using JuMP to optimize cool stuff [Jupyter Notebook; Julia Code] (in previous years: Linear Algebra Introduction / Review (Handout)) | |
Mar 3 | T | Introduction to optimization (Notes) | PS 6 |
Mar 5 | Th | Writing and optimizing functions in R, Python, and Julia (Notes) | |
Mar 10 | T | Writing and optimizing functions in R, Python, and Julia (Notes) | PS 7 |
Mar 12 | Th | Debugging strategies and simulations (Notes) | |
Mar 17 | T | No class (Spring break) | |
Mar 19 | Th | No class (Spring break) | |
Mar 24 | T | Intro to Machine Learning (Notes) | PS 8 |
Mar 26 | Th | Supervised ML: Regularization, measuring model fit, tuning with cross-validation, the elastic net model (Notes) | |
Mar 31 | T | Supervised ML: The 5 Tribes of Machine Learning (Notes) | PS 9 |
Apr 2 | Th | Unsupervised ML: Clustering (Notes) | |
Apr 7 | T | Unsupervised ML: Dimensionality reduction and reinforcement learning (Notes) | PS 10 |
Apr 9 | Th | Machine learning vs. econometrics (Notes) | |
Apr 14 | T | Structural modeling: static discrete choice (Slides) | PS 11 |
Apr 16 | Th | Structural modeling: dynamic discrete choice (Slides) | |
Apr 21 | T | Structural modeling: dynamic discrete choice (Slides) | |
Apr 23 | Th | Final Project presentations (Rubric) | |
Apr 28 | T | Final Project presentations (Rubric) | |
Apr 30 | Th | Final Project presentations (Rubric) | PS 12 (optional) |
May 4 | Th | Final Exam (in class, 1:30-3:30pm) | Final project due (Scoresheet) |
It is the policy of the University to excuse the absences of students that result from religious observances and to reschedule examinations and additional required classwork that may fall on religious holidays, without penalty.
If a student requires an accommodation based on disability, the student should meet with me in my office during the first week of the semester. Student responsibility primarily rests with informing faculty at the beginning of the semester and in providing authorized documentation through designated administrative channels. The Disability Resource Center is located in the University Community Center at 730 College Avenue (405-325-3852).
I do not tolerate academic misconduct, and neither does the University of Oklahoma. I will not hesitate to fail students who do not fully comply with the University's academic misconduct policy. If you find yourself contemplating cheating, plagiarism, or other forms of academic misconduct, please come see me first. Help is available if you are struggling. I want everyone in the class to try their best and to do their own work. Please be advised that I reserve the right to utilize anti-plagiarism resources such as TurnItIn when grading assignments.
For any concerns regarding gender-based discrimination, sexual harassment, sexual assault, dating/domestic violence, or stalking, the University offers a variety of resources. To learn more or to report an incident, please contact the Sexual Misconduct Office at (405) 325-2215 (8 to 5, M-F) or smo@ou.edu. Incidents can also be reported confidentially to OU Advocates at (405) 615-0013 (phones are answered 24 hours a day, 7 days a week). Also, please be advised that a professor/GA/TA is required to report instances of sexual harassment, sexual assault, or discrimination to the Sexual Misconduct Office. Inquiries regarding non-discrimination policies may be directed to: Bobby J. Mason, University Equal Opportunity Officer and Title IX Coordinator at (405) 325-3546 or bjm@ou.edu. For more information, visit http://www.ou.edu/eoo.html.
Should you need modifications or adjustments to your course requirements because of documented pregnancy-related or childbirth-related issues, please contact your professor or the Disability Resource Center at (405) 325-3852 as soon as possible. Also, see http://www.ou.edu/eoo/faqs/pregnancy-faqs.html for answers to commonly asked questions.