OpenSourceEcon / BootCamp2019

Repository of syllabi, lecture notes, Jupyter notebooks, code, and problem sets for OSE Lab Boot Camp 2019

Machine Learning material #14

Open jeannesorin opened 5 years ago

jeannesorin commented 5 years ago

What material do you have on machine learning?

Xincheng-Qiu commented 5 years ago

The 2018 AEA Continuing Education webcasts on Machine Learning and Econometrics by Susan Athey and Guido Imbens might be a good place to start: https://www.aeaweb.org/conference/cont-ed/2018-webcasts

rickecon commented 5 years ago

Great reference above to the Susan Athey material. I have taught an introduction to machine learning for economists every year in the M.A. Program in Computational Economics at the University of Chicago. My most recent Jupyter notebooks are in the persp-model-econ_W19/Notebooks/ directory of my modeling course repository. You might just want to fork and clone the whole repo; there are some nice problem sets in it. I have listed the machine learning notebooks below with short descriptions. I will be updating these notebooks and covering them in the last week of the boot camp. In particular, I will reference some of the content covered by John Rust, Sanjog Misra, and Whitney Newey in the DSE Summer School today and yesterday.

  1. Classification notebook. This notebook goes through logit, multinomial logit, and K-nearest neighbors (KNN).
  2. Resampling methods. This notebook shows how to implement the cross-validation paradigm (estimating your model multiple times on training sets and measuring accuracy on held-out test sets) via k-fold splitting and the bootstrap. This workflow is one of the most important contributions of machine learning to economics, almost as important as the new models themselves.
  3. Tree-based methods. This notebook goes through decision trees and random forest methods. These turn out to be powerful machine learning methods that sometimes beat neural nets in accuracy. This notebook also introduces tuning hyperparameters to maximize test-set accuracy, a powerful concept that underlies much of what frameworks like TensorFlow are good at.
  4. Support vector machines. SVM is an important classifier that separates classes with a maximum-margin boundary.
  5. Neural nets. This is an introduction to neural networks. The notebook takes you through the multi-layer perceptron (MLP) model, which includes hidden layers (deep nets).
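For item 1, the logit-vs-KNN comparison can be sketched in a few lines. This is a minimal illustration assuming scikit-learn and its built-in iris data; the actual notebooks may use different tools and datasets.

```python
# Hypothetical sketch (scikit-learn assumed, not confirmed by the notebooks):
# fit a logistic regression and a K-nearest-neighbors classifier on the
# built-in iris data and compare their held-out test-set accuracy.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Multinomial logit (scikit-learn handles the multiclass case automatically)
logit = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# KNN with 5 neighbors (the number of neighbors is a tuning choice)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

logit_acc = logit.score(X_test, y_test)
knn_acc = knn.score(X_test, y_test)
print(f"logit accuracy: {logit_acc:.3f}, KNN accuracy: {knn_acc:.3f}")
```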
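The k-fold/bootstrap workflow in item 2 can be sketched as follows, again assuming scikit-learn (a choice of mine, not confirmed by the source).

```python
# Hypothetical sketch of k-fold cross-validation and one bootstrap
# replication (scikit-learn assumed).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.utils import resample

X, y = load_iris(return_X_y=True)

# k-fold CV: estimate the model 5 times, each time scoring on the held-out fold
kf = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kf)
print(f"5-fold accuracies: {np.round(scores, 3)}; mean = {scores.mean():.3f}")

# Bootstrap: one resample of the data with replacement (repeat B times in
# practice to approximate the sampling distribution of an estimate)
X_boot, y_boot = resample(X, y, random_state=0)
```

The mean of the fold accuracies is the usual summary; its spread across folds gives a rough sense of how sensitive the model's accuracy is to the particular train/test split.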
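Item 3's hyperparameter-tuning idea can be sketched with a random forest and a small grid search. The grid values here are illustrative assumptions, and scikit-learn is assumed rather than confirmed.

```python
# Hypothetical sketch: tune random-forest hyperparameters by grid search,
# scoring each combination with 5-fold cross-validation (scikit-learn assumed).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={
        "n_estimators": [50, 100],   # number of trees in the forest
        "max_depth": [2, 4, None],   # None lets trees grow fully
    },
    cv=5,
)
grid.fit(X, y)
print("best params:", grid.best_params_)
print("best CV accuracy:", round(grid.best_score_, 3))
```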
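A minimal SVM classifier for item 4 might look like this (scikit-learn assumed; the RBF kernel and C value are illustrative defaults, and feature scaling matters for SVMs, hence the pipeline).

```python
# Hypothetical sketch: an RBF-kernel SVM with standardized features
# (scikit-learn assumed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Scale features first: SVM margins are sensitive to feature scales
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
svm.fit(X_train, y_train)
svm_acc = svm.score(X_test, y_test)
print(f"SVM test accuracy: {svm_acc:.3f}")
```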
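And for item 5, a small MLP with two hidden layers can be sketched like this; the layer sizes and iteration cap are my assumptions, and scikit-learn's `MLPClassifier` stands in for whatever framework the notebook actually uses.

```python
# Hypothetical sketch: a multi-layer perceptron with two hidden layers
# (scikit-learn's MLPClassifier assumed as a stand-in).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Two hidden layers of 16 units each; scaling inputs helps convergence
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0),
)
mlp.fit(X_train, y_train)
mlp_acc = mlp.score(X_test, y_test)
print(f"MLP test accuracy: {mlp_acc:.3f}")
```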