
rstudio::conf(2020) deep learning workshop
Creative Commons Attribution Share Alike 4.0 International

Deep Learning with Keras and TensorFlow in R

rstudio::conf 2020

by Bradley Boehmke


:spiral_calendar: January 27 and 28, 2020
:alarm_clock: 09:00 - 17:00
:hotel: Ballroom Level, Imperial A
:writing_hand: rstd.io/conf


Overview

This two-day workshop introduces the essential concepts of building deep learning models with TensorFlow and Keras via R. Throughout this workshop you will gain an intuitive understanding of the architectures and engines that make up deep learning models, apply a variety of deep learning algorithms (e.g. MLPs, CNNs, RNNs, LSTMs, collaborative filtering), understand when and how to tune the various hyperparameters, and be able to interpret model results. You will have the opportunity to work through practical applications covering a variety of tasks such as computer vision, natural language processing, product recommendation, and more. Leaving this workshop, you should have a firm grasp of deep learning and be able to implement a systematic approach for producing high-quality modeling results.

Is this course for me?

If you answer "yes" to these three questions, then this workshop is likely a good fit:

  1. Are you relatively new to the field of deep learning and neural networks but eager to learn? Or maybe you have applied a basic feedforward neural network but aren’t familiar with the other deep learning architectures?

  2. Are you an experienced R user comfortable with the tidyverse, creating functions, and applying control (e.g. if, ifelse) and iteration (e.g. for, while) statements?

  3. Are you familiar with the machine learning process, such as data splitting, feature engineering, resampling procedures (e.g. k-fold cross validation), hyperparameter tuning, and model validation?

This workshop will provide some review of these topics, but coming in with some exposure will help you stay focused on the deep learning details rather than the general modeling procedure.

Prework

I make a few assumptions about your existing programming skills and machine learning familiarity (items #2-3 in the previous section). Below are my assumptions and some resources to read through to make sure you are properly prepared.

| Assumptions | Resource |
| --- | --- |
| You should be familiar with the Tidyverse, control flow, and writing functions | R for Data Science |
| You should be familiar with the basic concept of machine learning | Ch. 1 HOMLR |
| You should be familiar with the machine learning modeling process | Ch. 2 HOMLR |
| You should be familiar with the feature engineering process | Ch. 3 HOMLR |

You will require several packages and datasets throughout this workshop. If you are attending the workshop, these will be preinstalled for you, so you do not need to worry about your OS differing from mine. After you leave the workshop, the first notebook below will allow you to reproduce the work you did in the workshop. At the conference workshop, we will all use the RStudio Cloud platform; the second notebook below will get you set up so that we can hit the ground running on day 1!

| Description | Resource |
| --- | --- |
| Pre-installing necessary packages and datasets (already pre-installed for workshop!) | Instructions, Source Code |
| Setting up RStudio Cloud environment | Instructions |

Schedule

This workshop is notebook-focused. Consequently, most of our time will be spent in R notebooks; however, I will also jump to slides to explain certain concepts in further detail. Throughout the notebooks, you will see ℹ️ icons that will hyperlink to relevant slides (or additional resources).

Day 1

| Time | Activity | Notebook | Source Code | Other |
| --- | --- | --- | --- | --- |
| 09:00 - 09:30 | Introduction | | | Slides |
| 09:30 - 10:30 | Deep learning ingredients | Notebook | .Rmd | Slides |
| 10:30 - 11:00 | Coffee break | | | |
| 11:00 - 12:30 | Deep learning recipe | | | |
| | Training your model | Notebook | .Rmd | |
| | Mini-project: Predicting Ames, IA home sales prices | Notebook | .Rmd | Solution |
| 12:30 - 13:30 | Lunch break | | | |
| 13:30 - 15:00 | Computer vision & CNNs | | | |
| | MNIST revisited | Notebook | .Rmd | Slides |
| | Cats vs dogs | Notebook | .Rmd | Slides |
| | Transfer learning | Notebook | .Rmd | Slides |
| 15:00 - 15:30 | Coffee break | | | |
| 15:30 - 17:00 | Project: Classifying natural images | Notebook | .Rmd | Solution |

Day 2

| Time | Activity | Notebook | Source Code | Other |
| --- | --- | --- | --- | --- |
| 09:00 - 10:30 | Word embeddings | | | |
| | The original IMDB | Notebook | .Rmd | Slides |
| | Pre-trained embeddings | Notebook | .Rmd | Slides |
| | Mini-project: Amazon reviews | Notebook | .Rmd | Solution |
| 10:30 - 11:00 | Coffee break | | | |
| 11:00 - 12:30 | Collaborative filtering | Notebook | .Rmd | Excel file |
| 12:30 - 13:30 | Lunch break | | | |
| 13:30 - 15:00 | RNNs & LSTMs | | | |
| | IMDB revisited | Notebook | .Rmd | Slides |
| | Mini-project: Non-IMDB reviews | Notebook | .Rmd | Solution |
| 15:00 - 15:30 | Coffee break | | | |
| 15:30 - 17:00 | Wrap up | | | |
| | Project: Detecting Duplicate Quora Questions | Notebook | .Rmd | Solution |
| | Final words of wisdom | | | Slides |

Extras

| Activity | Notebook | Source Code |
| --- | --- | --- |
| Improving generalization with k-fold cross validation | Notebook | .Rmd |
| Performing a grid search | Notebook | .Rmd |
| Linear regression with stochastic gradient descent | Notebook | .Rmd |
| Diagnosing model performance with learning curves | Notebook | .Rmd |
| Save your models for later with serialization | Notebook | .Rmd |
| Visualizing what CNNs learn | Notebook | .Rmd |

Instructor

Brad Boehmke is a Director of Data Science at 84.51°, where he wears both software developer and machine learning engineer hats. His team focuses on developing algorithmic processes, solutions, and tools that enable 84.51° and its data scientists to efficiently extract insights from data and provide solution alternatives to decision-makers. He is a visiting professor at the University of Cincinnati, author of the Hands-On Machine Learning with R and Data Wrangling with R books, creator of multiple public and private enterprise R packages, and developer of various data science educational content. You can learn more about his work, and connect with him, at bradleyboehmke.github.io.

TAs

Rick Scavetta

Omayma Said

Doug Ashton

Daniel Rodriguez


This work is licensed under a Creative Commons Attribution 4.0 International License.