Data Analysis at Scale in the Cloud
Course taught at Duke MIDS, Spring 2020-2022 by Noah Gift.
Guest Lecture 2022-Async
GPT 3:
Prequel Material
These resources could be helpful before starting this course.
Duke/Coursera: Foundations of Data Engineering Course (Launching early 2022)
Course 1: Python and Pandas for Data Engineering
Course 2: Linux and Bash for Data Engineering
GitHub Repos for Projects in Course
Week 1: Using Linux
Week 2: Using Bash
Week 3: Building Bash Scripts
Week 4: Composing File and Data Management Solutions with Linux
Course 3: Python and SQL for Data Engineering
Course 4: Building Data Engineering Solutions with Python for Web Applications, Command-Line Tools and Notebooks
Sequel Material
These resources could be helpful after completing this course.
Duke/Coursera: Applied Data Engineering Course (Launching late 2022)
GitHub Repos Referenced in the Duke Coursera Course
Course 1: Cloud Computing Foundations
Course 2: Cloud Computing Building Blocks
Lecture Topics:
Getting Started: [Week 1]
Cloud Computing Foundations: [Week 2]
Virtualization and Containers: [Week 3 & Week 4]
Challenges and Opportunities in Distributed Computing: [Week 5 & Week 6]
Cloud Storage: [Week 7 & Week 8]
Serverless: [Week 9 & Week 10]
MLOps, Big Data and Edge Computer Vision: [Week 11 & Week 12 & Week 13]
General
Student Example Projects
A practical guide to Data Science, Machine Learning Engineering and Data Engineering
Read the Cloud Computing for Data book
Free book Developing-on-AWS-with-CSharp
Next Steps: Take Coursera MLOps Course
Text and Code License
The text and code content of notebooks and documents is released under the CC-BY-NC-ND license.