ossu / data-science

📊 Path to a free self-taught education in Data Science!

Open Source Society University

Open Source Society University - Data Science

About

This is a path for those of you who want to complete the Data Science undergraduate curriculum on your own time, for free, with courses from the best universities in the world.

In our curriculum, we give preference to MOOC (Massive Open Online Course) style courses because these courses were created with our style of learning in mind.

Curricular Guideline

OSSU Data Science uses the report Curriculum Guidelines for Undergraduate Programs in Data Science as its guide for course recommendations.

How to use this guide

Duration

It is possible to finish within about 2 years if you plan carefully and devote roughly 20 hours/week to your studies. Learners can use this spreadsheet to estimate their end date. Make a copy and input your start date and expected hours per week in the Timeline sheet. As you work through courses you can enter your actual course completion dates in the Curriculum Data sheet and get updated completion estimates.

Warning: While the spreadsheet is a useful tool for estimating the time you need to complete this curriculum, it may not be up to date with the curriculum. Use the spreadsheet only to estimate your time, and use the GitHub repo to see which courses to take.
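
As a rough illustration of the arithmetic behind such an estimate, the sketch below computes an end date from a start date and a weekly study commitment. It is not the spreadsheet's actual formula: the function name `estimate_end_date` and the 2,080-hour total (about 2 years at 20 hours/week, i.e. 104 weeks) are assumptions made for this example only.

```python
import math
from datetime import date, timedelta


def estimate_end_date(start: date, total_hours: float, hours_per_week: float) -> date:
    """Return the estimated completion date for a given study pace.

    total_hours is a hypothetical figure for the whole curriculum;
    replace it with the sum of the course estimates you actually plan to take.
    """
    if hours_per_week <= 0:
        raise ValueError("hours_per_week must be positive")
    # Round up so a partially used final week still counts as a full week.
    weeks_needed = math.ceil(total_hours / hours_per_week)
    return start + timedelta(weeks=weeks_needed)


# Assumed total: 104 weeks * 20 hours/week = 2080 hours.
ASSUMED_TOTAL_HOURS = 2080

if __name__ == "__main__":
    start = date(2025, 1, 6)
    for pace in (10, 20, 30):
        print(pace, "h/week ->", estimate_end_date(start, ASSUMED_TOTAL_HOURS, pace))
```

Halving the weekly hours doubles the number of weeks, so the end date is very sensitive to the pace you enter.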

Order of the classes

Some courses can be taken in parallel, while others must be taken sequentially. All of the courses within a topic should be taken in the order listed in the curriculum. The graph below demonstrates how topics should be ordered.

Topic Progression Graph

Track your progress

Fork the GitHub repo into your own GitHub account and put a ✅ next to each course as you complete it. This can serve as your kanban board and will be faster to set up than any other solution, leaving you more time to spend on the courses.

Which programming languages should I use?

Python and R are heavily used in the data science community, and our courses teach you both. Remember, the important thing for each course is to internalize the core concepts and to be able to apply them with whatever tool (programming language) you wish.

Content Policy

You must share only files that you are allowed to share. Do NOT violate the code of conduct that you sign at the beginning of your courses.

Community

We have a Discord server! This should be your first stop to talk with other OSSU students. Why don't you introduce yourself right now?

You can also interact through GitHub issues.

Add Open Source Society University to your LinkedIn profile!

Warning: You may find third-party, deprecated, or outdated materials when searching for OSSU. We recommend that you ignore them and use only the OSSU Data Science GitHub repo. Some known outdated materials are:

  • An unmaintained and deprecated Trello board
  • Third-party Notion templates

Prerequisites

The Data Science curriculum assumes the student has taken high school math and statistics.

Curriculum

Introduction to Data Science

What is Data Science

Introduction to Computer Science

Students who already know basic programming in any language can skip this first course.

Introduction to programming

Introduction to Computer Science and Programming Using Python

Introduction to Computational Thinking and Data Science

Data Structures and Algorithms

The Algorithms courses are taught in Java. Students who need to learn Java should take this course first.

Java Programming

Algorithms I: ArrayLists, LinkedLists, Stacks and Queues

Algorithms II: Binary Trees, Heaps, SkipLists and HashMaps

Algorithms III: AVL and 2-4 Trees, Divide and Conquer Algorithms

Algorithms IV: Pattern Matching, Dijkstra’s, MST, and Dynamic Programming Algorithms

Databases

Database Management Essentials

Data Warehouse Concepts, Design, and Data Integration

Relational Database Support for Data Warehouses

Business Intelligence Concepts, Tools, and Applications

Design and Build a Data Warehouse for Business Intelligence Implementation

MongoDB for Developers Learning Path

Single Variable Calculus

Calculus 1A: Differentiation

Calculus 1B: Integration

Calculus 1C: Coordinate Systems & Infinite Series

Linear Algebra

Essence of Linear Algebra

Linear Algebra

Multivariable Calculus

Multivariable Calculus

Statistics & Probability

Introduction to Probability

Intro to Descriptive Statistics

Intro to Inferential Statistics

Statistical Learning with Python by Stanford University on edX (Textbook, Textbook resources) or Statistical Learning with R by Stanford University on edX (Textbook, Textbook resources)

Data Science Tools & Methods

Tools for Data Science

Data Science Methodology

Data Science: Wrangling

Machine Learning/Data Mining

Supervised Machine Learning: Regression and Classification

Advanced Learning Algorithms

Unsupervised Learning, Recommenders, Reinforcement Learning

Intro to Machine Learning

Mining Massive Datasets

Process Mining

Final project

Part of learning is doing. The assignments and exams for each course are to prepare you to use your knowledge to solve real-world problems.

After you've completed the curriculum, you should identify a problem that you can solve using the knowledge you've acquired. You can create something entirely new, or you can improve some tool/program that you use and wish were better.

Students who would like more guidance in creating a project may choose to take a series of project-oriented courses. A sample of options is available on this page. Many more are available; at this point you should be capable of identifying a series that is interesting and relevant to you.

Congratulations

After completing the requirements of the curriculum above, you will have completed the equivalent of a full bachelor's degree in Data Science. Congratulations!

What is next for you? The possibilities are boundless and overlapping:

keep learning

How to contribute

You can open an issue and give us your suggestions as to how we can improve this guide, or what we can do to improve the learning experience.

You can also fork this project and send a pull request to fix any mistakes that you have found.

If you want to suggest a new resource, send a pull request adding such resource to the extras section. The extras section is a place where all of us will be able to submit interesting additional articles, books, courses and specializations.

Code of Conduct

OSSU's code of conduct.

Team