The Hitchhiker's Guide to Data Science for Social Good

Welcome to the Hitchhiker's Guide to Data Science for Social Good.

What is the Data Science for Social Good Fellowship?

The Data Science for Social Good Fellowship (DSSG) is a hands-on, project-based summer program that launched in 2013 at the University of Chicago. It has since expanded to multiple locations globally and is currently coordinated by the Data Science for Social Good Foundation and Carnegie Mellon University. It brings a group of fellows, typically graduate students (or senior undergraduate students in some cases), from across the world to work on machine learning, artificial intelligence, and data science projects that have a social impact, in partnership with social good organizations. From a pool of typically around 1000 applicants, 20-40 fellows are selected from diverse computational and quantitative disciplines including computer science, statistics, math, engineering, psychology, sociology, economics, and public policy.

The fellows work in small, cross-disciplinary teams on social good projects spanning education, health, energy, transportation, criminal justice, social services, economic development, and international development, in collaboration with government agencies and non-profits around the world. This work is done under close, hands-on mentorship from full-time, dedicated senior data science mentors and dedicated project managers with industry and/or government experience. The result is highly trained fellows, improved data science capacity at the social good organization, and a high-quality data science project that is ready for field trial and implementation at the end of the program.

In addition to hands-on project-based training, the summer program also consists of workshops, tutorials, and ethics discussion groups based on our data science for social good curriculum designed to train the fellows in doing practical data science and artificial intelligence for social impact.

Who is this guide for?

The primary audience for this guide is the set of fellows coming to DSSG, but we want everything we create to be open and accessible to the larger world. We hope it is useful to people beyond the summer fellows.

If you are applying to the program or have been accepted as a fellow, check out the manual to see how you can prepare before arriving, what orientation and training will cover, and what to expect from the summer.

If you are interested in learning at home, check out the tutorials and teach-outs developed by our staff and fellows throughout the summer, and feel free to suggest or contribute additional resources.

Another one of our goals is to encourage collaboration. We invite anyone interested in doing this type of work, or in starting a DSSG program, to build on what we've learned by using and contributing to these resources.

What is in this guide?

Our number one priority at DSSG is to train fellows to do responsible data science/ML/AI for social good work. This curriculum includes many things you'd find in a data science course or bootcamp, but with an emphasis on solving problems with social impact, integrating data science with the social sciences, and understanding and discussing the ethical implications of the work, including privacy and confidentiality issues.

We have spent many (sort of) early mornings waxing existential over Dunkin' Donuts while trying to define what makes a "data scientist for social good," that enigmatic breed combining one part data scientist, one part helper, one part educator, and one part bleeding heart idealist. We've come to a rough working definition in the form of the skills and knowledge one would need, which we categorize as follows:

All material is licensed under the CC BY 4.0 License.

Table of Contents

The links below will help you find things quickly.

DSSG Manual

Summer Overview

This section covers general information on projects, working with partners, presentations, orientation information, and the following schedules:

Conduct, Culture, and Communications

This section details the DSSG anti-harassment policy, goals of the fellowship, what we hope fellows get out of the experience, the expectations of the fellows, and the DSSG environment. A slideshow version of this can also be found here.

Curriculum

This section details the various topics we will be covering throughout the summer. This includes:

Wiki

In the wiki, you will find information and instructions that people have found helpful along the way. It covers topics like:

Contributing

This guide is compiled with MkDocs and served with GitHub Pages. When updating it, you can serve it locally to test your changes by running the following from the top level of this repo:

mkdocs serve -f "$(pwd)/mkdocs.yml"

Once you're ready to publish them, you can do so with:

mkdocs gh-deploy -f "$(pwd)/mkdocs.yml"

(Note that a bug in the version of mkdocs we currently use requires specifying the full path to the configuration file, hence the $(pwd) in the command. We should be able to remove this in the future if we update the dependency.)
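
Because both commands rely on $(pwd) pointing at the top level of the repo, running them from a subdirectory passes mkdocs a path to a file that does not exist. As a minimal sketch, a small shell helper can check for the config file first. The function name mkdocs_config is hypothetical and not part of this repo, and the usage lines assume mkdocs is installed and on your PATH.

```shell
# Hypothetical helper (not part of this repo): print the absolute path to
# mkdocs.yml in the current directory, or fail with a message if it is missing.
mkdocs_config() {
    cfg="$(pwd)/mkdocs.yml"
    if [ ! -f "$cfg" ]; then
        echo "mkdocs.yml not found in $(pwd); run this from the top level of the repo" >&2
        return 1
    fi
    printf '%s\n' "$cfg"
}

# Usage, assuming mkdocs is installed:
#   cfg="$(mkdocs_config)" && mkdocs serve -f "$cfg"
#   cfg="$(mkdocs_config)" && mkdocs gh-deploy -f "$cfg"
```

The helper only guards the path. It runs the same serve and gh-deploy commands shown above and does not work around the underlying mkdocs bug.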