Will want to start building an end-to-end ML pipeline

We will want to start building an end-to-end ML pipeline for predicting user quality, including:

a feature extraction method
using scikit-learn for automatic feature selection
using scikit-learn for parameter tuning (e.g., GridSearchCV)
using scikit-learn for the actual prediction. We can do two types of predictions: a regression that predicts accuracy given the input features, and a classification of "good" vs. "bad" (for this, we could use lots of different models, but let's start with an SVM).
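
As a rough sketch of how the pieces above could fit together for the "good" vs. "bad" classification, the snippet below chains scaling, feature selection, and an SVM in a scikit-learn `Pipeline` and tunes it with `GridSearchCV`. The data is synthetic (`make_classification`) and stands in for whatever our feature extraction ends up producing. The function name `build_quality_classifier_search`, the choice of `SelectKBest`, and the parameter grid are all assumptions for illustration, not settled design decisions.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_quality_classifier_search(cv=5):
    """Hypothetical helper: grid search over scaling + feature selection + SVM."""
    pipeline = Pipeline([
        ("scale", StandardScaler()),                    # SVMs are sensitive to feature scale
        ("select", SelectKBest(score_func=f_classif)),  # automatic feature selection
        ("svm", SVC()),                                 # "good" vs. "bad" classifier
    ])
    # Assumed search space; the real grid depends on how many features we extract.
    param_grid = {
        "select__k": [5, 10, "all"],
        "svm__C": [0.1, 1, 10],
        "svm__kernel": ["linear", "rbf"],
    }
    return GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")


# Placeholder data: 200 synthetic "users" with 20 features and a binary label.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

search = build_quality_classifier_search()
search.fit(X, y)
print(search.best_params_)
print(round(search.best_score_, 3))
```

For the regression variant (predicting accuracy directly), the same structure should work with `SVR` in place of `SVC`, `f_regression` in place of `f_classif`, and a regression metric such as `scoring="r2"`.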

Here's a simple example of an end-to-end ML pipeline for 3D gesture classification using an accelerometer, which I built for my PhD course. It is a code skeleton, so it does not use the best input features, but it does show how to create a full end-to-end classifier with scikit-learn:

https://github.com/jonfroehlich/CSE599Sp2019/blob/master/Assignments/A4-SVMGestureRecognizer/SVMGestureRecognizer.ipynb
You need to have this folder ('GestureLogs') in the root dir of this notebook: GestureLogs.zip

To start, I think we can just do an 80/20 train-to-test split of the data. We will be getting more users and interaction logs as the project continues (and we get more validations).
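
A minimal sketch of the 80/20 split, again on synthetic placeholder data rather than our real interaction logs. The `stratify=y` argument and the fixed `random_state` are assumptions on my part: they keep the "good"/"bad" ratio similar in both sets and make the split reproducible.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Placeholder data: 100 synthetic "users" with 10 features and a binary label.
X, y = make_classification(n_samples=100, n_features=10, random_state=0)

# Hold out 20% of the rows for testing; stratify to preserve the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit on the training portion only, then score on the held-out portion.
model = SVC().fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, model.predict(X_test))

print(len(X_train), len(X_test))  # 80 20
```

If one user ends up contributing several rows (e.g., one row per session), it may be worth splitting by user instead, for example with `GroupShuffleSplit`, so that the same user never appears in both train and test.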
