Summary
I reformatted the notebook to make it easier to read and cleaned up the code a bit to make it presentation-ready.
Feedback
Looks great! Everything is very thorough, including the justification for the hyperparameter tuning, and the visualizations are convincing. There are a few changes I highly recommend to make this more convincing and easier to use.
The main thing I noticed is that the notebook narrows the features down to 3 and then derives 3 principal components from those same features. Usually we would use PCA to derive a few simple features (components) that explain most of the variability in a dataset with many features. It is a form of dimensionality reduction, so we would not expect to get out the same number of dimensions we put in. This is why, in your justification of how many components to use, you get the result that 3 principal components explain 100% of the variability: nothing is actually being "simplified". I think this is definitely something the grader would pick up on. My recommendations for this notebook:
[ ] Start from a larger set of features and narrow them down to 3 principal components
[ ] Translate comments into English
[ ] Add a README.md that explains the setup steps for your notebook (so that anyone browsing our GitHub could read it and recreate your results)
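To illustrate the explained-variance point, here is a minimal sketch with NumPy on synthetic data (not the notebook's actual features): with only 3 input features, 3 components always account for 100% of the variance, whereas starting from more features makes the reduction meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)

# Case 1: 3 features -> 3 components. PCA via SVD on centered data.
X = rng.normal(size=(100, 3))
Xc = X - X.mean(axis=0)
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)
# All 3 components together explain 100% of the variance - no reduction.
print(explained.sum())

# Case 2: 10 features -> keep the first 3 components.
X10 = rng.normal(size=(100, 10))
X10c = X10 - X10.mean(axis=0)
_, s10, _ = np.linalg.svd(X10c, full_matrices=False)
explained10 = s10**2 / np.sum(s10**2)
# The top 3 components now explain only part of the variance,
# which is where PCA actually earns its keep.
print(explained10[:3].sum())
```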
Feel free to make those changes here (in this branch), open an issue and assign me to some of them, or revise any of the changes I included in this PR (by pulling from this branch, making changes, and pushing to it) :)